有temp表包含A,B两列,请使用SQL对该B列进行处理,形成C列,按A列顺序,B列值不变,则C列累计计数,C列值变化,则C列重新开始计数,期望结果如下
样例数据
+-------+----+
| a | b |
+-------+----+
| 2010 | 1 |
| 2011 | 1 |
| 2012 | 1 |
| 2013 | 0 |
| 2014 | 0 |
| 2015 | 1 |
| 2016 | 1 |
| 2017 | 1 |
| 2018 | 0 |
| 2019 | 0 |
+-------+----+
期望结果
+-------+----+----+
| a | b | c |
+-------+----+----+
| 2010 | 1 | 1 |
| 2011 | 1 | 2 |
| 2012 | 1 | 3 |
| 2013 | 0 | 1 |
| 2014 | 0 | 2 |
| 2015 | 1 | 1 |
| 2016 | 1 | 2 |
| 2017 | 1 | 3 |
| 2018 | 0 | 1 |
| 2019 | 0 | 2 |
+-------+----+----+
维度 | 评分 |
---|---|
题目难度 | ⭐️⭐️⭐️⭐️ |
题目清晰度 | ⭐️⭐️⭐️⭐️⭐️ |
业务常见度 | ⭐️⭐️⭐️ |
使用lag判断是否连续,并对连续打标为0,不连续打标为1
执行SQL
select a,
b,
if(b = lag(b,1) over (order by a asc), 0, 1) as is_conn
from t19_temp
SQL结果
+-------+----+----------+
| a | b | is_conn |
+-------+----+----------+
| 2010 | 1 | 1 |
| 2011 | 1 | 0 |
| 2012 | 1 | 0 |
| 2013 | 0 | 1 |
| 2014 | 0 | 0 |
| 2015 | 1 | 1 |
| 2016 | 1 | 0 |
| 2017 | 1 | 0 |
| 2018 | 0 | 1 |
| 2019 | 0 | 0 |
+-------+----+----------+
使sum()over(order by )方式计算出连续的分组id:conn_group_id
执行SQL
select a,
b,
sum(is_conn) over (order by a asc) as conn_group_id
from (select a,
b,
if(b = lag(b, 1) over (order by a asc), 0, 1) as is_conn
from t19_temp
) t
SQL结果
+-------+----+----------------+
| a | b | conn_group_id |
+-------+----+----------------+
| 2010 | 1 | 1 |
| 2011 | 1 | 1 |
| 2012 | 1 | 1 |
| 2013 | 0 | 2 |
| 2014 | 0 | 2 |
| 2015 | 1 | 3 |
| 2016 | 1 | 3 |
| 2017 | 1 | 3 |
| 2018 | 0 | 4 |
| 2019 | 0 | 4 |
+-------+----+----------------+
执行SQL
select a,
b,
count(1) over (partition by conn_group_id order by a asc) as c
from (select a,
b,
sum(is_conn) over (order by a asc) as conn_group_id
from (select
a,
b,
if(b = lag(b, 1) over (order by a asc), 0, 1) as is_conn
from db_interview_cj.t19_temp
) t
) tt
SQL结果
+-------+----+----+
| a | b | c |
+-------+----+----+
| 2010 | 1 | 1 |
| 2011 | 1 | 2 |
| 2012 | 1 | 3 |
| 2013 | 0 | 1 |
| 2014 | 0 | 2 |
| 2015 | 1 | 1 |
| 2016 | 1 | 2 |
| 2017 | 1 | 3 |
| 2018 | 0 | 1 |
| 2019 | 0 | 2 |
+-------+----+----+
--建表语句
CREATE TABLE IF NOT EXISTS t19_temp
(
a string, -- 用户id
b bigint -- 登陆日期
)
COMMENT 'temp';
--插入数据
INSERT INTO t19_temp VALUES
('2010',1),
('2011',1),
('2012',1),
('2013',0),
('2014',0),
('2015',1),
('2016',1),
('2017',1),
('2018',0),
('2019',0);