前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >腾讯大数据面试SQL-合并连续支付订单

腾讯大数据面试SQL-合并连续支付订单

作者头像
数据仓库晨曦
修改2024-06-18 19:42:26
671
修改2024-06-18 19:42:26
举报
文章被收录于专栏:数据仓库技术数据仓库技术

一、题目

现有一张用户支付表:t_user_pay包含字段订单ID,用户ID,商户ID,支付时间,支付金额。如果同一用户在同一商户存在多笔订单,且中间该用户没有其他商户的支付记录,则认为是连续订单,请把连续订单进行合并,时间取最早支付时间,金额求和。

样例数据如下:

代码语言:javascript
复制
+----------+---------+-------------+---------------------+--------------+
| order_id | user_id | merchant_id |      pay_time        |  pay_amount  |
+----------+---------+-------------+---------------------+--------------+
| 001      | user_01 | merchant_01 | 2023-03-01 12:30:00 | 50.0         |
| 002      | user_01 | merchant_01 | 2023-03-01 13:45:00 | 75.5         |
| 003      | user_01 | merchant_02 | 2023-03-01 14:00:00 | 100.0        |
| 004      | user_01 | merchant_03 | 2023-03-02 09:15:00 | 25.0         |
| 005      | user_01 | merchant_03 | 2023-03-02 10:30:00 | 150.25       |
| 006      | user_01 | merchant_01 | 2023-03-02 11:00:00 | 500.0        |
| 007      | user_01 | merchant_02 | 2023-03-03 08:00:00 | 80.0         |
| 008      | user_02 | merchant_01 | 2023-03-03 09:30:00 | 120.0        |
| 009      | user_02 | merchant_02 | 2023-03-04 13:45:00 | 65.0         |
| 010      | user_02 | merchant_03 | 2023-03-04 14:00:00 | 150.0        |
| 011      | user_02 | merchant_03 | 2023-03-05 11:30:00 | 20.0         |
| 012      | user_02 | merchant_03 | 2023-03-05 12:00:00 | 105.0        |
| 013      | user_03 | merchant_02 | 2023-03-05 13:15:00 | 250.0        |
| 014      | user_03 | merchant_01 | 2023-03-06 09:45:00 | 30.0         |
| 015      | user_03 | merchant_02 | 2023-03-06 10:00:00 | 90.5         |
+----------+---------+-------------+---------------------+--------------+

二、分析

这是个典型的连续问题,通常使用的row_number()开窗函数进行计算,但是该题目与连续登录不一样,连续登录可以计算日期差等方式,这个数据中不存在默认的日期差。

维度

评分

题目难度

⭐️⭐️⭐️

题目清晰度

⭐️⭐️⭐️⭐️⭐️

业务常见度

⭐️⭐️⭐️⭐️⭐️

三、SQL

1.分别按照用户分组按照支付时间进行排序得到 u_rn,用户和商户分组按照支付时间排序u_m_rn,执行语句和结果如下,为了能保持原来的数据顺序,我按照订单ID进行了排序。

执行SQL

代码语言:javascript
复制
select order_id,
       user_id,
       merchant_id,
       pay_time,
       pay_amount,
       row_number() over (partition by user_id order by pay_time)             as u_rn,
       row_number() over (partition by user_id,merchant_id order by pay_time) as u_m_rn
from t_user_pay
order by order_id asc;

执行结果

代码语言:javascript
复制
+-----------+----------+--------------+------------------------+-------------+-------+---------+
| order_id  | user_id  | merchant_id  |        pay_time        | pay_amount  | u_rn  | u_m_rn  |
+-----------+----------+--------------+------------------------+-------------+-------+---------+
| 001       | user_01  | merchant_01  | 2023-03-01 12:30:00.0  | 50.0        | 1     | 1       |
| 002       | user_01  | merchant_01  | 2023-03-01 13:45:00.0  | 75.5        | 2     | 2       |
| 003       | user_01  | merchant_02  | 2023-03-01 14:00:00.0  | 100.0       | 3     | 1       |
| 004       | user_01  | merchant_03  | 2023-03-02 09:15:00.0  | 25.0        | 4     | 1       |
| 005       | user_01  | merchant_03  | 2023-03-02 10:30:00.0  | 150.25      | 5     | 2       |
| 006       | user_01  | merchant_01  | 2023-03-02 11:00:00.0  | 500.0       | 6     | 3       |
| 007       | user_01  | merchant_02  | 2023-03-03 08:00:00.0  | 80.0        | 7     | 2       |
| 008       | user_02  | merchant_01  | 2023-03-03 09:30:00.0  | 120.0       | 1     | 1       |
| 009       | user_02  | merchant_02  | 2023-03-04 13:45:00.0  | 65.0        | 2     | 1       |
| 010       | user_02  | merchant_03  | 2023-03-04 14:00:00.0  | 150.0       | 3     | 1       |
| 011       | user_02  | merchant_03  | 2023-03-05 11:30:00.0  | 20.0        | 4     | 2       |
| 012       | user_02  | merchant_03  | 2023-03-05 12:00:00.0  | 105.0       | 5     | 3       |
| 013       | user_03  | merchant_02  | 2023-03-05 13:15:00.0  | 250.0       | 1     | 1       |
| 014       | user_03  | merchant_01  | 2023-03-06 09:45:00.0  | 30.0        | 2     | 1       |
| 015       | user_03  | merchant_02  | 2023-03-06 10:00:00.0  | 90.5        | 3     | 2       |
+-----------+----------+--------------+------------------------+-------------+-------+---------+

2.计算出u_rn与u_m_rn的差值,为了方便进行比较,我把该列放到了商户字段后面。

执行SQL

代码语言:javascript
复制
select order_id,
       user_id,
       merchant_id,
       u_rn - u_m_rn as diff,
       pay_time,
       pay_amount,
       u_rn,
       u_m_rn
from (select order_id,
             user_id,
             merchant_id,
             pay_time,
             pay_amount,
             row_number() over (partition by user_id order by pay_time)             as u_rn,
             row_number() over (partition by user_id,merchant_id order by pay_time) as u_m_rn
      from t_user_pay) t
order by order_id asc;

执行结果

代码语言:javascript
复制
+-----------+----------+--------------+-------+------------------------+-------------+-------+---------+
| order_id  | user_id  | merchant_id  | diff  |        pay_time        | pay_amount  | u_rn  | u_m_rn  |
+-----------+----------+--------------+-------+------------------------+-------------+-------+---------+
| 001       | user_01  | merchant_01  | 0     | 2023-03-01 12:30:00.0  | 50.0        | 1     | 1       |
| 002       | user_01  | merchant_01  | 0     | 2023-03-01 13:45:00.0  | 75.5        | 2     | 2       |
| 003       | user_01  | merchant_02  | 2     | 2023-03-01 14:00:00.0  | 100.0       | 3     | 1       |
| 004       | user_01  | merchant_03  | 3     | 2023-03-02 09:15:00.0  | 25.0        | 4     | 1       |
| 005       | user_01  | merchant_03  | 3     | 2023-03-02 10:30:00.0  | 150.25      | 5     | 2       |
| 006       | user_01  | merchant_01  | 3     | 2023-03-02 11:00:00.0  | 500.0       | 6     | 3       |
| 007       | user_01  | merchant_02  | 5     | 2023-03-03 08:00:00.0  | 80.0        | 7     | 2       |
| 008       | user_02  | merchant_01  | 0     | 2023-03-03 09:30:00.0  | 120.0       | 1     | 1       |
| 009       | user_02  | merchant_02  | 1     | 2023-03-04 13:45:00.0  | 65.0        | 2     | 1       |
| 010       | user_02  | merchant_03  | 2     | 2023-03-04 14:00:00.0  | 150.0       | 3     | 1       |
| 011       | user_02  | merchant_03  | 2     | 2023-03-05 11:30:00.0  | 20.0        | 4     | 2       |
| 012       | user_02  | merchant_03  | 2     | 2023-03-05 12:00:00.0  | 105.0       | 5     | 3       |
| 013       | user_03  | merchant_02  | 0     | 2023-03-05 13:15:00.0  | 250.0       | 1     | 1       |
| 014       | user_03  | merchant_01  | 1     | 2023-03-06 09:45:00.0  | 30.0        | 2     | 1       |
| 015       | user_03  | merchant_02  | 1     | 2023-03-06 10:00:00.0  | 90.5        | 3     | 2       |
+-----------+----------+--------------+-------+------------------------+-------------+-------+---------+

3.根据user_id,merchant_id,u_rn-u_m_rn进行分组,取最小订单时间和金额求和,得到最终结果

执行SQL

代码语言:javascript
复制
select user_id,
       merchant_id,
       min(pay_time)   as new_pay_time,
       sum(pay_amount) as new_pay_amount
from (select order_id,
             user_id,
             merchant_id,
             pay_time,
             pay_amount,
             row_number() over (partition by user_id order by pay_time)             as u_rn,
             row_number() over (partition by user_id,merchant_id order by pay_time) as u_m_rn
      from t_user_pay) t
group by user_id, merchant_id, u_rn - u_m_rn
;

执行结果

代码语言:javascript
复制
+----------+--------------+------------------------+-----------------+
| user_id  | merchant_id  |      new_pay_time      | new_pay_amount  |
+----------+--------------+------------------------+-----------------+
| user_01  | merchant_01  | 2023-03-01 12:30:00.0  | 125.5           |
| user_01  | merchant_01  | 2023-03-02 11:00:00.0  | 500.0           |
| user_01  | merchant_02  | 2023-03-01 14:00:00.0  | 100.0           |
| user_01  | merchant_02  | 2023-03-03 08:00:00.0  | 80.0            |
| user_01  | merchant_03  | 2023-03-02 09:15:00.0  | 175.25          |
| user_02  | merchant_01  | 2023-03-03 09:30:00.0  | 120.0           |
| user_02  | merchant_02  | 2023-03-04 13:45:00.0  | 65.0            |
| user_02  | merchant_03  | 2023-03-04 14:00:00.0  | 275.0           |
| user_03  | merchant_01  | 2023-03-06 09:45:00.0  | 30.0            |
| user_03  | merchant_02  | 2023-03-05 13:15:00.0  | 250.0           |
| user_03  | merchant_02  | 2023-03-06 10:00:00.0  | 90.5            |
+----------+--------------+------------------------+-----------------+

四、建表语句和数据插入

代码语言:javascript
复制
--建表语句
CREATE TABLE t_user_pay (
  order_id STRING,
  user_id STRING,
  merchant_id STRING,
  pay_time TIMESTAMP,
  pay_amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
--数据插入

INSERT INTO t_user_pay VALUES
  ('001', 'user_01', 'merchant_01', '2023-03-01 12:30:00', 50.0),
  ('002', 'user_01', 'merchant_01', '2023-03-01 13:45:00', 75.5),
  ('003', 'user_01', 'merchant_02', '2023-03-01 14:00:00', 100.0),
  ('004', 'user_01', 'merchant_03', '2023-03-02 09:15:00', 25.0),
  ('005', 'user_01', 'merchant_03', '2023-03-02 10:30:00', 150.25),
  ('006', 'user_01', 'merchant_01', '2023-03-02 11:00:00', 500.0),
  ('007', 'user_01', 'merchant_02', '2023-03-03 08:00:00', 80.0),
  ('008', 'user_02', 'merchant_01', '2023-03-03 09:30:00', 120.0),
  ('009', 'user_02', 'merchant_02', '2023-03-04 13:45:00', 65.0),
  ('010', 'user_02', 'merchant_03', '2023-03-04 14:00:00', 150.0),
  ('011', 'user_02', 'merchant_03', '2023-03-05 11:30:00', 20.0),
  ('012', 'user_02', 'merchant_03', '2023-03-05 12:00:00', 105.0),
  ('013', 'user_03', 'merchant_02', '2023-03-05 13:15:00', 250.0),
  ('014', 'user_03', 'merchant_01', '2023-03-06 09:45:00', 30.0),
  ('015', 'user_03', 'merchant_02', '2023-03-06 10:00:00', 90.5);

原文链接

本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。
原始发表:2024-06-16,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 数据仓库技术 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 一、题目
  • 二、分析
  • 三、SQL
    • 1.分别按照用户分组按照支付时间进行排序得到 u_rn,用户和商户分组按照支付时间排序u_m_rn,执行语句和结果如下,为了能保持原来的数据顺序,我按照订单ID进行了排序。
      • 2.计算出u_rn与u_m_rn的差值,为了方便进行比较,我把该列放到了商户字段后面。
        • 3.根据user_id,merchant_id,u_rn-u_m_rn进行分组,取最小订单时间和金额求和,得到最终结果
        • 四、建表语句和数据插入
        相关产品与服务
        大数据
        全栈大数据产品,面向海量数据场景,帮助您 “智理无数,心中有数”!
        领券
        问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档