HiveSQL刷题(查询销售件数高于品类平均数的商品)

阅读: 评论:0

HiveSQL刷题(查询销售件数高于品类平均数的商品)

HiveSQL刷题(查询销售件数高于品类平均数的商品)

Platform address:vue-sqleditor

题目需求:

从订单明细表(order_detail)中查询累积销售件数高于其所属品类平均数的商品

期望结果如下:

sku_id<string>    name<string>    sum_num<bigint>    cate_avg_num<bigint>

2    手机壳    6044    1546

5    破壁机    242    194

7    热水壶    252    194

8    微波炉    253    194

12    遮阳伞    20682    5373

需要用到的表:

商品信息表:sku_info

sku_id(商品id)    name(商品名称)    category_id(分类id)    from_date(上架日期)    price(商品价格)

1    xiaomi 10    1    2020-01-01    2000

6    洗碗机    2    2020-02-01    2000

9    自行车    3    2020-01-01    1000

订单明细表:order_detail

order_detail_id(订单明细id)    order_id(订单id)    sku_id(商品id)    create_date(下单日期)    price(商品单价)    sku_num(商品件数)

1    1    1    2021-09-30    2000.00    2

2    1    3    2021-09-30    5000.00    5

22    10    4    2020-10-02    6000.00    1

23    10    5    2020-10-02    500.00    24

24    10    6    2020-10-02    2000.00    5

建表语句:

CREATE TABLE `sku_info`(
      `sku_id` bigint, 
      `name` string, 
      `category_id` bigint, 
      `from_date` string, 
      `price` bigint)
ROW FORMAT DELIMITED  
  FIELDS TERMINATED BY 't'
STORED AS INPUTFORMAT  
  'org.apache.hadoop.mapred.TextInputFormat'  
OUTPUTFORMAT  
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 

CREATE TABLE `order_detail`(
      `order_detail_id` bigint, 
      `order_id` bigint, 
      `sku_id` bigint, 
      `create_date` string, 
      `price` double, 
      `sku_num` bigint)

ROW FORMAT DELIMITED  

  FIELDS TERMINATED BY 't'

STORED AS INPUTFORMAT  

  'org.apache.hadoop.mapred.TextInputFormat'  

OUTPUTFORMAT  

  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

插入样例数据:

Insert into sku_info 

Select 1,'xiaomi 10',1,'2020-01-01',2000

Union all

Select 6,'洗碗机',2,'2020-02-01',2000

Union all

Select 9,'自行车',3,'2020-01-01',1000;

Insert into category_info

Select 1,'数码'

Union all

Select 2,'厨卫'

Union all

Select 3,'户外';

Insert into order_detail 

Select 1,1,1,'2021-09-30',2000.00,2

Union all

Select 2,1,3,'2021-09-30',5000.00,5

Union all

Select 22,10,4,'2020-10-02',6000.00,1

Union all

Select 23,10,5,'2020-10-02',500.00,24

Union all

Select 24,10,6,'2020-10-02',2000.00,5;

需求分析:

select category_id,s1.sku_id,name,sum(sku_num) as sum_num from order_detail s1 join sku_info s2 on s1.sku_id=s2.sku_id group by category_id,s1.sku_id,name

首先获取到每个商品对应的品类号、名称及销量;

select category_id,sku_id,name,sum_num,avg(sum_num) over(partition by category_id) as cate_avg_num from (select category_id,s1.sku_id,name,sum(sku_num) as sum_num from order_detail s1 join sku_info s2 on s1.sku_id=s2.sku_id group by category_id,s1.sku_id,name)s3 group by category_id,sku_id,name,sum_num

其次对于每个品类开窗求得每个品类的销量均值;

select sku_id,name,sum_num,cast(cate_avg_num as bigint) as cate_avg_num from (select category_id,sku_id,name,sum_num,avg(sum_num) over(partition by category_id) as cate_avg_num from (select category_id,s1.sku_id,name,sum(sku_num) as sum_num from order_detail s1 join sku_info s2 on s1.sku_id=s2.sku_id group by category_id,s1.sku_id,name)s3 group by category_id,sku_id,name,sum_num)rs where sum_num>cate_avg_num

之后取出每个销量大于其品类销量均值的商品列表,并将销量均值由float转为bigint;

最终SQL:

selectsku_id,name,sum_num,cast(cate_avg_num as bigint) as cate_avg_num
from(selectcategory_id,sku_id,name,sum_num,avg(sum_num) over(partition by category_id) as cate_avg_numfrom(selectcategory_id,s1.sku_id,name,sum(sku_num) as sum_numfromorder_detail s1join sku_info s2 on s1.sku_id = s2.sku_idgroup bycategory_id,s1.sku_id,name) s3group bycategory_id,sku_id,name,sum_num) rs
wheresum_num > cate_avg_num

AC时间:

2022年12月20日22:29:58

本文发布于:2024-02-05 05:48:53,感谢您对本站的认可!

本文链接:https://www.4u4v.net/it/170725542763568.html

版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。

标签:件数   平均数   品类   高于   商品
留言与评论(共有 0 条评论)
   
验证码:

Copyright ©2019-2022 Comsenz Inc.Powered by ©

网站地图1 网站地图2 网站地图3 网站地图4 网站地图5 网站地图6 网站地图7 网站地图8 网站地图9 网站地图10 网站地图11 网站地图12 网站地图13 网站地图14 网站地图15 网站地图16 网站地图17 网站地图18 网站地图19 网站地图20 网站地图21 网站地图22/a> 网站地图23