分析师A:哇 你们这个数据查询平台查Spark ,同样的SQL结果每次都不同,这能用?
我:???第一反应是你八成是在查一张,别人在更新数据的表吧。。(同个SQL也会去跑两遍 小声bb)
过了一会儿,分析师A:看了半天了没人动数据啊,你这个Spark有问题吧?
一脸蒙逼,开始排查问题。
首先把SQL copy执行了下,果然每次都不一样,查了下源表 也没人在变更。
SQL如下:
select b1.shop_grade,der_cnt) as sum_orders
from
(selectshop_id,sum(deal_price) deal_price,count(case when fundable_money,0)&al_price der_id end) order_cntfrom(selectshop_id,order_id,sum(deal_price) deal_price,sum(total_price) total_pricefrom aaawhere stat_day='20190617'and cid1 not in (293,-1)and cid2 not in (422,421,427,525,446)and substr(pay_time,1,10)>='2019-06-01'and substr(pay_time,1,10)<='2019-06-17'group by shop_id,order_id) a
left join(selectorder_id,sum(refundable_money) refundable_moneyfrom bbbwhere stat_day='20190617'and return_status=2group by order_id) b der_idder_id
group by a.shop_id) a1
inner join
(select distinct shop_id,shop_grade
from ccc
where stat_day='20190617'
and user_status=0
and shop_grade is not null) b1
on a1.shop_id=b1.shop_id
group by b1.shop_grade
粗粗看了下,没毛病,left join后inner join得到结果。
又去打了一下执行计划,注意到了有两个地方:
凉了呀,逻辑很通顺,那是咋回事呢??
通过把SQL拆成最小单元,一个一个排查,终于发现了,存在这样的数据:
并且每次执行结果都不一样。
那答案就很明了了:精度 又是坑在精度上了。
Spark在做sum的时候,如果对精度不做限制,并且之后会用到sum的结果做大小比较,就会出现这个问题。
给可能出现精度问题的字段,在sum之前加上cast( xx as decimal(a,b)),其中a是整数部分有效数字,b是小数部分的有效数字。
所以最后把SQL改成这样,问题就解决了:
select b1.shop_grade,der_cnt) as sum_orders
from
(selectshop_id,sum(deal_price) deal_price,count(case when fundable_money,0)&al_price der_id end) order_cntfrom(selectshop_id,order_id,sum(cast(deal_price as decimal(15,2))) deal_price,sum(cast(total_price as decimal(15,2))) total_pricefrom aaawhere stat_day='20190617'and cid1 not in (293,-1)and cid2 not in (422,421,427,525,446)and substr(pay_time,1,10)>='2019-06-01'and substr(pay_time,1,10)<='2019-06-17'group by shop_id,order_id) a
left join(selectorder_id,sum(cast(refundable_money as decimal(15,2))) refundable_moneyfrom bbbwhere stat_day='20190617'and return_status=2group by order_id) b der_idder_id
group by a.shop_id) a1
inner join
(select distinct shop_id,shop_grade
from ccc
where stat_day='20190617'
and user_status=0
and shop_grade is not null) b1
on a1.shop_id=b1.shop_id
group by b1.shop_grade
本文发布于:2024-02-04 20:00:06,感谢您对本站的认可!
本文链接:https://www.4u4v.net/it/170715362659129.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
留言与评论(共有 0 条评论) |