Say , we have this table products in Hive.
+----------+-------+-------+
| name | price | notes |
+----------+-------+-------+
| product1 | 100 | |
| product1 | 200 | note1 |
| product2 | 10 | note2 |
| product2 | 5 | note2 |
+----------+-------+-------+
and I expect to get this result (distinct of products with minimum price)
+----------+-------+-------+
| name | price | notes |
+----------+-------+-------+
| product1 | 100 | |
| product2 | 5 | note2 |
+----------+-------+-------+
How do we go about it:
1. The subquery approach:
2. The alternate and better approach is to use window partitioning:
select name,price,notes from (select *, min(price)over(partition by name) as min_price from products) as
where a.price = a.min_price;
The 2nd approach is better as it invokes a single-map task to read the table while the first approach will invoke 2 map tasks even on a single node system.
Jammin' Jars - Casino - Missouri - JSHub
ReplyDeleteJammin' Jars is a 3D slot 계룡 출장샵 machine with 5 reels and 창원 출장안마 3 충주 출장안마 rows with 20 fixed paylines. Hit 김해 출장안마 and miss on the 원주 출장안마 top symbols, and pay from anywhere. The symbols are