- **************************************************************************************
- ********************************************************************************************
- ******************************************
Table 1: orders
An eCommerce company's online order table.
col_name | col_type --------------+------------------- order_id | bigint product_id | bigint customer_id | bigint order_dt | date qty | integer unit_price_usd| float channel | varchar(20) -- mobile, desktop
Table 2: product_category
One product can only belong to one category
col_name | col_type ----------------------+------------------- product_id | bigint product_category_id | smallint
Sample results
product_category_id | product_id | quantities ---------------------+------------+------------+--------- 1 | 10000005 | 132 2 | 10000018 | 125 3 | 10000053 | 118
Solution postgres
WITH prod_cat_qty AS (
SELECT product_category_id, O.product_id, SUM(qty) AS quantities
FROM orders O
INNER JOIN product_category C
ON C.product_id = O.product_id
GROUP BY product_category_id, O.product_id
)
SELECT product_category_id, product_id, quantities
FROM (
SELECT product_category_id, product_id, quantities, RANK() OVER(PARTITION BY product_category_id ORDER BY quantities DESC) as ranking
FROM prod_cat_qty
) X
WHERE X.ranking=1;
Explanation
This query is trying to find the most popular products in each product category.
The first part of the query creates a common table expression (CTE) called "prod_cat_qty". It joins the "orders" table with the "product_category" table on the "product_id" column and groups the results by "product_category_id" and "product_id". It then calculates the sum of the "qty" column for each group and renames it as "quantities".
The second part of the query selects the "product_category_id", "product_id", and "quantities" columns from a subquery that is created from the "prod_cat_qty" CTE. This subquery uses the RANK() function to assign a ranking to each product within its product category based on the "quantities" column, with the highest quantity being ranked as 1.
Finally, the outer query filters the results to only show products that have a ranking of 1 within their respective product category. This essentially gives us the most popular products in each product category.
Expected results