110. Most popular product by category

medium amazon

  • **************************************************************************************
  • ********************************************************************************************
  • ******************************************

Table 1: orders

An eCommerce company's online order table.

  col_name    | col_type
--------------+-------------------
order_id      | bigint
product_id    | bigint
customer_id   | bigint
order_dt      | date
qty           | integer
unit_price_usd| float
channel       | varchar(20) -- mobile, desktop

Table 2: product_category

One product can only belong to one category

  col_name            | col_type
----------------------+-------------------
product_id            | bigint
product_category_id   | smallint

Sample results

 product_category_id | product_id | quantities 
---------------------+------------+------------+---------
                   1 |   10000005 |        132        
                   2 |   10000018 |        125        
                   3 |   10000053 |        118        

Solution postgres

WITH prod_cat_qty AS (    
    SELECT product_category_id, O.product_id, SUM(qty) AS quantities
    FROM orders O
    INNER JOIN product_category C
    ON C.product_id = O.product_id
    GROUP BY product_category_id, O.product_id
)

SELECT product_category_id, product_id, quantities
FROM (
	SELECT product_category_id, product_id, quantities, RANK() OVER(PARTITION BY product_category_id ORDER BY quantities DESC) as ranking
	FROM prod_cat_qty
) X
WHERE X.ranking=1;
    

Explanation

This query is trying to find the most popular products in each product category.

The first part of the query creates a common table expression (CTE) called "prod_cat_qty". It joins the "orders" table with the "product_category" table on the "product_id" column and groups the results by "product_category_id" and "product_id". It then calculates the sum of the "qty" column for each group and renames it as "quantities".

The second part of the query selects the "product_category_id", "product_id", and "quantities" columns from a subquery that is created from the "prod_cat_qty" CTE. This subquery uses the RANK() function to assign a ranking to each product within its product category based on the "quantities" column, with the highest quantity being ranked as 1.

Finally, the outer query filters the results to only show products that have a ranking of 1 within their respective product category. This essentially gives us the most popular products in each product category.

Expected results


More Amazon questions

ID Title Level FTPR
93 number of sales for each product this month amazon easy
20%
94 Top 3 products vs. bottom 3 products amazon medium
7%
95 Candidate products for subscription amazon hard
13%
108 Free premium membership amazon hard
16%
109 Top 3 most popular product categories amazon easy
11%
111 Second most popular product amazon medium
16%
112 Top 3 customers amazon easy
30%
113 Daily cumulative spend amazon easy
11%
114 Dod revenue change rate amazon easy
15%
115 Rolling average revenue amazon hard
28%
116 Top answers day by device amazon easy
14%
117 Rolling 7 days total answers amazon easy
35%
175 Employees' email account amazon easy
14%
176 Employees' annual bonus amazon easy
12%
177 Purchases by platform report amazon medium
10%