- *****************************************************************************************
Table 1: orders
An eCommerce company's online order table.
col_name | col_type --------------+------------------- order_id | bigint product_id | bigint customer_id | bigint order_dt | date qty | integer unit_price_usd| float channel | varchar(20) -- mobile, desktop
Table 2: product_category
One product can only belong to one category
col_name | col_type ----------------------+------------------- product_id | bigint product_category_id | smallint
Sample results
product_category_id | ranking ---------------------+------------------ 3 | 1 2 | 2 4 | 3
Solution postgres
WITH prod_cat_orders AS (
SELECT product_category_id, COUNT(*) AS num_orders
FROM orders O
INNER JOIN product_category C
ON C.product_id = O.product_id
GROUP BY product_category_id
)
SELECT
product_category_id,
RANK() OVER(ORDER BY num_orders DESC) AS ranking
FROM prod_cat_orders
LIMIT 3;
Explanation
This query is used to find the top 3 product categories that have the most orders.
The first part of the query creates a temporary table called "prod_cat_orders". This table lists the number of orders made for each product category by joining the "orders" table with the "product_category" table on their product_id columns and grouping by the product category.
The second part of the query selects the product_category_id from the temporary table and ranks them by the number of orders they have in descending order using the RANK() function. The result is limited to the top 3 rows using the LIMIT keyword.
Overall, this query helps a data analyst to identify the most popular product categories and can be used to make business decisions related to inventory management and marketing strategies.
Expected results