**************
- **********************************************************************************************************************************************************************************
- ******************************************************************************************************************************************************
- **************************************************************************************************************
- ************************************************************
Table: orders
An eCommerce company's online order table.
col_name | col_type --------------+------------------- order_id | bigint product_id | bigint customer_id | bigint order_dt | date qty | integer unit_price_usd| float channel | varchar(20) -- mobile, desktop
Sample results
product_id ------------ 10000076 10000094 10000092
Solution postgres
SELECT product_id FROM (
SELECT product_id, customer_id
FROM orders
WHERE order_dt >= '2021-08-01'
AND order_dt < '2021-09-01'
GROUP BY product_id, customer_id
HAVING COUNT(*) >= 2
) X
GROUP BY product_id
ORDER BY COUNT(DISTINCT customer_id) DESC
LIMIT 8;
Explanation
This query retrieves the top 8 most popular products based on the number of unique customers who purchased them during the month of August 2021.
First, it selects the product_id column from a subquery that selects all product_id and customer_id pairs from the orders table where the order date is between August 1, 2021 and September 1, 2021, and groups them by product_id and customer_id. The HAVING clause filters out any pairs that don't have at least two occurrences.
Then, the outer query groups the results by product_id and orders them by the count of distinct customer_ids in descending order. The LIMIT clause is used to only return the top 8 results.
Expected results