Instruction
- Write a query to return the number of customers who rented at least one movie in both May 2020 and June 2020.
Table: rental
col_name | col_type --------------+-------------------------- rental_id | integer rental_ts | timestamp with time zone inventory_id | integer customer_id | smallint return_ts | timestamp with time zone staff_id | smallint
Sample results
count ------- 123
Expected results
Solution 1: postgres
WITH may_cust AS (
SELECT DISTINCT customer_id AS may_cust_id
FROM rental
WHERE DATE(rental_ts) >= '2020-05-01'
AND DATE(rental_ts) <= '2020-05-31'
)
SELECT COUNT(DISTINCT customer_id)
FROM rental
WHERE DATE(rental_ts) >= '2020-06-01'
AND DATE(rental_ts) <= '2020-06-30'
AND customer_id IN (
SELECT may_cust_id
FROM may_cust
);
Explanation
This query is trying to count the number of customers who rented a product in June 2020 but also rented something in May 2020.
The first part of the query creates a common table expression (CTE) named "may_cust" which selects all the unique customer IDs who rented something between May 1, 2020, and May 31, 2020.
The second part of the query selects the count of distinct customer IDs who rented something between June 1, 2020, and June 30, 2020, and also rented something in May 2020.
The "IN" clause in the second part of the query is used to filter the results to only include customers who rented something in May 2020, as identified by the "may_cust" CTE.
Overall, this query is useful for analyzing customer retention and identifying which customers are returning to rent products.
Solution 2: postgres
-- You can also use INNER JOIN to get the overlap between May and June paid customers
WITH may_customers AS (
SELECT DISTINCT customer_id
FROM rental
WHERE DATE(rental_ts) >= '2020-05-01'
AND DATE(rental_ts) <= '2020-05-31'
),
june_customers AS (
SELECT DISTINCT customer_id
FROM rental
WHERE DATE(rental_ts) >= '2020-06-01'
AND DATE(rental_ts) <= '2020-06-30'
)
SELECT COUNT(M.customer_id)
FROM may_customers M
INNER JOIN june_customers J
ON M.customer_id = J.customer_id
;
Explanation
This query retrieves the number of customers who rented from a company in both May and June of 2020. It starts by creating two temporary tables, one for May customers and another for June customers, based on the rental data. Then, it uses an INNER JOIN to combine the two tables and only keep the rows where the customer ID is the same in both tables. Finally, it counts the number of rows in the resulting table, which represents the number of customers who rented in both months.