-
********************************5 *******************************
-
**********************************************************************
*********
************************************************************************************************************************************************************************************************************************************************************************************************************************
**************************************************************************************
Table 1: netflix_content
col_name | col_type -------------------+--------------------- content_id | bigint content_type | character varying(10) -- either TV or Movie original_language | character varying(2) -- 'EN', 'ES', 'CN', 'KO' release_date | date
Table 2: netflix_daily_streaming
Daily aggregated watch time by account by content.
col_name | col_type --------------+--------------------- date | date account_id | bigint content_id | bigint duration | int -- in seconds
Sample results
content_id | watch_time_in_seconds ------------+----------------------- 1000789 | 25212 1000638 | 24627 1000577 | 23831 1000409 | 23827 1000865 | 22249
Solution postgres
WITH movies AS (
SELECT content_id, release_date
FROM netflix_content
WHERE content_type = 'Movie'
)
SELECT M.content_id, SUM(duration) watch_time_in_seconds
FROM movies M
INNER JOIN netflix_daily_streaming S
ON S.content_id = M.content_id
WHERE S.date <= M.release_date + 28
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
Explanation
This query is looking at two tables: "netflix_content" and "netflix_daily_streaming". It is selecting movies from the "netflix_content" table and then joining that with the "netflix_daily_streaming" table to calculate the total watch time in seconds for each movie within the first 28 days of its release.
The "WITH" statement creates a temporary table named "movies" that only includes movies (not TV shows or other types of content) from the "netflix_content" table along with their release dates.
The "INNER JOIN" clause then combines the "movies" table with the "netflix_daily_streaming" table using the "content_id" field as the common link between the two tables. The "WHERE" clause filters the results to only include movies that were watched within 28 days of their release date.
Finally, the "GROUP BY" clause groups the results by the "content_id" field, and the "SUM" function calculates the total watch time in seconds for each movie. The results are then ordered in descending order by the watch time and limited to the top 5 movies.
Expected results