184. Top 5 movies by watch time

medium netflix

  • ***************************************************************

  • **********************************************************************

*********

************************************************************************************************************************************************************************************************************************************************************************************************************************

**************************************************************************************

Table 1: netflix_content

 col_name          | col_type
-------------------+---------------------
 content_id        | bigint
 content_type      | character varying(10) -- either TV or Movie
 original_language | character varying(2) -- 'EN',  'ES', 'CN', 'KO'
 release_date      | date

Table 2: netflix_daily_streaming

Daily aggregated watch time by account by content.

  col_name    | col_type
--------------+---------------------
 date         | date
 account_id   | bigint
 content_id   | bigint
 duration     | int -- in seconds

Sample results

 content_id | watch_time_in_seconds
------------+-----------------------
    1000789 |                 25212
    1000638 |                 24627
    1000577 |                 23831
    1000409 |                 23827
    1000865 |                 22249

Solution postgres

WITH movies AS (
  SELECT content_id, release_date
  FROM netflix_content
  WHERE content_type = 'Movie'
)
SELECT M.content_id, SUM(duration) watch_time_in_seconds
FROM movies M
INNER JOIN netflix_daily_streaming S
ON S.content_id = M.content_id
WHERE S.date <= M.release_date + 28
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
    

Explanation

This query is looking at two tables: "netflix_content" and "netflix_daily_streaming". It is selecting movies from the "netflix_content" table and then joining that with the "netflix_daily_streaming" table to calculate the total watch time in seconds for each movie within the first 28 days of its release.

The "WITH" statement creates a temporary table named "movies" that only includes movies (not TV shows or other types of content) from the "netflix_content" table along with their release dates.

The "INNER JOIN" clause then combines the "movies" table with the "netflix_daily_streaming" table using the "content_id" field as the common link between the two tables. The "WHERE" clause filters the results to only include movies that were watched within 28 days of their release date.

Finally, the "GROUP BY" clause groups the results by the "content_id" field, and the "SUM" function calculates the total watch time in seconds for each movie. The results are then ordered in descending order by the watch time and limited to the top 5 movies.

Expected results


More Netflix questions

ID Title Level FTPR
181 Users who watched less than one hour of Netflix netflix easy
19%
182 Top 5 content by number of watchers netflix medium
9%
183 Top 5 movies by number of watchers netflix easy
46%
185 Top 5 non-English contents by watch time netflix medium
49%