Instruction
- Write a query to return the percentage of revenue for each of the following films:
film_id <= 10
by its category. - Formula:
revenue (film_id x) * 100.0/ revenue of all movies in the same category
. - The order of your results doesn't matter.
- Return 3 columns: film_id, category name, and percentage.
Table 1: category
Movie categories.
col_name | col_type -------------+-------------------------- category_id | integer name | text
Table 2: film
col_name | col_type ----------------------+-------------------------- film_id | integer title | text description | text release_year | integer language_id | smallint original_language_id | smallint rental_duration | smallint rental_rate | numeric length | smallint replacement_cost | numeric rating | text
Table 3: film_category
A film can only belong to one category
col_name | col_type -------------+-------------------------- film_id | smallint category_id | smallint
Table 4: inventory
Each row is unique, inventoy_id is the primary key of this table.
col_name | col_type --------------+-------------------------- inventory_id | integer film_id | smallint store_id | smallint
Table 5: payment
Movie rental payment transactions table
col_name | col_type --------------+-------------------------- payment_id | integer customer_id | smallint staff_id | smallint rental_id | integer amount | numeric payment_ts | timestamp with time zone
Table 6: rental
col_name | col_type --------------+-------------------------- rental_id | integer rental_ts | timestamp with time zone inventory_id | integer customer_id | smallint return_ts | timestamp with time zone staff_id | smallint
Sample results
film_id | category_name | revenue_percent_category ---------+---------------+-------------------------- 1 | Documentary | 0.87183937479845975834 2 | Horror | 1.4218786097664498 3 | Documentary | 0.89815815929740700696 4 | Horror | 2.4652522202582108 5 | Family | 1.2276180943524362
Expected results
Solution postgres
WITH movie_revenue AS (
SELECT
I.film_id, SUM(P.amount) revenue
FROM payment P
INNER JOIN rental R
ON R.rental_id = P.rental_id
INNER JOIN inventory I
ON I.inventory_id = R.inventory_id
GROUP BY I.film_id
),rev_percentage AS (
SELECT
MR.film_id,
C.name category_name,
revenue * 100.0 / SUM(revenue) OVER(PARTITION BY C.name) revenue_percent_category
FROM movie_revenue MR
INNER JOIN film_category FC
ON FC.film_id = MR.film_id
INNER JOIN category C
ON C.category_id = FC.category_id
)
SELECT * FROM rev_percentage
WHERE film_id <= 10
ORDER BY film_id
;
Explanation
This query is trying to analyze the percentage of movie revenue for each category of films.
The first part of the query creates a temporary table called `movie_revenue` that sums up the total revenue for each movie by joining the `payment`, `rental`, and `inventory` tables.
The second part of the query creates another temporary table called `rev_percentage` that calculates the percentage of revenue for each movie within its corresponding category. It does this by joining the `movie_revenue` table with the `film_category` and `category` tables, and then using the `SUM()` function with the `OVER()` clause to calculate the total revenue for each category.
Finally, the query selects all the columns from the `rev_percentage` table where the film_id is less than or equal to 10 and sorts the results by film_id.
Overall, this query helps a data analyst understand how movie revenue is distributed among different categories of films.