Forum

Posted by Sylvia, April 4, 2022, 6:09 p.m.

Question #40 Top 2 Movies

My solution is giving me three top 2 movies because there is a tie for being the 2nd most popular movie. 

My code is listed as below: 

WITH rent_time AS 
(
  SELECT a.film_id, 
  count(b.rental_id) as num
  FROM inventory a 
  INNER JOIN rental b
  on a.inventory_id = b.inventory_id 
  WHERE EXTRACT(MONTH FROM b.rental_ts) = '6' AND EXTRACT(YEAR FROM b.rental_ts) = '2020'
  GROUP BY a.film_id
  ), 
  
 cte AS 
 (SELECT *, RANK() over (ORDER BY num DESC) as rank
  FROM rent_time
  )
  
  SELECT a.film_id, a.title 
  FROM film a
  INNER JOIN cte b
  on a.film_id = b.film_id 
  WHERE b.rank IN (1, 2)
 

In the case of having a TOP 1 but 2 TOP 2 movies, should I use RANK() OVER function or LIMIT? If it were in a real interview, should I ask the interviewer to clarify how to take care of the ties, such as "Would you like to see 3 results for the TOP 1 movie when there is a tie"? 

Answers

Hi Sylvia,

Your code looks great.

1. It's actually our fault, there seem to be some edge cases when we updated the database last time which introduced a tie situation for top 3, which we just fixed.

2. In a real interview, it's always a great idea to clarify the question and edge cases with the interviewer before starting coding.

3. For this question, I personally prefer to use LIMIT instead of RANK(), as there is no subgroup here, if the question is TOP 2 movies by different subgroups (e.g., movie genre, or month), then I will use a window function such as RANK().

I've written an article with some tips and tricks for SQL interviews, feel free to check it out here:

https://sqlpad.io/tutorial/faang-sql-interview-questions

Thanks and please feel free to let us know if you have any further questions.

Leon

Thank you very much for your detailed answer. 

I have a follow-up question on LIMIT - how will it return results when there is a tie? Will it randomly pick a result to return?

It's helpful to clarify the edge cases with the interviewer before coding. If the interviewer says that he would like ALL the options when there is a tie, does it mean that I have to use rank() or dense_rank()? Row_number() will not work in this case either. 

Sylvia, April 6, 2022, 7:44 p.m.

Sure thing.

Yes, the limit is going to randomly choose from the tied results. . 

If they want all the tied results you will want to use the RANK function instead of ROW_NUMBER. 

Leon, April 6, 2022, 8:06 p.m.
SQLPad user avatar

Leon (949)

April 5, 2022, 11:44 a.m.