102. Histogram by visit session duration

hard google

  • *************************************************************************************************
  • ************************************************************************************
    • *****************************< 50
    • **************************50, 100
    • ***************************100, 150
    • ***************************150, 200
    • ***************************200, 250
    • ****************************** >= 250
  • ********************************************************************************************

Table 1: session_web

Every time a user visits the website, a new session starts, it ends when the user leaves the site.

  col_name       | col_type
-----------------+---------------------
date             | date    
user_id          | bigint
session_id       | bigint
event            | varchar(20)

Table 2: session_web_duration

The number of seconds a user spends by session.

  col_name       | col_type
-----------------+---------------------
date             | date    
session_id       | bigint
duration         | int

Sample results

   date    |   uu_cnt
-----------+--------------
< 50       | 20000
50, 100     | 12000
100, 150     | 5000

Solution postgres

WITH session_user_duration AS (
    SELECT 
        W.session_id,
        W.user_id, 
        MAX(D.duration) AS duration
    FROM session_web W
    inner JOIN session_web_duration D
    ON D.session_id = W.session_id
    GROUP BY W.session_id, W.user_id
),
session_user_label AS (
    SELECT session_id,
           CASE WHEN duration < 50 THEN '< 50'
                WHEN duration < 100 THEN '50, 100'
                WHEN duration < 150 THEN '100, 150'
                WHEN duration < 200 THEN '150, 200'
                WHEN duration < 250 THEN '200, 250'
                WHEN duration >= 250 THEN '>= 250'
               ELSE NULL END AS duration_category,
           user_id
    FROM session_user_duration
)
SELECT duration_category, COUNT(DISTINCT user_id)
FROM session_user_label
GROUP BY duration_category
ORDER BY duration_category;
    

Explanation

This query is used to group web sessions by their duration and user ID, and then categorize those sessions based on their duration time.

The first part of the query, session_user_duration, calculates the maximum duration for each session and user pair by joining the session_web and session_web_duration tables, and grouping the results by session ID and user ID.

The second part of the query, session_user_label, uses the results from the first part to categorize the sessions based on their duration time. The CASE statement creates categories based on different duration ranges, and assigns each session to the appropriate category.

Finally, the query returns the count of unique user IDs for each duration category, sorted by the duration category.

Overall, this query can be used to analyze user engagement with a website or application, by identifying which duration ranges are most popular among users.

Expected results


More Google questions

ID Title Level FTPR
101 Average number of visits per user google easy
14%
103 Median and average session duration by day of the week google medium
20%
104 MAU: Monthly active users google easy
20%
105 Number of days gap between last two actions google hard
8%
106 Mobile vs. web google medium
10%
107 Customer count by platform google medium
17%
147 2 days streak customers google hard
13%
148 Most popular video category google medium
13%
149 fans by video category google easy
36%
150 Countries with above average customers google easy
15%
186 Average comments per question google easy
10%
187 Comment distribution at Google Forum google easy
17%
188 Top 1 popular question by department google hard
14%