187. Comment distribution at Google Forum

easy google

  • *****************************************************************************************************************
  • *****************************************************************************************************
    • 0
    • 1
    • >=2, <5
    • >=5

Table: google_forum

Google's internal Q&A forum.

   col_name   | col_type
--------------+--------------------------
 created_dt   | date
 post_id      | int
 post_type    | varchar(20) -- either a question or a comment 
 employee_id  | int
 content      | text 

Sample results

 num_comments | cnt
--------------+-----
 0            |  42
 1            |  40
 >=2, <5      |  13

Solution postgres

WITH questions AS (
    SELECT post_id, created_dt
    FROM google_forum
    WHERE post_type = 'question'
),
comments AS (
    SELECT post_id, created_dt
    FROM google_forum
    WHERE post_type = 'comment'
),
comments_stat AS (
    SELECT
        CASE WHEN num_comments = 0 THEN '0'
             WHEN num_comments = 1 THEN '1'
             WHEN num_comments <5  THEN '>=2, <5'
             ELSE '>=5' END AS num_comments,
        post_id
    FROM (
        SELECT Q.post_id, COUNT(C.post_id) AS num_comments
        FROM questions Q
        LEFT JOIN comments C
        ON Q.post_id = C.post_id
        AND Q.created_dt = C.created_dt
        GROUP BY 1
    ) X
)
SELECT num_comments, COUNT(DISTINCT post_id) as cnt
FROM comments_stat
GROUP BY 1;
    

Explanation

This query is used to analyze a Google forum and group posts by the number of comments they have received.

The query makes use of Common Table Expressions (CTEs) to break down the data into smaller, more manageable pieces.

The first CTE, "questions," selects all posts from the "google_forum" table where the post type is "question" and retrieves the post ID and creation date.

The second CTE, "comments," selects all posts from the "google_forum" table where the post type is "comment" and retrieves the post ID and creation date.

The third CTE, "comments_stat," creates a new table that counts the number of comments for each post and groups them into categories based on the number of comments.

The final SELECT statement selects the number of comments and the count of posts for each category and groups them by the number of comments.

Overall, this query provides insight into the level of engagement and activity on the Google forum, and can help identify popular or important posts based on the number of comments they have received.

Expected results


More Google questions

ID Title Level FTPR
101 Average number of visits per user google easy
14%
102 Histogram by visit session duration google hard
17%
103 Median and average session duration by day of the week google medium
20%
104 MAU: Monthly active users google easy
20%
105 Number of days gap between last two actions google hard
8%
106 Mobile vs. web google medium
10%
107 Customer count by platform google medium
17%
147 2 days streak customers google hard
13%
148 Most popular video category google medium
13%
149 fans by video category google easy
36%
150 Countries with above average customers google easy
15%
186 Average comments per question google easy
10%
188 Top 1 popular question by department google hard
14%