187. Comment distribution at Google Forum

easy google

  • Write a query to report the distribution of the number of comments made within the same day a question is posted.
  • Using the following buckets to count the number of questions with the following number of comments.
    • 0
    • 1
    • >=2, < 5

Table: google_forum

Google's internal Q&A forum.

   col_name   | col_type
--------------+--------------------------
 created_dt   | date
 post_id      | int
 post_type    | varchar(20) -- either a question or a comment 
 employee_id  | int
 content      | text 

Sample results

 num_comments | cnt
--------------+-----
 0            |  42
 1            |  40
 >=2, <5      |  13

Expected results

Solution postgres

WITH questions AS (
    SELECT post_id, created_dt
    FROM google_forum
    WHERE post_type = 'question'
),
comments AS (
    SELECT post_id, created_dt
    FROM google_forum
    WHERE post_type = 'comment'
),
comments_stat AS (
    SELECT
        CASE WHEN num_comments = 0 THEN '0'
             WHEN num_comments = 1 THEN '1'
             WHEN num_comments <5  THEN '>=2, <5'
             WHEN num_comments <10  THEN '>=5, <10'
             ELSE '>=10' END AS num_comments,
        post_id
    FROM (
        SELECT Q.post_id, COUNT(C.post_id) AS num_comments
        FROM questions Q
        LEFT JOIN comments C
        ON Q.post_id = C.post_id
        AND Q.created_dt = C.created_dt
        GROUP BY 1
    ) X
)
SELECT num_comments, COUNT(DISTINCT post_id) as cnt
FROM comments_stat
GROUP BY 1;
    

Explanation

This query is used to analyze a Google forum and group posts by the number of comments they have received.

The query makes use of Common Table Expressions (CTEs) to break down the data into smaller, more manageable pieces.

The first CTE, "questions," selects all posts from the "google_forum" table where the post type is "question" and retrieves the post ID and creation date.

The second CTE, "comments," selects all posts from the "google_forum" table where the post type is "comment" and retrieves the post ID and creation date.

The third CTE, "comments_stat," creates a new table that counts the number of comments for each post and groups them into categories based on the number of comments.

The final SELECT statement selects the number of comments and the count of posts for each category and groups them by the number of comments.

Overall, this query provides insight into the level of engagement and activity on the Google forum, and can help identify popular or important posts based on the number of comments they have received.


More Google questions

ID Title Level FTPR
186 Average comments per question google easy
50%
188 Top 1 popular question by department google hard
100%