Posted by Luming, Oct. 4, 2022, 1:36 p.m.
#193
Hi Leon,
For this question, I got my query like this:
WITH controlgroup AS
(SELECT COUNT(CASE WHEN reply_tweet_id IS NOT NULL THEN tweet_id END) as replies,
COUNT(DISTINCT CASE WHEN reply_tweet_id IS NULL THEN tweet_id END) as original_tweets
FROM tweets
JOIN twitter_experiment_users
ON tweets.user_id = twitter_experiment_users.user_id
WHERE is_treatment IS FALSE AND exp_id = 2001)
,treatmentgroup AS
(SELECT COUNT(CASE WHEN reply_tweet_id IS NOT NULL THEN tweet_id END) as replies,
COUNT(DISTINCT CASE WHEN reply_tweet_id IS NULL THEN tweet_id END) as original_tweets
FROM tweets
JOIN twitter_experiment_users
ON tweets.user_id = twitter_experiment_users.user_id
WHERE is_treatment IS TRUE AND exp_id = 2001)
SELECT treatment_replies - control_replies as diff
FROM
(SELECT controlgroup.replies * 1.0/controlgroup.original_tweets as control_replies,
treatmentgroup.replies * 1.0/treatmentgroup.original_tweets as treatment_replies
FROM controlgroup,treatmentgroup
) sub
I felt like the "replies" part might be the cause of my fault. Can you shed some light on this part?
Best
Ming