- *********************************************************************
- *********************************
- ***************************************************************************************
Table 1: apple_search
Search id by two different ab testing groups: --0: control, 1, testing
col_name | col_type -------------------------+-------------------------- search_id | uuid ab_testing_bucket_id | smallint --0: control, 1, testing
Table 2: apple_search_results
Search results history, if a user keeps scrolling down, more page_url will be shown to the user.
col_name | col_type --------------+-------------------------- search_id | uuid result_id | uuid page_url | url is_shown | boolean --true or false
Sample results
ab_testing_bucket_id | percentage_shown ----------------------+--------------------- 0 | 36.3970088375254929 1 | 37.4620522161505768
Solution postgres
SELECT
s.ab_testing_bucket_id,
(SUM(CASE WHEN r.is_shown = TRUE THEN 1 ELSE 0 END)* 100.0 / COUNT(r.result_id)) AS percentage_shown
FROM
apple_search s
JOIN
apple_search_results r ON s.search_id = r.search_id
GROUP BY
s.ab_testing_bucket_id;
Explanation
This PostgreSQL query is used to calculate the percentage of search results that were shown to users, grouped by A/B testing bucket.
Here's a breakdown of what the query does:
-
SELECT
: This is the command used to select data from a database. -
s.ab_testing_bucket_id
: This is the id used for A/B testing. A/B testing is a method of comparing two versions of a webpage or other user experience to see which performs better. -
SUM(CASE WHEN r.is_shown = TRUE THEN 1 ELSE 0 END)
: This line is using aCASE
statement inside theSUM()
function. It counts the number of timesis_shown
is true (meaning the result was shown to a user). -
100.0 / COUNT(r.result_id)
: This line is calculating the percentage. It's dividing the number of timesis_shown
is true by the total number of results, and then multiplying by 100 to get a percentage. -
FROM apple_search s
: This line is specifying the table we are selecting data from (apple_search
), and aliasing it ass
for easier reference later in the query. -
JOIN apple_search_results r ON s.search_id = r.search_id
: This line is joining theapple_search
table with theapple_search_results
table on thesearch_id
field. This allows us to use data from both tables in our query. -
GROUP BY s.ab_testing_bucket_id
: This line is grouping the results by A/B testing bucket. This means we will get a separate percentage for each bucket.
Overall, this query is useful for understanding how often search results are actually shown to users, and how this varies between different A/B testing buckets.
Expected results