165. Session stitching

hard walmart

  • **********************************************************************************************************************************************************
  • ***************************************************************************************************************************************************************************************************
  • ********************************************************************************************************************************************************************************************************************************************
  • ******************************************************************************************
  • **************************************************************************************************************************** > ****.

Table: walmart_logins

Every time a user sign in to walmart.com, we take a snapshot and add a new raw_session_id

  col_name      | col_type
----------------+-------------------
cust_id         | bigint
login_ts        | timestamp
raw_session_id  | bigint

Sample results

customer_id | raw_session_id |      login_ts
-------------+----------------+---------------------
        8001 |          20001 | 2020-02-14 01:15:01
        8001 |          20003 | 2020-02-14 01:55:01
        8001 |          20004 | 2020-02-15 15:10:01
        8001 |          20008 | 2020-02-15 17:15:01
        8001 |          20010 | 2020-02-16 08:15:01
        8002 |          20011 | 2020-02-15 01:15:01
        8002 |          20013 | 2020-02-15 01:55:01
        8003 |          20014 | 2020-02-16 15:10:01
        8003 |          20016 | 2020-02-16 15:35:01
        8004 |          20017 | 2020-02-17 17:15:01
        8005 |          20019 | 2020-02-18 08:15:01

Solution postgres

-- Stitching sessions together
WITH last_logins AS (
    SELECT customer_id, login_ts, raw_session_id,
    LAG(login_ts, 1) OVER(PARTITION BY customer_id ORDER BY login_ts) AS last_login_ts
    FROM walmart_logins
)
SELECT
    customer_id,
    raw_session_id,
    login_ts
FROM last_logins
WHERE EXTRACT(EPOCH FROM login_ts - last_login_ts)/60 >= 5
OR last_login_ts IS NULL;
    

Explanation

This query is used to stitch together sessions of customers who have logged into the Walmart website. It does this by using the "WITH" statement to create a temporary table called "last_logins". This table selects the customer ID, login timestamp, and session ID from the "walmart_logins" table. It also uses the "LAG" function to find the timestamp of the customer's previous login.

The main query then selects the customer ID, session ID, and login timestamp from the "last_logins" table. It filters the results to only show sessions where the time between the current login and the previous login is greater than or equal to 5 minutes, or where there is no previous login timestamp available (i.e. the first login).

This query is helpful for analyzing customer behavior on the Walmart website, as it allows analysts to see how long customers typically spend on the site and how frequently they log in.

Expected results


More Walmart questions

ID Title Level FTPR
162 Number of orders per brand walmart easy
41%
163 Average number of orders per brand walmart easy
32%
164 Orders distribution by brand walmart easy
28%
166 Top 10 customers based on spend growth walmart medium
11%