228. Analyzing Sequential Purchase Behavior

hard microsoft

Using the tables msft_customers, msft_products, and msft_sales, construct a query to identify customers who made a higher-value purchase immediately after a lower-value purchase. Specifically, find instances where a customer's subsequent purchase is at least 50% more expensive than their previous purchase. For each identified instance, return the customer's name, the product names of the lower and higher-value purchases, the sale dates of these purchases, and the percentage increase in the purchase value.

Hints:

  • Use the LAG function to access data from a previous row (purchase) and compare it with the current row.
  • Join msft_sales with msft_customers and msft_products to access the necessary information about customers and products.
  • Ensure your query accounts for the order of purchases using an appropriate ordering criterion, such as the sale date.

Expected Output Columns:

  • Customer Name
  • Lower-Value Product Name
  • Higher-Value Product Name
  • Lower-Value Sale Date
  • Higher-Value Sale Date
  • Percentage Increase in Value

Table 1: msft_customers

Contains details about customers, including their ID, name, age, gender, and the date they joined.


customer_id: int, name: varchar, age: int, gender: varchar, join_date: date

Table 2: msft_products

Holds information on products, such as product ID, name, price, and category.


product_id: int, product_name: varchar, price: decimal, category: varchar

Table 3: msft_sales

Records details of sales transactions, including the sale ID, customer ID, product ID, date of the transaction, and the quantity of products sold.


sale_id: int, customer_id: int, product_id: int, sale_date: date, quantity: int

Solution postgres


SELECT 
    c.name AS customer_name,
    prev_product.product_name AS lower_value_product_name,
    curr_product.product_name AS higher_value_product_name,
    prev_sale.sale_date AS lower_value_sale_date,
    curr_sale.sale_date AS higher_value_sale_date,
    ((curr_product.price - prev_product.price) / prev_product.price) * 100 AS percentage_increase
FROM
    (SELECT *,
            LAG(product_id) OVER (PARTITION BY customer_id ORDER BY sale_date) AS prev_product_id,
            LAG(sale_date) OVER (PARTITION BY customer_id ORDER BY sale_date) AS prev_sale_date
     FROM msft_sales) AS curr_sale
JOIN msft_sales AS prev_sale ON curr_sale.prev_product_id = prev_sale.product_id AND curr_sale.prev_sale_date = prev_sale.sale_date
JOIN msft_customers c ON curr_sale.customer_id = c.customer_id
JOIN msft_products curr_product ON curr_sale.product_id = curr_product.product_id
JOIN msft_products prev_product ON prev_sale.product_id = prev_product.product_id
WHERE curr_product.price > 1.5 * prev_product.price
ORDER BY c.name, curr_sale.sale_date;

    

Last Submission postgres

Expected results


More Microsoft questions

ID Title Level FTPR
218 Total Amount Spent by Each Customer microsoft medium -
219 Inactive Customers in the New Year of 2024 microsoft medium -
220 Most Expensive Product in Each Category microsoft hard -
221 Customers With More Than 5 Products in a Single Transaction microsoft hard -
222 Product Sales Ranking Within Categories microsoft hard -
223 Cumulative Total Sales by Day microsoft hard -
224 Categories with High Average Product Price microsoft medium -
225 Customer Behavior Analysis microsoft hard -
226 Customer Lifetime Value (CLV) and Segmentation microsoft hard -
227 Aggregate Sale Amounts per Customer microsoft hard -
229 Time Between Repeat Purchases microsoft hard -
230 Customer Purchase Trend Analysis microsoft easy -