SQL

SQL Code Analysis

grace21110 2023. 10. 10. 17:23
반응형

Let's analyze the following code:

SELECT 
	*
    , ROW_NUMBER() OVER(ORDER BY RET_RATIO DESC) RNK
FROM (
	SELECT 
		product_id
		, SUM(reordered) / COUNT(*) AS RET_RATIO
	FROM
		order_products__prior
	GROUP BY 1
) A 
WHERE product_id IN (
SELECT product_id 
	FROM (
		SELECT 
			product_id
			, SUM(reordered) AS TOTAL 
		FROM
			order_products__prior
		GROUP BY 1 
	) A
	WHERE TOTAL >= 50
)
;

SELECT product_id 
FROM (
	SELECT 
		product_id
		, SUM(reordered) AS TOTAL
	FROM
		order_products__prior
	GROUP BY 1 
) A
WHERE TOTAL >= 50
;

 

Step 1: Calculate Repurchase Rates and Rank Products

SELECT 
    *
    , ROW_NUMBER() OVER(ORDER BY RET_RATIO DESC) RNK
FROM (
    SELECT 
        product_id
        , SUM(reordered) / COUNT(*) AS RET_RATIO
    FROM
        order_products__prior
    GROUP BY 1
) A 
WHERE product_id IN (
    SELECT product_id 
    FROM (
        SELECT 
            product_id
            , SUM(reordered) AS TOTAL
        FROM
            order_products__prior
        GROUP BY 1 
    ) A
    WHERE TOTAL >= 50
);
  • This part of the code calculates the repurchase rates (RET_RATIO) for each product and assigns a rank (RNK) based on the descending order of these rates.
  • It uses a subquery (the inner query) to calculate the repurchase rates for each product. The subquery groups the data by product_id and calculates the sum of reordered items divided by the count of all items for each product.
  • The outer query then uses the ROW_NUMBER() function to assign a rank to each product based on their repurchase rates in descending order.
  • The WHERE clause filters the products, selecting only those whose product_id is included in another subquery (the second subquery) that filters products with a total order quantity (TOTAL) greater than or equal to 50.

Step 2: List Products Meeting the Order Quantity Threshold

SELECT product_id 
FROM (
    SELECT 
        product_id
        , SUM(reordered) AS TOTAL
    FROM
        order_products__prior
    GROUP BY 1 
) A
WHERE TOTAL >= 50;
  • This part of the code lists the products that meet a minimum total order quantity threshold of 50 or more.
  • The innermost subquery calculates the total order quantity (TOTAL) for each product by summing the reordered items for each product_id.
  • The middle subquery groups the data by product_id and calculates the total order quantity.
  • The outer query selects and displays only the product_id values for products whose total order quantity is greater than or equal to 50.

 

The code is referred a book called SQL로 맛보는 데이터 전처리 분석(노수영 저). 

https://product.kyobobook.co.kr/detail/S000001934242