SQL
SQL Code Analysis
grace21110
2023. 10. 10. 17:23
반응형
Let's analyze the following code:
SELECT
*
, ROW_NUMBER() OVER(ORDER BY RET_RATIO DESC) RNK
FROM (
SELECT
product_id
, SUM(reordered) / COUNT(*) AS RET_RATIO
FROM
order_products__prior
GROUP BY 1
) A
WHERE product_id IN (
SELECT product_id
FROM (
SELECT
product_id
, SUM(reordered) AS TOTAL
FROM
order_products__prior
GROUP BY 1
) A
WHERE TOTAL >= 50
)
;
SELECT product_id
FROM (
SELECT
product_id
, SUM(reordered) AS TOTAL
FROM
order_products__prior
GROUP BY 1
) A
WHERE TOTAL >= 50
;
Step 1: Calculate Repurchase Rates and Rank Products
SELECT
*
, ROW_NUMBER() OVER(ORDER BY RET_RATIO DESC) RNK
FROM (
SELECT
product_id
, SUM(reordered) / COUNT(*) AS RET_RATIO
FROM
order_products__prior
GROUP BY 1
) A
WHERE product_id IN (
SELECT product_id
FROM (
SELECT
product_id
, SUM(reordered) AS TOTAL
FROM
order_products__prior
GROUP BY 1
) A
WHERE TOTAL >= 50
);
- This part of the code calculates the repurchase rates (RET_RATIO) for each product and assigns a rank (RNK) based on the descending order of these rates.
- It uses a subquery (the inner query) to calculate the repurchase rates for each product. The subquery groups the data by product_id and calculates the sum of reordered items divided by the count of all items for each product.
- The outer query then uses the ROW_NUMBER() function to assign a rank to each product based on their repurchase rates in descending order.
- The WHERE clause filters the products, selecting only those whose product_id is included in another subquery (the second subquery) that filters products with a total order quantity (TOTAL) greater than or equal to 50.
Step 2: List Products Meeting the Order Quantity Threshold
SELECT product_id
FROM (
SELECT
product_id
, SUM(reordered) AS TOTAL
FROM
order_products__prior
GROUP BY 1
) A
WHERE TOTAL >= 50;
- This part of the code lists the products that meet a minimum total order quantity threshold of 50 or more.
- The innermost subquery calculates the total order quantity (TOTAL) for each product by summing the reordered items for each product_id.
- The middle subquery groups the data by product_id and calculates the total order quantity.
- The outer query selects and displays only the product_id values for products whose total order quantity is greater than or equal to 50.
The code is referred a book called SQL로 맛보는 데이터 전처리 분석(노수영 저).