|
Getting your Trinity Audio player ready...
|
When working with business data, sometimes one query is not enough.
You may want to ask questions like:
- Which products are priced above the average price?
- Which customers made purchases greater than the average sale?
- Which employees earn more than the average salary?
- Which cities have sales higher than the overall company average?
To answer such questions, we use Subqueries.
What is a Subquery?
A Subquery is a query written inside another SQL query.
It is also called:
- Inner Query
- Nested Query
The inner query runs first.
The outer query uses its result.
FMCG Company Tables Used
We will use only these tables:
- employees (employee_id, name, department, salary, hire_date)
- customers (customer_id, customer_name, city, status)
- products (product_id, product_name, category, price)
- sales (sale_id, customer_id, product_id, sale_amount, quantity, sale_date)
- inventory (product_id, stock_quantity, warehouse)
Subquery in WHERE Clause (Most Common)
This is the most frequently used type.
Example 1: Products Priced Above Average Price
Step 1: Find average price
Step 2: Compare each product with that value
SELECT product_name, price
FROM products
WHERE price > (
SELECT AVG(price)
FROM products
);How it works:
- Inner query calculates average price.
- Outer query selects products greater than that.
Works same in PostgreSQL and MySQL.
Business Meaning:
Find premium products.
Example 2: Employees Earning More Than Average Salary
SELECT name, salary
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees
);Used in HR performance analysis.
Subquery with IN Operator
Used when inner query returns multiple values.
Example 3: Customers Who Made Purchases
SELECT customer_name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM sales
);Business Meaning:
List customers who have at least one sale.
Example 4: Products That Were Sold
SELECT product_name
FROM products
WHERE product_id IN (
SELECT product_id
FROM sales
);Great for checking active products.
Subquery with NOT IN
Example 5: Products Never Sold
SELECT product_name
FROM products
WHERE product_id NOT IN (
SELECT product_id
FROM sales
);Business Meaning:
Dead inventory.
⚠ Important:
If subquery returns NULL, NOT IN may behave unexpectedly.
We will discuss safer alternatives later.
Subquery in SELECT Clause
This type returns a value for each row.
Read More: Select Data in SQL
Example 6: Show Each Employee With Average Salary
SELECT
name,
salary,
(SELECT AVG(salary) FROM employees) AS company_avg_salary
FROM employees;Here:
- Subquery runs once.
- Its result appears in each row.
Subquery in FROM Clause (Derived Table)
This is powerful and very useful for reporting.
Example 7: Find Cities With Total Sales Above 1,00,000
Step 1: Create grouped result
Step 2: Filter it
SELECT city, total_sales
FROM (
SELECT city, SUM(sale_amount) AS total_sales
FROM sales
GROUP BY city
) AS city_sales
WHERE total_sales > 100000;This is extremely useful in real reporting systems.
Works same in PostgreSQL and MySQL.
Correlated Subquery
This is more advanced.
A correlated subquery depends on outer query values.
Example 8: Employees Earning Above Their Department Average
SELECT e.name, e.salary, e.department
FROM employees e
WHERE e.salary > (
SELECT AVG(salary)
FROM employees
WHERE department = e.department
);What happens:
For each employee:
- Subquery calculates department average
- Compares salary
This runs multiple times (once per row).
Very common interview question.
EXISTS Subquery
Better alternative to IN in many cases.
Example 9: Customers Who Made Purchases
SELECT customer_name
FROM customers c
WHERE EXISTS (
SELECT 1
FROM sales s
WHERE s.customer_id = c.customer_id
);EXISTS checks if at least one matching row exists.
PostgreSQL and MySQL both support it.
NOT EXISTS (Safer Than NOT IN)
Example 10: Products Never Sold (Safer Version)
SELECT product_name
FROM products p
WHERE NOT EXISTS (
SELECT 1
FROM sales s
WHERE s.product_id = p.product_id
);Better than NOT IN when NULL values exist.
Subquery With Aggregate Comparison
Example 11: Customers Who Made Above Average Purchase
SELECT customer_id, sale_amount
FROM sales
WHERE sale_amount > (
SELECT AVG(sale_amount)
FROM sales
);Used in revenue performance tracking.
Also Read: Aggregate Functions in SQL
Execution Order
In subqueries:
- Inner query runs first
- Outer query uses result
Except correlated subqueries:
They run once per row.
PostgreSQL vs MySQL Differences
| Feature | PostgreSQL | MySQL |
|---|---|---|
| Subqueries in WHERE | Yes | Yes |
| Correlated Subqueries | Yes | Yes |
| EXISTS | Yes | Yes |
| Performance | Generally optimized well | Good but depends on indexing |
PostgreSQL usually handles complex subqueries more efficiently.
When to Use Subquery vs JOIN?
Subquery is good when:
- You need comparison with aggregated value
- Logic is simple and readable
- You don’t need columns from other table
JOIN is better when:
- You need columns from multiple tables
- Performance is critical
- Data volume is large
Read More: Joins in SQL
Common Mistakes
❌ Using NOT IN with NULL values
❌ Forgetting alias in FROM subquery
❌ Writing complex nested subqueries unnecessarily
❌ Using correlated subquery when JOIN is better
Real-Life FMCG Analysis Examples
Top Paying Customers (Above Average Sale)
SELECT customer_id, SUM(sale_amount) AS total_spent
FROM sales
GROUP BY customer_id
HAVING SUM(sale_amount) > (
SELECT AVG(sale_amount)
FROM sales
);Products Priced Higher Than Category Average
SELECT p.product_name, p.category, p.price
FROM products p
WHERE p.price > (
SELECT AVG(price)
FROM products
WHERE category = p.category
);Correlated logic again.
Performance Note
For large datasets:
- Index foreign keys
- Avoid deeply nested subqueries
- Consider rewriting as JOIN
- Use EXPLAIN (Advanced SQL topic)
Why Subqueries Matter for Data Analysts
Subqueries allow you to:
- Compare data with averages
- Filter data dynamically
- Build conditional reports
- Solve interview-level SQL problems
- Prepare logic before learning JOIN
Without subqueries, complex business questions become difficult.
Summary
Subqueries in SQL are queries written inside another query to perform dynamic comparisons and advanced filtering. They can be used in WHERE, SELECT, or FROM clauses and may include correlated logic.
Both PostgreSQL and MySQL support subqueries with similar syntax. Subqueries are powerful for comparing values against averages, filtering grouped data, and solving real-world analytical problems in FMCG business scenarios. Mastering subqueries prepares you for understanding joins and advanced SQL concepts.