|
Getting your Trinity Audio player ready...
|
In real-world databases, data is not stored in one big table.
Instead:
- customers table stores customer details
- products table stores product information
- sales table stores transactions
- employees table stores staff details
- inventory table stores stock
To generate meaningful reports, we must combine data from multiple tables.
That is done using JOINS in SQL.
What is a JOIN?
A JOIN combines rows from two or more tables based on a related column.
Usually:
- Primary Key in one table
- Foreign Key in another table
Example:sales.customer_id connects with customers.customer_id
FMCG Company Tables Used
We will use:
- customers (customer_id, customer_name, city, status)
- products (product_id, product_name, category, price)
- sales (sale_id, customer_id, product_id, quantity, sale_amount, sale_date)
- employees (employee_id, name, department, salary)
- inventory (product_id, stock_quantity, warehouse)
INNER JOIN (Most Common Joins in SQL)
Returns only matching rows in both tables.
Example 1: Show Sales with Customer Name
SELECT s.sale_id,
c.customer_name,
s.sale_amount,
s.sale_date
FROM sales s
INNER JOIN customers c
ON s.customer_id = c.customer_id;What happens:
- Matches sales.customer_id with customers.customer_id
- Only rows that exist in BOTH tables are returned
Works same in PostgreSQL and MySQL.
Example 2: Show Sales with Product Name
SELECT s.sale_id,
p.product_name,
s.quantity,
s.sale_amount
FROM sales s
JOIN products p
ON s.product_id = p.product_id;Note:
INNER JOIN and JOIN mean the same thing.
LEFT JOIN (Left Outer Join)
Returns:
- All rows from left table
- Matching rows from right table
- NULL if no match
Example 3: Show All Customers and Their Sales
SELECT c.customer_name,
s.sale_amount
FROM customers c
LEFT JOIN sales s
ON c.customer_id = s.customer_id;If a customer has no sale:
sale_amount will be NULL.
Business Meaning:
You can identify customers who never purchased.
RIGHT JOIN (Right Outer Join)
Returns:
- All rows from right table
- Matching rows from left table
- NULL if no match
Example 4: Show All Products and Their Sales
SELECT p.product_name,
s.sale_amount
FROM sales s
RIGHT JOIN products p
ON s.product_id = p.product_id;Products not sold yet will show NULL in sale_amount.
LEFT JOIN vs RIGHT JOIN
| Feature | LEFT JOIN | RIGHT JOIN |
|---|---|---|
| Keeps all rows from | Left table | Right table |
| Unmatched values | NULL on right side | NULL on left side |
| Practical usage | More common | Less common |
In real projects, LEFT JOIN is used more frequently.
FULL OUTER JOIN
Returns:
- All rows from both tables
- NULL where no match
PostgreSQL
SELECT c.customer_name,
s.sale_amount
FROM customers c
FULL OUTER JOIN sales s
ON c.customer_id = s.customer_id;MySQL
MySQL does NOT directly support FULL OUTER JOIN.
Workaround:
SELECT c.customer_name,
s.sale_amount
FROM customers c
LEFT JOIN sales s
ON c.customer_id = s.customer_idUNIONSELECT c.customer_name,
s.sale_amount
FROM customers c
RIGHT JOIN sales s
ON c.customer_id = s.customer_id;PostgreSQL has advantage here 🔥
3-Table JOIN Example
This is where reporting becomes powerful.
Example 5: Sales Report with Customer & Product Details
SELECT c.customer_name,
p.product_name,
s.quantity,
s.sale_amount,
s.sale_date
FROM sales s
JOIN customers c
ON s.customer_id = c.customer_id
JOIN products p
ON s.product_id = p.product_id;Now we combined:
- Sales table
- Customer info
- Product info
This is real dashboard-level query.
JOIN + GROUP BY + HAVING
Example 6: Total Sales Per City Above 1,00,000
SELECT c.city,
SUM(s.sale_amount) AS total_sales
FROM sales s
JOIN customers c
ON s.customer_id = c.customer_id
GROUP BY c.city
HAVING SUM(s.sale_amount) > 100000
ORDER BY total_sales DESC;Execution flow:
- Join tables
- Group by city
- Apply HAVING filter
- Sort results
This is business reporting.
Also Read: GROUP BY in SQL
SELF JOIN
Joining a table with itself.
Example 7: Employees in Same Department
SELECT e1.name AS employee,
e2.name AS colleague,
e1.department
FROM employees e1
JOIN employees e2
ON e1.department = e2.department
AND e1.employee_id <> e2.employee_id;Used in organizational analysis.
INNER vs OUTER JOIN
| Feature | INNER JOIN | OUTER JOIN |
|---|---|---|
| Returns only matching rows | Yes | No |
| Returns unmatched rows | No | Yes |
| Can show NULLs | No | Yes |
| Used for strict relationships | Yes | No |
Outer joins include:
- LEFT JOIN
- RIGHT JOIN
- FULL JOIN
Joins vs Subqueries
This is important.
Subquery Example
SELECT customer_name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM sales
);Join Equivalent
SELECT DISTINCT c.customer_name
FROM customers c
JOIN sales s
ON c.customer_id = s.customer_id;When to Use JOIN?
✔ Need columns from multiple tables
✔ Better performance on large data
✔ Cleaner reporting queries
When to Use Subquery?
✔ Comparing with aggregate value
✔ Simpler logical conditions
✔ Readability matters
Generally:
JOIN is preferred for performance.
Real FMCG Business Queries
Total Revenue Per Product Category
SELECT p.category,
SUM(s.sale_amount) AS total_revenue
FROM sales s
JOIN products p
ON s.product_id = p.product_id
GROUP BY p.category;Customers With No Purchases
SELECT c.customer_name
FROM customers c
LEFT JOIN sales s
ON c.customer_id = s.customer_id
WHERE s.sale_id IS NULL;Classic interview question 🔥
Inventory Status With Product Names
SELECT p.product_name,
i.stock_quantity,
i.warehouse
FROM inventory i
JOIN products p
ON i.product_id = p.product_id;PostgreSQL vs MySQL Differences
| Feature | PostgreSQL | MySQL |
|---|---|---|
| INNER JOIN | Yes | Yes |
| LEFT JOIN | Yes | Yes |
| RIGHT JOIN | Yes | Yes |
| FULL OUTER JOIN | Yes | No |
| Performance | Strong optimizer | Good optimizer |
PostgreSQL has native FULL OUTER JOIN support.
Common Mistakes
❌ Forgetting ON condition (creates Cartesian product)
❌ Using wrong join type
❌ Not understanding NULL behavior
❌ Not using table aliases
Why Joins Are Critical for Data Analysts
With Joins you can:
- Build dashboards
- Combine multiple datasets
- Create relational reports
- Analyze customer behavior
- Track product performance
- Perform multi-table aggregation
Without Joins, SQL remains incomplete.
Execution Order Reminder
- FROM
- JOIN
- WHERE
- GROUP BY
- HAVING
- SELECT
- ORDER BY
- LIMIT
Understanding this prevents logical errors.
Final Summary
Joins in SQL are used to combine data from multiple tables based on related columns. INNER JOIN returns matching rows, while LEFT, RIGHT, and FULL OUTER JOIN include unmatched rows with NULL values. Joins can combine two or three tables for advanced reporting.
Compared to subqueries, joins are often more efficient and preferred for multi-table data retrieval. PostgreSQL supports FULL OUTER JOIN natively, while MySQL requires a UNION workaround. Mastering joins is essential for real-world data analysis and reporting in FMCG and other business domains.
What’s Next?
👉 Data Cleaning in SQL