SQL Mastery

SQL Fundamentals — Quiz 1

SQL Fundamentals — Quiz 1 — Study Guide

SQL Fundamentals — Study Guide

SQL (Structured Query Language) is the universal language for talking to relational databases. Whether you're building a web app, analyzing data, or working in data engineering, SQL is a skill you'll use constantly. This guide covers the core building blocks you need to read, filter, sort, group, and combine data from one or more tables.


Retrieving Data: SELECT

The SELECT statement is how you ask a database for data. Think of it like placing an order — you specify exactly what you want.

-- Select specific columns
SELECT first_name, last_name FROM employees;

-- Select all columns SELECT * FROM employees;


Filtering Results: WHERE

The WHERE clause filters rows based on a condition. Only rows that satisfy the condition are returned.

SELECT * FROM employees
WHERE department = 'Engineering';

SELECT * FROM products WHERE price > 50 AND stock > 0;

Common operators in WHERE: =, !=, >, <, >=, <=, LIKE, IN, BETWEEN, IS NULL


Sorting Results: ORDER BY

ORDER BY sorts your result set by one or more columns. Use ASC (ascending, the default) or DESC (descending).

SELECT name, salary FROM employees
ORDER BY salary DESC;

-- Sort by multiple columns SELECT name, department, salary FROM employees ORDER BY department ASC, salary DESC;


Limiting Rows: LIMIT

LIMIT restricts how many rows are returned. This is useful for previewing data or implementing pagination.

-- Return only the top 5 highest-paid employees
SELECT name, salary FROM employees
ORDER BY salary DESC
LIMIT 5;

Analogy: LIMIT is like telling a waiter, "Just bring me 5 items from the menu" — you don't need to see everything.


Aliases

Aliases give temporary, readable names to columns or tables using the AS keyword (the keyword is optional but recommended for clarity).

-- Column alias
SELECT first_name AS "First Name", salary * 12 AS annual_salary
FROM employees;

-- Table alias (very useful with JOINs) SELECT e.name, d.name FROM employees AS e JOIN departments AS d ON e.dept_id = d.id;


Grouping Data: GROUP BY and HAVING

GROUP BY

GROUP BY collapses rows that share the same value in a column into a single summary row. It's almost always used with aggregate functions like COUNT(), SUM(), AVG(), MAX(), and MIN().

-- Count employees in each department
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;

HAVING

HAVING filters *groups* after GROUP BY has been applied — think of it as WHERE for aggregated data.

-- Only show departments with more than 10 employees
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;

ClauseFiltersUsed With
WHEREIndividual rowsAny query
HAVINGGroups/aggregatesGROUP BY

Combining Tables: JOINs

JOINs let you combine rows from two or more tables based on a related column. Imagine two spreadsheets — a JOIN merges them on a shared key.

INNER JOIN

Returns only the rows where there is a match in both tables.

SELECT employees.name, departments.name AS dept
FROM employees
INNER JOIN departments ON employees.dept_id = departments.id;

LEFT JOIN (LEFT OUTER JOIN)

Returns all rows from the left table, plus matching rows from the right table. If there's no match, the right side columns are NULL.

SELECT employees.name, departments.name AS dept
FROM employees
LEFT JOIN departments ON employees.dept_id = departments.id;
-- Employees with no department will still appear, with NULL for dept

RIGHT JOIN (RIGHT OUTER JOIN)

Returns all rows from the right table, plus matching rows from the left table. NULLs fill in where there's no left-side match.

SELECT employees.name, departments.name AS dept
FROM employees
RIGHT JOIN departments ON employees.dept_id = departments.id;
-- All departments appear, even those with no employees

FULL OUTER JOIN

Returns all rows from both tables. NULLs appear wherever there is no match on either side.

SELECT employees.name, departments.name AS dept
FROM employees
FULL OUTER JOIN departments ON employees.dept_id = departments.id;

CROSS JOIN

Returns the Cartesian product — every row from the left table paired with every row from the right table. No ON condition is needed.

SELECT colors.name, sizes.name
FROM colors
CROSS JOIN sizes;
-- If colors has 3 rows and sizes has 4, result has 12 rows

Self-Join

A self-join joins a table to itself. This is useful for hierarchical data, like finding an employee's manager when both are stored in the same table.

SELECT e.name AS employee, m.name AS manager
FROM employees AS e
JOIN employees AS m ON e.manager_id = m.id;

JOIN Comparison Table

JOIN TypeReturns
INNER JOINOnly matching rows from both tables
LEFT JOINAll left rows + matching right rows (NULLs for gaps)
RIGHT JOINAll right rows + matching left rows (NULLs for gaps)
FULL OUTER JOINAll rows from both tables (NULLs for gaps)
CROSS JOINEvery combination of rows from both tables
Self-JoinA table joined with itself using aliases

Putting It All Together

Here's a query that uses many of these concepts at once:

SELECT d.name AS department, COUNT(e.id) AS total_employees, AVG(e.salary) AS avg_salary
FROM employees AS e
LEFT JOIN departments AS d ON e.dept_id = d.id
WHERE e.status = 'active'
GROUP BY d.name
HAVING COUNT(e.id) > 5
ORDER BY avg_salary DESC
LIMIT 10;

Reading order of clauses:

  • FROM / JOIN — which tables?
  • WHERE — filter rows
  • GROUP BY — group results
  • HAVING — filter groups
  • SELECT — choose columns
  • ORDER BY — sort
  • LIMIT — cap the output

  • Key Takeaways

  • SELECT retrieves data; WHERE filters rows; ORDER BY sorts them; LIMIT caps the count.
  • GROUP BY aggregates rows into groups, and HAVING filters those groups — unlike WHERE, which filters before grouping.
  • INNER JOIN returns only matches; LEFT JOIN keeps all left rows; RIGHT JOIN keeps all right rows; FULL OUTER JOIN keeps everything.
  • CROSS JOIN produces every row combination, and a self-join joins a table to itself using table aliases.
  • Aliases (AS) make queries more readable and are essential when joining a table to itself or renaming calculated columns.