When you are deep into SQL and database queries, one of the most powerful and incredible features that you will come across is Common Table Expressions, commonly known as CTEs.
In SQL, the WITH
clause is also known as CTE. It is a powerful feature that enables us to create temporary result sets within a query. One major role of CTEs is simplifying complex queries into smaller and reusable subqueries. This helps to make the code more readable and maintainable in the long run.
Requirements
For demonstration purposes, we are going to use:
- MySQL version 8.0 and above.
- The sakila sample database.
With the above requirements met, we can proceed to learn more about CTEs and the WITH clause.
SQL WITH Clause
The WITH clause allows us to define one or more temporary result sets known as Common Table Expressions.
We can reference the resulting CTEs in the main query like any other table or result set. This plays a crucial role in creating modular SQL queries.
Although the syntax of a CTE may vary slightly depending on your requirements, the following shows the basic syntax of a CTE in SQL.
WITH cte_name (
column1,
column2,
...
)AS (
-- CTE Query
SELECT
...
FROM
...
WHERE
...
)
-- Main query
SELECT
...
FROM
...
JOIN cte_name ON...
WHERE
...
We start with the WITH
keyword which tells the SQL database that we wish to create and use a CTE.
Next, we specify the name for the CTE which will allow us to reference it in other queries.
We also specify an optional list of column names if the CTE includes column aliases.
Next, we proceed to define the CTE query. This contains all the tasks or the data that the CTE carries out enclosed in a pair of parentheses.
Lastly, we specify the main query which will reference the CTE.
Example Usage
One of the best ways to understand how to use and work with CTEs is to look at a practical example.
Take for example the sakila sample database. Suppose we wish to find the top 10 customers with the highest number of rentals?
Take a look at the CTE shown below:
-- Using the SQL WITH Clause to find top 10 customers with the highest rental counts
WITH CustomerRentals AS (
SELECT
c.customer_id,
c.first_name,
c.last_name,
COUNT(r.rental_id) AS rental_count
FROM
customer c
JOIN rental r ON c.customer_id = r.customer_id
GROUP BY
c.customer_id,
c.first_name,
c.last_name
)
SELECT
*
FROM
CustomerRentals
ORDER BY
rental_count DESC
LIMIT 10;
In the example above, we start by defining a new CTE using the WITH keyword followed by the name we wish to assign to the CTE. In this case, we call it CustomerRentals.
Inside the CTE body, we calculate the rental count for each customer by joining the customer and rentals table.
Lastly, in the main query, we select all the columns from the CTE, order the results based on the rental count (descending order), and limit the output to just the top 10 rows.
This will allow us to fetch the customers with the highest number of rentals as shown in the output below:
Recursive CTEs
In some other cases, you might be dealing with hierarchical data structures. This is where recursive CTEs come into play.
Let us take for example we want to navigate the hierarchical organization or to represent a tree-like structure. We can use the WITH RECURSIVE
keyword to create a recursive CTE.
Since there is no hierarchical data that we can use in the sakila database to demonstrate recursive CTE, let us setup a basic example.
CREATE TABLE department (
department_id INT PRIMARY KEY AUTO_INCREMENT,
department_name VARCHAR(255) NOT NULL,
parent_department_id INT,
FOREIGN KEY (parent_department_id) REFERENCES department (department_id)
);
Insert sample data.
INSERT INTO department (department_name, parent_department_id)
VALUES('Corporate', NULL), ('Finance', 1), ('HR', 1), ('Accounting', 2), ('Recruiting', 3), ('Payroll', 4);
In this case, we have a sample department table with some random data. To find the hierarchical structure of the departments, we can use a recursive CTE as shown:
WITH RECURSIVE DepartmentHierarchy AS (
SELECT
department_id,
department_name,
parent_department_id
FROM
department
WHERE
parent_department_id IS NULL
UNION ALL
SELECT
d.department_id,
d.department_name,
d.parent_department_id
FROM
department d
JOIN DepartmentHierarchy dh ON d.parent_department_id = dh.department_id
)
SELECT
*
FROM
DepartmentHierarchy;
In this case, the recursive CTE starts with departments having a NULL parent_department_id (root departments)
and recursively retrieves child departments.
Conclusion
In this tutorial, we learned about of the most fundamental and useful features in SQL databases; Common Table Expressions by understanding how to work with the WITH keyword.