PostgreSQL Table Partitioning

Table partitioning is a technique for dividing a large database table into smaller, more manageable parts called partitions. Partitioning can improve query performance and simplify data maintenance when a table grows large.

In this tutorial, we will explore range partitioning by date, using the Pagila database as an example.

Prerequisites

To follow this tutorial, you should have the following prerequisites in place:

A working installation of PostgreSQL. Declarative partitioning, which this tutorial uses, was introduced in version 10.
The Pagila database installed and configured in your PostgreSQL instance.
Basic knowledge of SQL and PostgreSQL.

Let’s get started!

Creating the Base Table

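We will begin by creating the parent table that will be partitioned. In this case, we will create a new table called payment_partition with the same columns as the original payment table. The PARTITION BY RANGE clause declares payment_date as the partition key. Because a primary key on a partitioned table must include the partition key, the key is defined on (payment_id, payment_date). Primary keys on partitioned tables require PostgreSQL 11 or later.

CREATE TABLE payment_partition (
  payment_id SERIAL,
  customer_id SMALLINT NOT NULL,
  staff_id SMALLINT NOT NULL,
  rental_id INTEGER NOT NULL,
  amount NUMERIC(5,2) NOT NULL,
  payment_date TIMESTAMP NOT NULL,
  -- The primary key must include the partition key column
  PRIMARY KEY (payment_id, payment_date)
) PARTITION BY RANGE (payment_date);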

Creating the Partitioned Tables

In this step, we will create the individual partitions that will hold the data. We will partition the payment_partition table by payment year: each partition covers one calendar year, with an inclusive lower bound and an exclusive upper bound.

-- Create partition tables for years 2005-2010
CREATE TABLE payment_partition_2005 PARTITION OF payment_partition
    FOR VALUES FROM ('2005-01-01') TO ('2006-01-01');

CREATE TABLE payment_partition_2006 PARTITION OF payment_partition
    FOR VALUES FROM ('2006-01-01') TO ('2007-01-01');

CREATE TABLE payment_partition_2007 PARTITION OF payment_partition
    FOR VALUES FROM ('2007-01-01') TO ('2008-01-01');

-- Continue creating partition tables for other years...
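Once the partitions exist, it can help to confirm what PostgreSQL has actually registered. The query below is a minimal sketch that reads the system catalogs pg_inherits and pg_class to list each partition of payment_partition with its bounds. The column aliases are illustrative only.

```sql
-- List every partition attached to payment_partition with its bounds.
SELECT c.relname                          AS partition_name,
       pg_get_expr(c.relpartbound, c.oid) AS partition_bounds
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent = 'payment_partition'::regclass
ORDER BY c.relname;
```

Each row should show one of the yearly partitions together with its FOR VALUES clause.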

Creating the Partition Function

With declarative partitioning, PostgreSQL routes each inserted row to the matching partition automatically, based on the FOR VALUES bounds of each partition. A routing function is only required for the older inheritance-based approach. A trigger function is still useful for validation, so here we define one that accepts a row only when its payment date falls in a year we have planned for. A trigger function takes no arguments, returns the type trigger, and reads the incoming row through NEW.

CREATE OR REPLACE FUNCTION payment_partition_function()
  RETURNS trigger AS $$
BEGIN
  -- Accept the row if its payment date falls in a year that has a partition
  IF NEW.payment_date >= '2005-01-01' AND NEW.payment_date < '2008-01-01' THEN
    RETURN NEW;
  -- Widen the range when you add partitions for other years...
  ELSE
    RAISE EXCEPTION 'Date out of range. Ensure partition is defined.';
  END IF;
END;
$$ LANGUAGE plpgsql;

Creating the Partition Trigger

Next, we attach the function to the table with a BEFORE INSERT row trigger. Row-level BEFORE triggers on a partitioned table require PostgreSQL 13 or later, and they cannot change which partition a row is routed to. If no partition matches a row at all, PostgreSQL rejects the insert with its own error, so the exception above is reached only when a partition exists for a date the function does not accept.

CREATE TRIGGER payment_partition_trigger
  BEFORE INSERT ON payment_partition
  FOR EACH ROW
  EXECUTE FUNCTION payment_partition_function();

Verify the Partitioned Tables

To verify that the partitioning setup is working correctly, insert a few sample rows into the payment_partition table.

INSERT INTO payment_partition (customer_id, staff_id, rental_id, amount, payment_date)
VALUES (1, 1, 1, 9.99, '2005-01-01'),
       (2, 2, 2, 4.99, '2006-02-01'),
       (3, 3, 3, 19.99, '2007-03-01');
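To see where each row ended up, you can select the system column tableoid through the parent table and cast it to regclass, which displays the name of the table that physically stores the row. This is a small sketch; the alias partition_name is illustrative only.

```sql
-- Show which partition stores each sample row.
SELECT tableoid::regclass AS partition_name,
       payment_id,
       payment_date
FROM payment_partition
ORDER BY payment_date;
```

If routing works as intended, each sample row should be reported under the partition for its payment year.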

Querying the Partitioned Tables

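You can now query the partitioned table just like any other table. When a query against payment_partition filters on payment_date, the planner uses partition pruning to skip partitions whose bounds cannot match. You can also query a single partition directly by name, as the statements below do.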

-- Query all payments from the year 2005
SELECT * FROM payment_partition_2005;

-- Query all payments from the year 2006
SELECT * FROM payment_partition_2006;

-- Query all payments from the year 2007
SELECT * FROM payment_partition_2007;
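Partition pruning is easiest to observe by querying the parent table and asking the planner for its plan. The sketch below assumes the yearly partitions created earlier and filters on the partition key.

```sql
-- Ask the planner how it would run a query against the parent table.
EXPLAIN
SELECT *
FROM payment_partition
WHERE payment_date >= '2006-01-01'
  AND payment_date <  '2007-01-01';
```

With pruning enabled, which is the default in recent PostgreSQL versions, the plan should reference only payment_partition_2006 and leave the other partitions out.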

Maintenance Operations

Partitioning can also simplify data maintenance operations. For example, if you want to drop all data older than a certain year, you can drop the corresponding partition.

-- Drop the partition for year 2005
DROP TABLE payment_partition_2005;
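If you would rather keep the old data than delete it, one alternative to DROP TABLE is to detach the partition. A detached partition becomes an ordinary standalone table that no longer appears in queries against payment_partition. The sketch below is meant to be run instead of the DROP above, and the name payment_archive_2005 is a hypothetical choice.

```sql
-- Detach the 2005 partition; it becomes an ordinary standalone table.
ALTER TABLE payment_partition DETACH PARTITION payment_partition_2005;

-- Optionally rename it to mark it as archived (hypothetical name).
ALTER TABLE payment_partition_2005 RENAME TO payment_archive_2005;
```

The detached table can then be backed up, moved, or dropped later without touching the remaining partitions.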

Conclusion

In this tutorial, you learned how to work with table partition methods using the Pagila database as an example. You created a base table, defined partitioned tables, created a partition function and trigger, and performed maintenance operations. By partitioning your tables, you can improve query performance and manage large amounts of data more efficiently.
