Join optimization for inheritance tables
Motivation
We are using Postgres to host a data warehouse containing daily reports for production and sales. The tables are partitioned per day as follows:
CREATE TABLE production (id int, date timestamp, bigint quantity, ...); CREATE TABLE production_jan_1 ( CHECK ('2009-01-01'<=date AND date<'2009-01-02') ) INHERITS (production); CREATE TABLE production_jan_2 ( CHECK ('2009-01-02'<=date AND date<'2009-01-03') ) INHERITS (production); ...
CREATE TABLE sales(id int, date timestamp, price float, ...); CREATE TABLE sales_jan_1 ( CHECK ('2009-01-01'<=date AND date<'2009-01-02') ) INHERITS (sales); CREATE TABLE sales_jan_2 ( CHECK ('2009-01-02'<=date AND date<'2009-01-03') ) INHERITS (sales); ...
If we want to know what has been produced on Jan 1st we execute:
SELECT id, sum(quantity) FROM production WHERE '2009-01-01'<=date AND date<'2009-01-02' GROUP BY id;
The planner filters tables based on check constraints and smartly scans ONLY production_jan_1.
If we want to know the sales per day, the query looks like:
SELECT p.id, p.date, sum(s.price) FROM production p, sales s WHERE p.date = s.date AND p.id = s.id GROUP BY p.id, p.date
The planner will then treat each hierarchy as a single table and generate the following plan:
Group aggregate -> merge join -> sort -> Append -> Seq Scan on production -> Seq Scan on production_jan_1 -> Seq Scan on production_jan_2 -> ... -> sort -> Append -> Seq Scan on sales -> Seq Scan on sales_jan_1 -> Seq Scan on sales_jan_2 -> ...
We would like to be able to use check constraints to join child tables directly and generate the following plan:
Hash aggregate -> Append -> Hash Join -> Seq Scan on production_jan_1 -> Seq Scan on sales_jan_1 -> Hash Join -> Seq Scan on production_jan_2 -> Seq Scan on sales_jan_2 -> ...
Solution overview
The intuition behind our algorithm is as follows:
- Check if we can join child tables directly
- Get child table constraints
- Find possible child joins
- Generate plans for each child join
- Combine results from child joins
A naive n^2 implementation examines constraints from each possible pair of child tables.
Another approach treat constraints as intervals and use the interval tree for the matching of child table. This offers an n.log(n) complexity.
The graph below compares the 2 approaches for different number of child tables:
A common scenario is that the parent table is empty (and it has no constraint) so the matching algorithm still needs to join the parent table with all child tables. We propose to add an empty constraint to the parent table to avoid unnecessary planning and execution work. The result of this optimization are shown below:
Implementation
The proposed patch has been posted on the mailing list: http://archives.postgresql.org/message-id/43826FCDC252204EA7823B2E7CF3CCEC25FC2F4B@Pandora.AsterData.local
Performance evaluation
We have evaluated planning time and execution time by varying 3 factors:
- number of child tables per hierarchy
- number of rows in hierarchy
- number of joins in query
The first graph shows the planning time difference with the patch (feature on) and without the patch (feature off) for different number of child tables per hierarchy. These results were obtained by executing a query that joins 3 hierarchies together, each partitioned in a fixed number of child tables.
To get a better insight, let us consider the case with 730 child tables. We are joining 3 hierarchies, each having 730 child tables. With the feature turned off, the optimizer takes about 2 seconds to perform the planning. This time is spent to find the best access method for 2190 (3*730) table scans and the best join methods to join the 3 appends. With the feature turned on, the same work will be done. In addition, the patch will look at the check constraints of all the tables to figure out with child tables can be joined together. In this particular case, the matching is 1-to-1. Therefore, the planner will find the best way to join 2190 different joins for each triple of child tables. All this work is done in less than 5 seconds, but the extra planning time is gained back during execution.
The following shows the planning time difference with the patch (feature on) and without the patch (feature off) for 1 to 3 joins with 730 child tables per hierarchy.
The last graph shows the total execution time with the patch (feature on) and without the patch (feature off) for 2 joins (3 hierarchies with 365 child tables each) varying the total number of rows from 100K to 500K.