CTEReadme
Usage
Definition
The WITH clause allows to define a table expression being valid within a same SELECT statement. The table expression is called "Common Table Expression"(CTE). The use case for CTE is similar to VIEW but it is more handy than VIEW. Unlike VIEW you do not need to define a CTE beforehand.
Examples
CTE allows to use recursive queries. To use a recursive query, you need to add the RECURSIVE keyword. Here is an example for WITH RECURSIVE clause usage. Table "department" represents the structure of an organization as an adjacency list.
CREATE TABLE department ( id INTEGER PRIMARY KEY, -- department ID parent_department INTEGER REFERENCES department, -- upper department ID name TEXT -- department name ); INSERT INTO department (id, parent_department, "name") VALUES (0, NULL, 'ROOT'), (1, 0, 'A'), (2, 1, 'B'), (3, 2, 'C'), (4, 2, 'D'), (5, 0, 'E'), (6, 4, 'F'), (7, 5, 'G'); -- department structure represented here is as follows: -- -- ROOT-+->A-+->B-+->C -- | | -- | +->D-+->F -- +->E-+->G
To extract all departments under A, you can use the following recursive query:
WITH RECURSIVE subdepartment AS ( -- non-recursive term SELECT * FROM department WHERE name = 'A' UNION ALL -- recursive term SELECT d.* FROM department AS d JOIN subdepartment AS sd ON (d.parent_department = sd.id) ) SELECT * FROM subdepartment ORDER BY name;
How Recursion Works
The meaning of the query above can be explained as follows.
Given three tables, intermediate table (IntermediateTable); work table (WorkTable); result table (ResultTable).
1) Initialize
Execute non recursive term (SELECT * FROM department WHERE name = 'A') and assign the result to ResultTable and WorkTable.
ResultTable = WorkTable = ('A')
Make IntermediateTable empty.
IntermediateTable = ()
2) Execute recursive query
Replace subdepartment with WorkTable and execute the recursive term
SELECT d.* FROM department AS d JOIN WT AS sd ON d.parent_department = sd.id
Next, assign the result of the query to IntermediateTable. Add IntermediateTable to ResultTable and replace IntermediateTable to WorkTable. Make IntermediateTable empty.
3) Check whether recursion is ended
If IntermediateTable is empty in 2), execution of recursion is ended. Return ResultTable.
Otherwise go to 2).
"subdepartment" is a CTE including a recursive expression since it refers to itself within its definition. In the above query first, the non-recursive term is evaluated. Then the recursive term is evaluated and the result is added to the previous result set over and over until there's no more data to process. Finally the last SELECT is executed and data is extracted from the result set.
Implementation
Parsing recursive queries
To represent CTE, new structure member "withClause" is added to SelectStmt structure which is used to store parsed information of a SELECT stament. The definition of withClause is as follows:
/* * WithClause - * reprensentation of WITH clause */ typedef struct WithClause { NodeTag type; /* T_WithClause */ List *subquery; /* subquery list (list of RangeSubselect) */ bool recursive; /* true = WITH RECURSIVE */ } WithClause;
If RECURSIVE key word is not given, CTE is converted to a WithClause.subquery and WithClause.recursive is set to false.
If RECURSIVE key word is given, the raw parse tree stored in WithClause.subquery represents a subquery which is an UNION ALL of a non recursive term and a recursive term. Set operations other than UNION ALL between a non recursive term and a recursive term are not permitted. WithClause.subquery.subquery.larg represents "SELECT * FROM department WHERE name = 'A'" and WithClause.subquery.subquery.rarg represents "SELECT d.* FROM department AS d JOIN subdepartment AS sd ON d.parent_department = sd.id" in the example above.
Analyzing recursive queries
To analyze CTEs, we add following new members "p_ctenamespace", "p_recursive_namespace", and "p_in_with_clause" to ParseState structure.
p_ctenamespace is a namespace for non recursive CTEs and a list of RangeSubselects. Here is an excerpt from comments of ParseState:
* [1] Note that p_ctenamespace is a namespace for "relations" but distinct * from p_relnamespace. p_ctenamespace is a list of relations that can be * referred to in a FROM or JOIN clause (in addition to normal tables and * views). p_relnamespace is the list of relations which already have been * listed in such clauses and therefore can be referred to in qualified * variable references. Also, note that p_ctenamespace is a list of * RangeSubselects, not a list of range table entries.
p_recursive_namespace is a namespace for recursive CTEs and is a list of RangeRecursive. RangeRecursive is a newly introduced structure:
/* * RangeRecursive - recursive call appearing in a FROM clause */ typedef struct RangeRecursive { NodeTag type; Node *subquery; /* the untransformed sub-select clause */ Alias *alias; /* table alias & optional column aliases */ List *coldeflist; /* list of ColumnDef nodes to describe result */ Node *non_recursive_term; /* analyze result of non recusive term */ bool recursive; /* true if recursive query */ } RangeRecursive;
ParseState.p_rtable is a list of RangeTblEntry. To process CTEs, new members "self_reference" and "non_recursive_query" are added to RangeTblEntry.
/* * Fields valid for a RECURSIVE CTE (else NIL) */ bool self_reference; /* delay analyzing SelectStmt */ Query *non_recursive_query; /* non-recursive term */
If a CTE using "RECURSIVE" keyword is not actually recursive, "recursive" is set to false and the CTE is treated as a subquery and added to ParseState.p_rtable.
If the CTE refers to itself, analyzing will be delayed, self_reference is set to true, and non_recursive_term is filled.
It is possible that more than one CTE elements (query names) exist.
WITH RECURSIVE x AS (SELECT * from y), y AS (SELECT * FROM pg_class) SELECT * FROM x;
In this case we should analyze y first since x depends on y. To find such dependency information, a topological sort is performed. See makeDependencyGraph() for more details.
Next we check if the recursive query is following the SQL standard. Following example queries are not allowed by the standard.
- non-recursive term and recursive term must be combined with UNION ALL
WITH RECURSIVE x(n) AS (SELECT 1 UNION SELECT n+1 FROM x) SELECT * FROM x;
WITH RECURSIVE x(n) AS (SELECT 1 INTERSECT SELECT n+1 FROM x) SELECT * FROM x;
- non-recursive term does not exist
WITH RECURSIVE x(n) AS (SELECT n FROM x) SELECT * FROM x;
- recursive term must be in the left hand side
WITH RECURSIVE x(n) AS (SELECT n FROM x UNION ALL SELECT 1) SELECT * FROM x;
- LEFT JOIN in the right hand side of recursive term not allowed
WITH RECURSIVE x(n) AS (SELECT a FROM y WHERE a = 1 UNION ALL SELECT x.n+1 FROM y LEFT JOIN x ON x.n = y.a WHERE n < 10) SELECT * FROM x;
- RIGHT JOIN in the left hand side of recursive term not allowed
WITH RECURSIVE x(n) AS (SELECT a FROM y WHERE a = 1 UNION ALL SELECT x.n+1 FROM x RIGHT JOIN y ON x.n = y.a WHERE n < 10) SELECT * FROM x;
- FULL JOIN in a recursive term not allowed
WITH RECURSIVE x(n) AS (SELECT a FROM y WHERE a = 1 UNION ALL SELECT x.n+1 FROM x FULL JOIN y ON x.n = y.a WHERE n < 10) SELECT * FROM x;
- WHERE clause having subquery which uses recursive name in a recursive
term not allowed WITH RECURSIVE x(n) AS (SELECT 1 UNION ALL SELECT n+1 FROM x WHERE n IN (SELECT * FROM x)) SELECT * FROM x;
- GROUP BY, HAVING in a recursive term not allowed
WITH RECURSIVE x(n) AS (SELECT 1 UNION ALL SELECT n+1 FROM x GROUP BY n) SELECT * FROM x;
-- aggregate functions a recursive term not allowed
WITH RECURSIVE x(n) AS (SELECT 1 UNION ALL SELECT count(*) FROM x) SELECT * FROM x;
In addition to above restrictions, we add more restrictions:
- ORDER BY in a recursive query not allowed
WITH RECURSIVE x(n) AS (SELECT 1 UNION ALL SELECT n+1 FROM x ORDER BY 1) SELECT * FROM x;
- LIMIT OFFSET in a recursive query not allowed
WITH RECURSIVE x(n) AS (SELECT 1 UNION ALL SELECT n+1 FROM x LIMIT 10 OFFSET 1) SELECT * FROM x;
- FOR UPDATE in a recursive query not allowed
WITH RECURSIVE x(n) AS (SELECT 1 UNION ALL SELECT n+1 FROM x FOR UPDATE) SELECT * FROM x;
- target list having subquery which uses recursive name in a recursive term not allowed
WITH RECURSIVE x(id) AS (values (1) UNION ALL SELECT (SELECT * FROM x) FROM x WHERE id < 5 ) SELECT * FROM x;
- mutual recursive call is not supported
WITH RECURSIVE x (id) AS (SELECT 1 UNION ALL SELECT id+1 FROM y WHERE id < 5), y (id) AS (SELECT 1 UNION ALL SELECT id+1 FROM x WHERE id < 5) SELECT * FROM x;
Note that SQL Server and Firebird do not allow mutual recursive query either. Oracle does not support WITH RECURSIVE, but has its own CONNECT BY syntax.
Tables defined in the WITH RECURSIVE clause are identified as RTE_RECURSIVE out of RTEKind in the RangeTblEntry in the parse tree nodes. A name space "p_recursive_namespace", whose structure type is RangeRecursive, is added to in ParseState structure.
typedef struct RangeRecursive { NodeTag type; Node *subquery; /* whole subquery */ Alias *alias; /* table alias & optional column aliases */ List *coldeflist; /* list of ColumnDef nodes */ Node *non_recursive_term; /* analyze result of non recursive term */ bool recursive; /* true if recursive */ } RangeRecursive;
non_recursive_term keeps the result of analyzing on non recursive term. Using this, the type of recursive query is inferenced.
Rewriter
The changes applied to the rewriter is small. fireRIRrules() recursively expand subqueries. We just do the same thing if it's a recursive query.
Generating a plan
New plan nodes "Recursion" and "Recursive scan" are added. Recursion plan is the top level plan for execution of WITH RECURSIVE query. In the example below, Recursion plan represents the execution plan for "SELECT * FROM subdepartment" part. Recursion plan always has "Append" subplan which represents the execution plan for "UNION ALL" for non recursive part and recursive part. Plan for non recursive part is nothing special and is an ordinary plan generated according to the non recursive query part. Recursive scan represents the plan for recursive part.
Below is an example EXPLAIN output.
EXPLAIN WITH RECURSIVE subdepartment AS ( -- non recursive term SELECT * FROM department WHERE name = 'A' UNION ALL -- recursive term SELECT d.* FROM department AS d, subdepartment AS sd WHERE d.parent_department = sd.id ) SELECT * FROM subdepartment; QUERY PLAN ----------------------------------------------------------------------------------- Recursion on subdepartment (cost=0.00..50.76 rows=12 width=40) -> Append (cost=0.00..50.64 rows=12 width=40) -> Seq Scan on department (cost=0.00..24.50 rows=6 width=40) Filter: (name = 'A'::text) -> Hash Join (cost=0.01..26.02 rows=6 width=40) Hash Cond: (d.parent_department = sd.id) -> Seq Scan on department d (cost=0.00..21.60 rows=1160 width=40) -> Hash (cost=0.00..0.00 rows=1 width=4) -> Recursive Scan on sd (cost=0.00..0.00 rows=1 width=4)
The structure which represents Recursion plan is as follows:
typedef struct Recursion { Scan scan; Plan *subplan; List *subrtable; /* temporary workspace for planner */ } Recursion;
The structure which represents RecursiveScan plan is as follows:
typedef struct RecursiveScan { Scan scan; } RecursiveScan;
Executor
To execute plans for CTEs, we add new members "es_tuplestorestate", "es_rscan_tupledesc" and "es_disallow_tuplestore". es_tuplestorestate is used to remember TupleStoreState of the working table(WT). es_rscan_tupledesc is used to remeber the type information of the tuples returned by the CTEs. es_disallow_tuplestore is not actually used.
To manage recursive query execution state following structures are added.
typedef struct RecursionState { ScanState ss; PlanState *subplan; /* plan for non recursive query */ bool recursive_empty; /* if recursive term does exist, true */ Tuplestorestate *intermediate_tuplestorestate; /* intermediate table (IT) */ Tuplestorestate *working_table; /* working table (WT) */ bool init_done; /* initialization flag */ } RecursionState; typedef struct RecursiveScanState { ScanState ss; TupleDesc tupdesc; Tuplestorestate **working_table; /* working table */ } RecursiveScanState;
Initializing Recursion Plan
This is done by ExecInitRecursion(). The working table (WT) is created and its pointer is stored in RecursionState.working_table. The intermediate table (IT) is created and its pointer is stored in RecursionState.intermediate_tuplestorestate.
While the initialization WT created above must be used and pointer to WT is stored in es_tuplestorestate in the master executor state (Estate).
Type info for scan is taken from the non recursive query (saved in RecursionState.subquery) and stored in RecursionState.ss and Estate/es_rscan_tupledesc.
Executing Recursion Plan
Recursion Plan is executed by ExecRecursion(). The work horse is RecursionNext(). First it execute a plan for non recursive term and the result is stored in WT. Then execute a plan for recursive term, which is a subplan of the recursion plan. If the plan returns one or more rows, they are store in IT. IT and WT are swapped and recreate IT. If no row is returned, the recursion is ended and ExecRecursion() returns NULL.
Initializing RecursiveScan plan
This is very much same as the initialization of RecursionScan except that type info for scan is taken from Estate.es_rscan_tupledesc which is initialized by ExecInitRecursion().
Executing RecursiveScan plan
Recursive scan is executed by ExecRecursiveScan(). The work horse is RecursivescanNext(). It just returns tuples store in WT. Work table is stored in Estate structure.
Miscellaneous changes in Executor
- Hash join plan does not always recreate hash until hash join ended. If hash join has RecursiveScan as its subplan, the hash needs to recreate. To solve the problem, "has_recursivescan" is added to PlanState structure, which is true if upper plan has RecursiveScan as a subplan and hash is recreated.
- exec_append_initialize_next() is executed while processing recursion. The function is now a global one.
Limitations
- SEARCH and CYCLE clauses are not implemented.
- Mutual recursion is not allowed.
- Only the last SELECT of UNION ALL can include the recursion name.
- Cost of Recursion and RecursiveScan plan are always 0.
More fun with CTEs
CTEs are being used more and more for SQL expressions. A CTE bounty page has been started to encourage their use. There are also a few snippets that rely on them