Efficient min/max over partitioned table
From PostgreSQL wiki
(Update to document version compatibility) |
(Reorder example to read more better) |
||
| Line 46: | Line 46: | ||
== Example usage == | == Example usage == | ||
<source lang="sql"> | <source lang="sql"> | ||
| + | db=> explain select max(foo_id) from foo; | ||
| + | QUERY PLAN | ||
| + | -------------------------------------------------------------------------------------- | ||
| + | Aggregate (cost=177032.00..177032.01 rows=1 width=4) | ||
| + | -> Append (cost=0.00..150477.20 rows=10621920 width=4) | ||
| + | -> Seq Scan on foo (cost=0.00..34.00 rows=2400 width=4) | ||
| + | -> Seq Scan on foo_child1 foo (cost=0.00..150443.20 rows=10619520 width=4) | ||
| + | |||
db=> select max(foo_id) from foo; | db=> select max(foo_id) from foo; | ||
max | max | ||
| Line 52: | Line 60: | ||
Time: 2636.540 ms | Time: 2636.540 ms | ||
| + | |||
db=> select spc_max_part_int('foo', 'foo_id'); | db=> select spc_max_part_int('foo', 'foo_id'); | ||
spc_max_part_int | spc_max_part_int | ||
| Line 58: | Line 67: | ||
Time: 0.556 ms | Time: 0.556 ms | ||
| - | |||
| - | |||
| - | |||
| - | |||
| - | |||
| - | |||
| - | |||
| - | |||
</source> | </source> | ||
Latest revision as of 17:00, 22 June 2011
Find maximal value from partitioned table
8.3 to 9.0
PL/pgSQL
PL/pgSQL
PostgreSQL 9.0 and earlier don't support indexed MIN(x) or MAX(x) queries over partitioned/inherited tables. See MAX Partitioning with MIN Pain for a test case showing the problem, and how it is resolved starting in PostgreSQL 9.1. However, it's possible to do this efficiently in earlier versions with a custom PL/pgSQL function.
This snippet was contributed by Shaun Thomas on the pgsql-performance mailing list, then tweaked with feedback from others. The original version of the function presented there worked with PostgreSQL 8.2, and may be necessary for users on that version. The simplified version here is only compatible with PostgreSQL 8.3 and later due to its use of regclass features introduced in that version.
This function works with bigint (and integer/smallint) columns and uses the MAX() aggregate. It shouldn't be difficult to port this to other datatypes or make it work with min().
Function definition
/** * Return the Maximum bigint Value for a Partitioned Table Column * * @param string Name of table, may be schema-qualified. * @param string Name of column to search. */ CREATE OR REPLACE FUNCTION spc_max_part_int(rParent regclass, sColName text) RETURNS bigint AS $$ DECLARE rChild regclass; nMax bigint; nTemp bigint; BEGIN EXECUTE ' SELECT max(' || quote_ident(sColName) ||') FROM ONLY ' || rParent INTO nMax; FOR rChild IN SELECT inhrelid FROM pg_inherits WHERE inhparent=rParent LOOP nTemp := spc_max_part_int(rChild, sColName); nMax := greatest(nTemp, nMax); END LOOP; RETURN nMax; END; $$ LANGUAGE plpgsql STABLE;
Example usage
db=> EXPLAIN SELECT max(foo_id) FROM foo; QUERY PLAN -------------------------------------------------------------------------------------- Aggregate (cost=177032.00..177032.01 rows=1 width=4) -> Append (cost=0.00..150477.20 rows=10621920 width=4) -> Seq Scan ON foo (cost=0.00..34.00 rows=2400 width=4) -> Seq Scan ON foo_child1 foo (cost=0.00..150443.20 rows=10619520 width=4) db=> SELECT max(foo_id) FROM foo; max ---------- 10000000 Time: 2636.540 ms db=> SELECT spc_max_part_int('foo', 'foo_id'); spc_max_part_int ------------------ 10000000 Time: 0.556 ms
