Efficient min/max over partitioned table

From PostgreSQL wiki

Revision as of 17:00, 22 June 2011 by Gsmith (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Library Snippets

Find maximal value from partitioned table

Works with PostgreSQL

8.3 to 9.0

Written in

PL/pgSQL

Depends on

PL/pgSQL


PostgreSQL 9.0 and earlier don't support indexed MIN(x) or MAX(x) queries over partitioned/inherited tables. See MAX Partitioning with MIN Pain for a test case showing the problem, and how it is resolved starting in PostgreSQL 9.1. However, it's possible to do this efficiently in earlier versions with a custom PL/pgSQL function.

This snippet was contributed by Shaun Thomas on the pgsql-performance mailing list, then tweaked with feedback from others. The original version of the function presented there worked with PostgreSQL 8.2, and may be necessary for users on that version. The simplified version here is only compatible with PostgreSQL 8.3 and later due to its use of regclass features introduced in that version.

This function works with bigint (and integer/smallint) columns and uses the MAX() aggregate. It shouldn't be difficult to port this to other datatypes or make it work with min().

Function definition

/**
* Return the Maximum bigint Value for a Partitioned Table Column
*
* @param string  Name of table, may be schema-qualified.
* @param string  Name of column to search.
*/
CREATE OR REPLACE FUNCTION spc_max_part_int(rParent regclass, sColName text)
RETURNS bigint AS
$$
DECLARE
  rChild regclass;
  nMax bigint;
  nTemp bigint;
BEGIN
 
  EXECUTE '
   SELECT max(' || quote_ident(sColName) ||')
     FROM ONLY ' || rParent
  INTO nMax;
 
  FOR rChild IN
    SELECT inhrelid
      FROM pg_inherits
      WHERE inhparent=rParent
  LOOP
    nTemp := spc_max_part_int(rChild, sColName);
    nMax := greatest(nTemp, nMax);
  END LOOP;
 
  RETURN nMax;
 
END;
$$ LANGUAGE plpgsql STABLE;

Example usage

db=> EXPLAIN SELECT max(foo_id) FROM foo;
                                      QUERY PLAN                                      
--------------------------------------------------------------------------------------
 Aggregate  (cost=177032.00..177032.01 rows=1 width=4)
   ->  Append  (cost=0.00..150477.20 rows=10621920 width=4)
         ->  Seq Scan ON foo  (cost=0.00..34.00 rows=2400 width=4)
         ->  Seq Scan ON foo_child1 foo  (cost=0.00..150443.20 rows=10619520 width=4)
 
db=> SELECT max(foo_id) FROM foo;
   max    
----------
 10000000
 
Time: 2636.540 ms
 
db=> SELECT spc_max_part_int('foo', 'foo_id');
 spc_max_part_int 
------------------
         10000000
 
Time: 0.556 ms
Personal tools