Count estimate

From PostgreSQL wiki

Revision as of 16:54, 4 August 2009 by 3manuek (Talk | contribs)

Jump to: navigation, search

Snippets

count_estimate function

Works with PostgreSQL

Any version

Written in

plpgsql

Depends on

Nothing

Author: Michael Fuhr

Comment by: Emanuel Calvo Franco

This function performs the way that Postgres 'count' the rows of a query.

CREATE FUNCTION count_estimate(query text) RETURNS integer AS $$
DECLARE
    rec   record;
    rows  integer;
BEGIN
    FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
        rows := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
        EXIT WHEN rows IS NOT NULL;
    END LOOP;
 
    RETURN rows;
END;
$$ LANGUAGE plpgsql STABLE STRICT;


Taste it :)

CREATE TABLE foo (r double precision);
INSERT INTO foo SELECT random() FROM generate_series(1, 1000);
ANALYZE foo;
SELECT count_estimate('SELECT * FROM foo WHERE r < 0.1');
 count_estimate 
----------------
             97
(1 row)

EXPLAIN SELECT * FROM foo WHERE r < 0.1;
                     QUERY PLAN                      
-----------------------------------------------------
 Seq Scan on foo  (cost=0.00..17.50 rows=97 width=8)
   Filter: (r < 0.1::double precision)
(2 rows)


For large tables could be more faster than the count traditional function.

Other way, but requires a recently ANALYZE is to search in the catalog the reltuples of this object:

SELECT reltuples FROM pg_class WHERE relname = 'tabla';

It works faster, but *remember*, it depends on your statistics collect and it only is useful if you want to know *all* the tuples of a table.

In the other hand, you can use something weird like this:

SELECT COUNT(ctid)
   FROM TABLE;


References: http://archives.postgresql.org/pgsql-sql/2005-08/msg00046.php

Personal tools