Finding useless columns

From PostgreSQL wiki

Revision as of 15:13, 31 August 2011 by Intgr (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Administrative Snippets

Finding useless columns

Works with PostgreSQL

8.2+

Written in

SQL

Depends on

Nothing


This query finds columns in the whole database that have no more than 1 distinct value in its table, using planner's estimates.

This is useful for finding redundant and unused columns.

SELECT nspname, relname, attname, typname,
    (stanullfrac*100)::int AS null_percent,
    case when stadistinct >= 0 then stadistinct else abs(stadistinct)*reltuples end AS "distinct",
    case 1 when stakind1 then stavalues1 when stakind2 then stavalues2 end AS "values"
FROM pg_class c
JOIN pg_namespace ns ON (ns.oid=relnamespace)
JOIN pg_attribute ON (c.oid=attrelid)
JOIN pg_type t ON (t.oid=atttypid)
JOIN pg_statistic ON (c.oid=starelid AND staattnum=attnum)
WHERE nspname NOT LIKE E'pg\\_%' AND nspname != 'information_schema'
  AND relkind='r' AND NOT attisdropped AND attstattarget != 0
  AND reltuples >= 100              -- ignore tables with fewer than 100 rows
  AND stadistinct BETWEEN 0 AND 1   -- 0 to 1 distinct values
ORDER BY nspname, relname, attname
;

See also: pg_statistic catalog documentation

Personal tools