From PostgreSQL wiki
One day, you will probably need to cope with routine reindexing on your database, particularly if you don't use VACUUM aggressively enough. An especially handy command in this area is CLUSTER, which can also help with other types of cleanup.
Avoid using VACUUM FULL in versions 8.4 and earlier.
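As a rough sketch of what routine reindexing looks like in practice, the commands below rebuild indexes directly or through CLUSTER. The names my_table and my_table_pkey are placeholders, not objects from this page, and both commands take locks that block normal write traffic, so test them on a copy before scheduling them.

```sql
-- Hypothetical names: my_table and my_table_pkey are placeholders.

-- Rebuild a single index, or every index on a table.
-- A plain REINDEX blocks writes to the parent table while it runs.
REINDEX INDEX my_table_pkey;
REINDEX TABLE my_table;

-- Rewrite the table in the order of the given index; this also
-- rebuilds all of the table's indexes. CLUSTER holds an
-- ACCESS EXCLUSIVE lock on the table for the duration.
CLUSTER my_table USING my_table_pkey;

-- Refresh planner statistics after the rewrite.
ANALYZE my_table;
```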
Here's a sample query to pull the number of rows, indexes, and some info about those indexes for each table. (Only works on 8.3; ditch the pg_size_pretty if you’re on an earlier version)
SELECT
    pg_class.relname,
    pg_size_pretty(pg_class.reltuples::BIGINT) AS rows_in_bytes,
    pg_class.reltuples AS num_rows,
    COUNT(indexname) AS number_of_indexes,
    CASE WHEN x.is_unique = 1 THEN 'Y' ELSE 'N' END AS UNIQUE,
    SUM(CASE WHEN number_of_columns = 1 THEN 1 ELSE 0 END) AS single_column,
    SUM(CASE WHEN number_of_columns IS NULL THEN 0
             WHEN number_of_columns = 1 THEN 0
             ELSE 1 END) AS multi_column
FROM pg_namespace
LEFT OUTER JOIN pg_class ON pg_namespace.oid = pg_class.relnamespace
LEFT OUTER JOIN (
    SELECT indrelid, MAX(CAST(indisunique AS INTEGER)) AS is_unique
    FROM pg_index
    GROUP BY indrelid
) x ON pg_class.oid = x.indrelid
LEFT OUTER JOIN (
    SELECT c.relname AS ctablename, ipg.relname AS indexname, x.indnatts AS number_of_columns
    FROM pg_index x
    JOIN pg_class c ON c.oid = x.indrelid
    JOIN pg_class ipg ON ipg.oid = x.indexrelid
) AS foo ON pg_class.relname = foo.ctablename
WHERE pg_namespace.nspname = 'public'
  AND pg_class.relkind = 'r'
GROUP BY pg_class.relname, pg_class.reltuples, x.is_unique
ORDER BY 2;
Index size/usage statistics
Table and index sizes, along with which indexes are being scanned and how many tuples are fetched. See Disk Usage for another view that includes both table and index sizes.
SELECT
    t.tablename,
    indexname,
    c.reltuples AS num_rows,
    pg_size_pretty(pg_relation_size(quote_ident(t.tablename)::text)) AS table_size,
    pg_size_pretty(pg_relation_size(quote_ident(indexrelname)::text)) AS index_size,
    CASE WHEN indisunique THEN 'Y' ELSE 'N' END AS UNIQUE,
    idx_scan AS number_of_scans,
    idx_tup_read AS tuples_read,
    idx_tup_fetch AS tuples_fetched
FROM pg_tables t
LEFT OUTER JOIN pg_class c ON t.tablename = c.relname
LEFT OUTER JOIN (
    SELECT c.relname AS ctablename, ipg.relname AS indexname, x.indnatts AS number_of_columns,
           idx_scan, idx_tup_read, idx_tup_fetch, indexrelname, indisunique
    FROM pg_index x
    JOIN pg_class c ON c.oid = x.indrelid
    JOIN pg_class ipg ON ipg.oid = x.indexrelid
    JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid
) AS foo ON t.tablename = foo.ctablename
WHERE t.schemaname = 'public'
ORDER BY 1, 2;
Finds multiple indexes that have the same set of columns, same opclass, expression and predicate -- which make them equivalent. Usually it's safe to drop one of them, but I give no guarantees. :)
SELECT
    pg_size_pretty(SUM(pg_relation_size(idx))::BIGINT) AS SIZE,
    (array_agg(idx))[1] AS idx1,
    (array_agg(idx))[2] AS idx2,
    (array_agg(idx))[3] AS idx3,
    (array_agg(idx))[4] AS idx4
FROM (
    SELECT indexrelid::regclass AS idx,
           (indrelid::text ||E'\n'|| indclass::text ||E'\n'|| indkey::text ||E'\n'||
            COALESCE(indexprs::text,'')||E'\n' || COALESCE(indpred::text,'')) AS KEY
    FROM pg_index
) sub
GROUP BY KEY
HAVING COUNT(*) > 1
ORDER BY SUM(pg_relation_size(idx)) DESC;
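Before dropping anything the query above reports, it may help to compare the full definitions side by side and check whether either index backs a primary key or unique constraint. The following is only a sketch; my_table and my_table_col_idx2 are hypothetical names standing in for a table and one of its reported duplicates.

```sql
-- Hypothetical names: my_table and my_table_col_idx2 are placeholders.

-- List every index on the table with its full definition, so the
-- reported duplicates can be compared by eye.
SELECT indexrelid::regclass        AS idx,
       pg_get_indexdef(indexrelid) AS definition,
       indisprimary,
       indisunique
FROM pg_index
WHERE indrelid = 'my_table'::regclass
ORDER BY 1;

-- Once satisfied that the definitions match and the index does not
-- back a constraint, drop the redundant copy.
DROP INDEX my_table_col_idx2;
```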
Based on check_postgres
One of the common reasons to run REINDEX is that indexes have become bloated, due to either sparse deletions or use of VACUUM FULL (in pre-9.0 versions). An estimator for the amount of bloat in a table has been included in the check_postgres script, which you can call directly or incorporate into a larger monitoring system. Scripts based on this code and/or its concepts from other sources include:
- bloat view (Dimitri Fontaine)
- Visualizing Postgres - index_byte_sizes view (Michael Glaesemann, myYearbook)
- OmniTI Tasty Treats for PostgreSQL - shell and Perl pg_bloat_report scripts
A newer query has been created to give a better bloat estimate for B-tree indexes. Unlike the query from check_postgres, this one focuses only on B-tree indexes and their disk layout.
See articles about it.
The monitoring script check_pgactivity includes a check based on this work.
Since indexes add significant overhead to any table change operation, they should be removed if they are not being used for either queries or constraint enforcement (such as making sure a value is unique). How to find such indexes:
- Index pruning techniques
- Finding unused indexes
- Finding useless indexes
- 'Monitor unused indexes' by Johnny Morano
- Index statistics queries from "Refactoring SQL Applications" review
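As a starting point alongside those resources, a minimal sketch of an unused-index check is shown below. It assumes the statistics collector is enabled and that the counters have been accumulating long enough to cover the full workload; the counters are per server, so an index unused here may still be scanned on a replica. Unique indexes are excluded because they enforce constraints even when never scanned.

```sql
-- Indexes never scanned since statistics were last reset,
-- largest first. Unique indexes (including primary keys) are
-- skipped because they are needed for constraint enforcement.
SELECT s.schemaname,
       s.relname      AS table_name,
       s.indexrelname AS index_name,
       s.idx_scan     AS number_of_scans,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM pg_stat_user_indexes s
JOIN pg_index i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisunique
ORDER BY pg_relation_size(s.indexrelid) DESC;
```

Treat the result as a list of candidates to investigate rather than a list of indexes to drop.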