Index Maintenance
From PostgreSQL wiki
One day, you will probably need to cope with routine reindexing on your database, particularly if you don't use VACUUM aggressively enough. A particularly handy command in this area is CLUSTER, which can help with other types of cleanup.
Avoid using VACUUM FULL in versions 8.4 and earlier.
Contents |
Index summary
Here's a sample query to pull the number of rows, indexes, and some info about those indexes for each table. (Only works on 8.3; ditch the pg_size_pretty if you’re on an earlier version)
SELECT pg_class.relname, pg_size_pretty(pg_class.reltuples::bigint) AS rows_in_bytes, pg_class.reltuples AS num_rows, count(indexname) AS number_of_indexes, CASE WHEN x.is_unique = 1 THEN 'Y' ELSE 'N' END AS UNIQUE, SUM(case WHEN number_of_columns = 1 THEN 1 ELSE 0 END) AS single_column, SUM(case WHEN number_of_columns IS NULL THEN 0 WHEN number_of_columns = 1 THEN 0 ELSE 1 END) AS multi_column FROM pg_namespace LEFT OUTER JOIN pg_class ON pg_namespace.oid = pg_class.relnamespace LEFT OUTER JOIN (SELECT indrelid, max(CAST(indisunique AS integer)) AS is_unique FROM pg_index GROUP BY indrelid) x ON pg_class.oid = x.indrelid LEFT OUTER JOIN ( SELECT c.relname AS ctablename, ipg.relname AS indexname, x.indnatts AS number_of_columns FROM pg_index x JOIN pg_class c ON c.oid = x.indrelid JOIN pg_class ipg ON ipg.oid = x.indexrelid ) AS foo ON pg_class.relname = foo.ctablename WHERE pg_namespace.nspname='public' AND pg_class.relkind = 'r' GROUP BY pg_class.relname, pg_class.reltuples, x.is_unique ORDER BY 2;
Index size/usage statistics
Table & index sizes along which indexes are being scanned and how many tuples are fetched. See Disk Usage for another view that includes both table and index sizes.
SELECT t.tablename, indexname, c.reltuples AS num_rows, pg_size_pretty(pg_relation_size(quote_ident(t.tablename)::text)) AS table_size, pg_size_pretty(pg_relation_size(quote_ident(indexrelname)::text)) AS index_size, CASE WHEN x.is_unique = 1 THEN 'Y' ELSE 'N' END AS UNIQUE, idx_scan AS number_of_scans, idx_tup_read AS tuples_read, idx_tup_fetch AS tuples_fetched FROM pg_tables t LEFT OUTER JOIN pg_class c ON t.tablename=c.relname LEFT OUTER JOIN (SELECT indrelid, max(CAST(indisunique AS integer)) AS is_unique FROM pg_index GROUP BY indrelid) x ON c.oid = x.indrelid LEFT OUTER JOIN ( SELECT c.relname AS ctablename, ipg.relname AS indexname, x.indnatts AS number_of_columns, idx_scan, idx_tup_read, idx_tup_fetch,indexrelname FROM pg_index x JOIN pg_class c ON c.oid = x.indrelid JOIN pg_class ipg ON ipg.oid = x.indexrelid JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid ) AS foo ON t.tablename = foo.ctablename WHERE t.schemaname='public' ORDER BY 1,2;
Duplicate indexes
Finds multiple indexes that have the same set of columns, same opclass, expression and predicate -- which make them equivalent. Usually it's safe to drop one of them, but I give no guarantees. :)
SELECT pg_size_pretty(sum(pg_relation_size(idx))::bigint) AS size, (array_agg(idx))[1] AS idx1, (array_agg(idx))[2] AS idx2, (array_agg(idx))[3] AS idx3, (array_agg(idx))[4] AS idx4 FROM ( SELECT indexrelid::regclass AS idx, (indrelid::text ||E'\n'|| indclass::text ||E'\n'|| indkey::text ||E'\n'|| coalesce(indexprs::text,'')||E'\n' || coalesce(indpred::text,'')) AS KEY FROM pg_index) sub GROUP BY KEY HAVING count(*)>1 ORDER BY sum(pg_relation_size(idx)) DESC;
Index Bloat
One of the common needs for a REINDEX is when indexes become bloated due to either sparse deletions or use of VACUUM FULL. An estimator for the amount of bloat in a table has been included in the check_postgres script, which you can call directly or incorporate into a larger monitoring system. Scripts based on this code and/or its concepts from other sources include:
- bloat view (Dimitri Fontaine)
- Visualizing Postgres - index_byte_sizes view (Michael Glaesemann, myYearbook)
- OmniTI Tasty Treats for PostgreSQL - shell and Perl pg_bloat_report scripts
Unused Indexes
Since indexes add significant overhead to any table change operation, they should be removed if they are not being used for either queries or constraint enforcement (such as making sure a value is unique). How to find such indexes:
References
- Index statistics queries from "Refactoring SQL Applications" review
