Storing Archival Data
Archival data is data that will never be updated or deleted, stored in tables that will never have additional inserts. It is most commonly found in history tables, which are often partitioned, where a table holds some subset of data generated some time in the past. The following list is a checklist for storaging that data in the most efficient manner possible.
verify table has fillfactor of 100%
Since the data will never be modified again, we should use as few pages as possible. Note this is the default setting as of postgresql 8.3, but should be verified in case it has been modified.
modify storage parameters
This presumes your table has varlen columns to deal with. Anyone have strong recommendations for which storage setting to use? ISTM it is very data dependent, but might be good to list use cases
re-write the table
two methods for this, depending on if you want to cluster the data in your table.
vacuum full method
drop indexes
vacuum full table
create indexes with fillfactor 100%
cluster method
verify indexes have fillfactor of 100%
cluster table on index
vacuum freeze
vacuum freeze helps stave off any vacuuming in the future, reducing overhead on a number of levels. Anyone know if it also removes any pages for recent changes from the fsm?
move table/indexes to archive tablespace
if you have setup special storage for archival data, you can then move the tables and indexes over to that storage. again, this is pretty implementation dependent, but anyone have thoughts on archival storage systems they want to share?