VACCUM FULL9

From PostgreSQL wiki
Jump to navigationJump to search

The VACUUM command and associated autovacuum process are PostgreSQL's way of controlling MVCC bloat. The VACUUM command has two main forms of interest - ordinary VACUUM, and VACUUM FULL. These two commands are actually quite different and should not be confused.

VACUUM scans a table, marking tuples that are no longer needed as free space so that they can be overwritten by newly inserted or updated data. See Introduction to VACUUM, ANALYZE, EXPLAIN, and COUNT and the PostgreSQL documentation on MVCC for a detailed explanation of this. Note that you should rarely need to use the VACUUM command directly on a modern PostgreSQL database, as autovacuum should take care of it for you if properly set up.

VACUUM FULL, unlike VACUUM, touches data that has not been deleted. The VACUUM system is designed with the goal that you should never be running VACUUM FULL regularly, and doing so can have costs like huge WAL archive output and high loads on any streaming replication servers.

VACUUM FULL without arguments will operate on the entire current database. Basically, the VACUUM FULL logic is: for (each table in db) { build new copy of table data; build new indexes; commit the swap of the new data for the old; delete the old files, freeing them back to the FS }

VACUUM uses a transaction internally for each table processed, so it can commit as it goes along. Therefore, VACUUM FULL cannot be run in an existing transaction.


When to use VACUUM FULL and when not to

Many people, either based on misguided advice on the 'net or on the assumption that it must be "better", periodically run VACUUM FULL on their tables. This is generally not recommended and in some cases can make your database slower, not faster.

VACUUM FULL is only needed when you have a table that is mostly dead rows - ie, the vast majority of its contents have been deleted. It should not be used for table optimization or periodic maintenance, as it's generally counterproductive. In most cases the freed space will be promptly re-allocated, possibly increasing file-system-level fragmentation and requiring file system space allocations that're slower than just re-using existing free space within a table.

When you run VACUUM FULL on a table, that table is locked for the duration of the operation, so nothing else can work with the table. VACUUM FULL is much slower than a normal VACUUM, so the table may be unavailable for a while.

What to use instead

Autovacuum

If autovacuum is running frequently enough and aggressively enough, your tables should never grow ("bloat") due to unreclaimed dead rows, so you should never need to return "dead" space to the OS.

Autovacuum continues to improve dramatically with every PostgreSQL version, and is a very good reason to make sure you are running the latest version. For example, past version 8.4 the free space map is now managed automatically, removing a no-longer-necessary tuning parameter and eliminating a major source of table bloat.

If autovacuum isn't doing enough to keep your tables and indexes bloat-free, tune it, don't supplement it with manual vacuuming and reindexing. You may need to tune autovacuum to run more frequently, and/or tell autovacuum to vacuum certain frequently-updated tables more aggressively than others.

Note that VACUUM parameters can be set on a per-table basis, either at create time or with an ALTER TABLE.. see SQL-CREATETABLE-STORAGE-PARAMETERS

VACUUM

A manual VACUUM is recommended after major admin or UPDATE tasks. The pgAdmin GUI will remind you of this, for example. If you need to manually VACUUM your tables at any time other than when running major admin or update tasks that rewrite large parts of your tables, you probably don't have autovacuum set up well enough.

CLUSTER

CLUSTER orders a table with respect to a specific index (see discussion on the CLUSTER docs page).

Consider setting a FILLFACTOR of less than the default 100, so the rewritten table has some free space pre-alloacated within it for updates and new inserts; otherwise you'll just get file system allocations as soon as you do anything to the table.

TRUNCATE TABLE

If you've been using VACUUM FULL to free space from a table that's periodically completely emptied using DELETE FROM tablename; (without a WHERE clause), you can use TRUNCATE TABLE to replace those two steps with one much, much faster one.

Instead of:

DELETE FROM tablename;
VACUUM FULL tablename;

write:

TRUNCATE TABLE tablename;

Please make sure to read the caveats in the notes on the TRUNCATE TABLE documentation. If TRUNCATE TABLE isn't suitable for your needs, you can use DELETE followed by CLUSTER instead.

If using DELETE on a table that is the target of foreign key references, consider adding an index to the referencing columns. That will allow checks for foreign key enforcement to avoid a sequential scan on the referencing table, making DELETE from the referenced table vastly faster.

DROP

Unlike DELETE, using DROP on large objects will return disk space to the operating system without any additional actions.. for example DROP TABLE or DROP SCHEMA.