VACCUM FULL9
The VACUUM
command and associated autovacuum process are PostgreSQL's way of controlling MVCC bloat. The VACUUM
command has two main forms of interest - ordinary VACUUM
, and VACUUM FULL
. These two commands are actually quite different and should not be confused.
VACUUM
scans a table, marking tuples that are no longer needed as free space so that they can be overwritten by newly inserted or updated data. See Introduction to VACUUM, ANALYZE, EXPLAIN, and COUNT and the PostgreSQL documentation on MVCC for a detailed explanation of this. Note that you should rarely need to use the VACUUM
command directly on a modern PostgreSQL database, as autovacuum should take care of it for you if properly set up.
VACUUM FULL
, unlike VACUUM
, touches data that has not been deleted. The VACUUM system is designed with the goal that you should never be running VACUUM FULL
regularly, and doing so can have costs like huge WAL archive output and high loads on any streaming replication servers.
VACUUM FULL
without arguments will operate on the entire current database. Basically, the VACUUM FULL
logic is:
for (each table in db) { build new copy of table data; build new indexes; commit the swap of the new data for the old; delete the old files, freeing them back to the FS }
VACUUM
uses a transaction internally for each table processed, so it can commit as it goes along. Therefore, VACUUM FULL
cannot be run in an existing transaction.
When to use VACUUM FULL
and when not to
Many people, either based on misguided advice on the 'net or on the assumption that it must be "better", periodically run VACUUM FULL
on their tables. This is generally not recommended and in some cases can make your database slower, not faster.
VACUUM FULL
is only needed when you have a table that is mostly dead rows - ie, the vast majority of its contents have been deleted. It should not be used for table optimization or periodic maintenance, as it's generally counterproductive. In most cases the freed space will be promptly re-allocated, possibly increasing file-system-level fragmentation and requiring file system space allocations that're slower than just re-using existing free space within a table.
When you run VACUUM FULL
on a table, that table is locked for the duration of the operation, so nothing else can work with the table. VACUUM FULL
is much slower than a normal VACUUM, so the table may be unavailable for a while.
What to use instead
Autovacuum
If autovacuum is running frequently enough and aggressively enough, your tables should never grow ("bloat") due to unreclaimed dead rows, so you should never need to return "dead" space to the OS.
Autovacuum continues to improve dramatically with every PostgreSQL version, and is a very good reason to make sure you are running the latest version. For example, past version 8.4 the free space map is now managed automatically, removing a no-longer-necessary tuning parameter and eliminating a major source of table bloat.
If autovacuum isn't doing enough to keep your tables and indexes bloat-free, tune it, don't supplement it with manual vacuuming and reindexing. You may need to tune autovacuum to run more frequently, and/or tell autovacuum to vacuum certain frequently-updated tables more aggressively than others.
Note that VACUUM parameters can be set on a per-table basis, either at create time or with an ALTER TABLE.. see SQL-CREATETABLE-STORAGE-PARAMETERS
VACUUM
A manual VACUUM is recommended after major admin or UPDATE tasks. The pgAdmin GUI will remind you of this, for example. If you need to manually VACUUM
your tables at any time other than when running major admin or update tasks that rewrite large parts of your tables, you probably don't have autovacuum set up well enough.
CLUSTER
CLUSTER orders a table with respect to a specific index (see discussion on the CLUSTER docs page).
Consider setting a FILLFACTOR of less than the default 100, so the rewritten table has some free space pre-alloacated within it for updates and new inserts; otherwise you'll just get file system allocations as soon as you do anything to the table.
TRUNCATE TABLE
If you've been using VACUUM FULL
to free space from a table that's periodically completely emptied using DELETE FROM tablename;
(without a WHERE
clause), you can use TRUNCATE TABLE
to replace those two steps with one much, much faster one.
Instead of:
DELETE FROM tablename; VACUUM FULL tablename;
write:
TRUNCATE TABLE tablename;
Please make sure to read the caveats in the notes on the TRUNCATE TABLE
documentation. If TRUNCATE TABLE
isn't suitable for your needs, you can use DELETE
followed by CLUSTER
instead.
If using DELETE
on a table that is the target of foreign key references, consider adding an index to the referencing columns. That will allow checks for foreign key enforcement to avoid a sequential scan on the referencing table, making DELETE
from the referenced table vastly faster.
DROP
Unlike DELETE, using DROP on large objects will return disk space to the operating system without any additional actions.. for example DROP TABLE or DROP SCHEMA.