Note that pg_upgrade is distributed with PostgreSQL and meets most of the goals described here (in particular with the --link option). It should be the first stop for anyone wishing to do an in-place upgrade.
The rest of this page is generally of historical interest only.
This page goes over the technical challenges in converting databases from older PostgreSQL versions to newer ones, as outlined in the associated presentation PostgreSQL upgrade project.
A long term overview of what one user would like to see...
In-place upgrade means there is no need to copy user data to upgrade between software versions. Not copying system catalogs would be ideal as well, but even on large databases the catalog size tends not to be prohibitive. The system should be online after the upgrade, meaning full read/write access to the data. Format changes should be copy-on-write rather than copy-on-read, to prevent excessive I/O after the upgrade (think of large tables). A manual command to force an update of the physical format would be nice: anything like CLUSTER or VACUUM FULL would clearly do this, though a dedicated "vacuum upgrade" might be warranted as well.
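The copy-on-write idea above can be sketched as follows. This is a toy model, not PostgreSQL code: the `SketchPage` type, the `sketch_*` functions, and the version constant are hypothetical names chosen for illustration, and the actual format conversion is reduced to bumping a version number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "new" layout version used by this sketch. */
#define SKETCH_CURRENT_VERSION 4

/* Toy stand-in for a page: only the fields this sketch needs. */
typedef struct SketchPage
{
    uint8_t layout_version;
    bool    dirty;
} SketchPage;

/* Reading never rewrites the page, so readers must understand old layouts. */
uint8_t sketch_read_page(const SketchPage *page)
{
    return page->layout_version;
}

/* Copy-on-write: the page is converted only when it is modified anyway. */
void sketch_modify_page(SketchPage *page)
{
    assert(page->layout_version <= SKETCH_CURRENT_VERSION);
    if (page->layout_version < SKETCH_CURRENT_VERSION)
        page->layout_version = SKETCH_CURRENT_VERSION; /* placeholder for real conversion */
    page->dirty = true;
}

/* Manual pass that forces every old page to the new format.
 * Returns the number of pages that had to be converted. */
size_t sketch_force_upgrade(SketchPage *pages, size_t npages)
{
    size_t converted = 0;

    for (size_t i = 0; i < npages; i++)
    {
        if (pages[i].layout_version < SKETCH_CURRENT_VERSION)
        {
            sketch_modify_page(&pages[i]);
            converted++;
        }
    }
    return converted;
}
```

With this split, a read-mostly table causes no extra writes after the upgrade, while the forced pass plays the role of the manual command mentioned above.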
In-place upgrade is a very complex project. Fortunately, it can be split into several sub-projects. The storage and catalog sub-projects are the most important, and they form the base for the others.
Storage upgrade focuses on upgrading the on-disk format from the old version to the new one. All data in PostgreSQL is stored in pages. The page size is usually 8 kB; it is defined by the BLCKSZ constant, which can be changed only at compile time. The goal is to convert pages from the old format to the new format without breaking dependencies between internal structures. See the storage dependency graph:
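As a small illustration of page-based storage, the sketch below maps block numbers to byte offsets within a single file. It assumes the default 8 kB page size; `SKETCH_BLCKSZ` and the function names are hypothetical, and splitting of large relations across several files is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size; the real BLCKSZ is fixed when the server is compiled. */
#define SKETCH_BLCKSZ 8192u

/* Byte offset of block number blkno within one file. */
uint64_t sketch_block_offset(uint32_t blkno)
{
    return (uint64_t) blkno * SKETCH_BLCKSZ;
}

/* Number of complete pages in a file of file_size bytes. */
uint32_t sketch_blocks_in_file(uint64_t file_size)
{
    uint64_t nblocks = file_size / SKETCH_BLCKSZ;

    assert(nblocks <= UINT32_MAX);
    return (uint32_t) nblocks;
}
```

A page-by-page converter would walk the block numbers from 0 up to the block count and handle each page independently.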
TODO: dependency graph
The on-disk format version is specified by PG_PAGE_LAYOUT_VERSION (PLV), defined in bufpage.h. You can see the details here.
The following structures are part of the page layout:
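The page header stores the layout version together with the page size in one 16-bit field, pd_pagesize_version: the low byte holds the version and the high byte holds the size. The sketch below mirrors that packing in standalone C. The `sketch_*` function names are hypothetical; in the PostgreSQL sources the same job is done by macros in bufpage.h.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a page size and a layout version into one 16-bit value.
 * The size must have its low byte clear (true for 8192). */
uint16_t sketch_pack_size_and_version(uint16_t size, uint8_t version)
{
    assert((size & 0xFF00) == size);
    return (uint16_t) (size | version);
}

/* Low byte: page layout version. */
uint8_t sketch_layout_version(uint16_t pd_pagesize_version)
{
    return (uint8_t) (pd_pagesize_version & 0x00FF);
}

/* High byte: page size in bytes. */
uint16_t sketch_page_size(uint16_t pd_pagesize_version)
{
    return (uint16_t) (pd_pagesize_version & 0xFF00);
}
```

An upgrade tool could use the decoded version to decide, page by page, whether any conversion is needed at all.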
- PageHeaderData (bufpage.h)
- HeapTupleHeaderData (htup.h)
- IndexTupleData (itup.h)
- ItemIdData (itemid.h)
Each structure also contains inner structures, which usually have flag field(s).
- varlena encoding change (introduced in PLV 4, PostgreSQL 8.3)
- composite data types: they share their data structure with HeapTupleHeaderData
- size expansion: converted data may no longer fit on the page
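The size expansion problem can be made concrete with a small check. The sketch below is a simplification under stated assumptions: the function name and its parameters are hypothetical, tuple lengths are given as plain arrays, alignment padding is ignored, and space freed by tuples that shrink is not reused, which makes the check conservative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if every tuple on a page can grow from old_len[i] to
 * new_len[i] using only the page's current free space. */
bool sketch_conversion_fits(uint16_t free_space,
                            const uint16_t *old_len,
                            const uint16_t *new_len,
                            size_t ntuples)
{
    uint32_t growth = 0;

    assert(ntuples == 0 || (old_len != NULL && new_len != NULL));

    for (size_t i = 0; i < ntuples; i++)
    {
        /* Count only growth; shrinking tuples are ignored on purpose. */
        if (new_len[i] > old_len[i])
            growth += (uint32_t) (new_len[i] - old_len[i]);
    }
    return growth <= free_space;
}
```

When the check fails, some tuples have to move to another page, which is what makes a purely page-local conversion insufficient.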