Freezing XIDs without extra write I/O

From PostgreSQL wiki

This is an experimental idea. If it works well, we can freeze XIDs without extra I/O and without breaking on-disk compatibility. The idea was originally conceived by Robert Haas and discussed on -hackers; Heikki Linnakangas then posted some PoC patches, but the project was abandoned.

The idea described on this page is based on that original proposal, but I have improved it in a number of ways.

Advantages

The biggest advantages of this idea are:

  • Freeze XIDs without extra disk I/O.
  • No additional information needs to be stored in pages.
  • On-disk compatibility is not broken.

The following restrictions still apply:

  • No transaction can remain running while more than 2^31 XIDs are consumed.
  • Each table must be vacuumed at least once every 2^31 transactions.

Idea

There is a well-known idea, discussed for a long time, of adding a transaction epoch to the special area of heap pages so that a 64-bit XID can be represented by combining the XID on a tuple with the epoch on its page. The idea here is that, instead of storing the epoch explicitly, we infer how old an XID is from the PageLSN.
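For reference, the explicit-epoch scheme mentioned above amounts to the following arithmetic. This is a minimal sketch, assuming a 32-bit per-tuple XID and a per-page epoch counter; the function name full_xid is hypothetical and not a PostgreSQL API.

```python
XID_BITS = 32  # width of the XID stored on each tuple


def full_xid(page_epoch: int, tuple_xid: int) -> int:
    """Combine a per-page epoch with a 32-bit tuple XID into a 64-bit XID."""
    if not 0 <= tuple_xid < 2 ** XID_BITS:
        raise ValueError("tuple_xid must fit in 32 bits")
    # The epoch occupies the high 32 bits, the tuple XID the low 32 bits.
    return (page_epoch << XID_BITS) | tuple_xid
```

The proposal on this page avoids storing page_epoch at all and derives the same ordering information from the PageLSN.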

The new infrastructure we need records the LSN at which each new half-epoch of transaction IDs begins, that is, each range of 2^31 (approximately 2 billion) XIDs. The first time an XID is assigned in a range, we record the current LSN while holding XidGenLock. We end up with half-epoch boundaries like:

1. first use of XID 3      @ LSN 0/0012
2. first use of XID 2*2^30 @ LSN 1/0023
3. first use of XID 4*2^30 @ LSN 2/0034
4. first use of XID 6*2^30 @ LSN 3/0045

The table tells us, for example, that XID 2*2^30 was first used at LSN 1/0023, so any XID newer than 2*2^30 can appear only after LSN 1/0023.

Since no transaction can remain running while more than 2^31 XIDs are consumed, the XIDs on a page can span at most 2 ranges, so we need to remember the begin-LSNs of at least 2 ranges: the previous range and the current range. These two begin-LSNs can be stored in the pg_control file (or we might want to keep all of them for forensics).
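The bookkeeping behind this table can be sketched as a small Python model. It is only an illustration under stated assumptions: XIDs are treated as a monotonically increasing 64-bit counter (wraparound and the special XIDs below 3 are ignored), LSNs are written in the usual hi/lo hexadecimal form, and the names BoundaryLog, on_xid_assigned, and parse_lsn are hypothetical, not PostgreSQL APIs.

```python
from dataclasses import dataclass, field

HALF_EPOCH = 2 ** 31  # number of XIDs in one half-epoch range


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' (hexadecimal) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


@dataclass
class BoundaryLog:
    """Hypothetical in-memory model of the half-epoch boundary table."""
    boundaries: list = field(default_factory=list)  # (first_xid, begin_lsn)

    def on_xid_assigned(self, xid: int, current_lsn: int) -> None:
        """Conceptually called with XidGenLock held for every new XID."""
        range_index = xid // HALF_EPOCH
        # Record a boundary only for the first XID seen in each range.
        if len(self.boundaries) <= range_index:
            self.boundaries.append((xid, current_lsn))


log = BoundaryLog()
log.on_xid_assigned(3, parse_lsn("0/0012"))        # first XID ever used
log.on_xid_assigned(4, parse_lsn("0/0013"))        # same range, not recorded
log.on_xid_assigned(2 * 2**30, parse_lsn("1/0023"))  # new range begins
```

After these calls, log.boundaries holds the first two rows of the table above.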

When updating a page, we freeze all XIDs on the page if the page LSN is older than the previous range's begin-LSN. That would only happen when we're updating the page, in which case the page is dirtied anyway, so it wouldn't cause any extra I/O.
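The freeze-on-update rule can be sketched as follows. This is a toy model, not PostgreSQL code: Page, HeapTuple, and update_page are hypothetical names, LSNs are plain integers, and "freezing" is modelled as setting a boolean flag on each existing tuple.

```python
from dataclasses import dataclass, field


@dataclass
class HeapTuple:
    xmin: int
    frozen: bool = False


@dataclass
class Page:
    lsn: int
    tuples: list = field(default_factory=list)


def update_page(page: Page, new_tuple: HeapTuple, new_lsn: int,
                prev_range_begin_lsn: int) -> None:
    """Add new_tuple to the page, freezing the old tuples first if needed."""
    # The page was last written before the previous range began, so every
    # XID already on it predates that range: freeze them all now. The page
    # is about to be dirtied by the update anyway, so no extra write occurs.
    if page.lsn < prev_range_begin_lsn:
        for tup in page.tuples:
            tup.frozen = True
    page.tuples.append(new_tuple)
    page.lsn = new_lsn
```

A page whose LSN is at or after the previous range's begin-LSN is left unfrozen, because it may still hold XIDs from the previous or current range.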

(Auto)vacuum still needs to be invoked for each table at least once every 2bn XIDs (one half-epoch). That way, we can ensure that all aborted XIDs are removed before their age reaches 2bn. Since this vacuum no longer needs to scan pages for freezing purposes, it does not need to scan all-visible pages. Therefore, relfrozenxid (and relminmxid) now means that any XID older than relfrozenxid can still be present only if it is committed.

Suppose the latest XID is 7*2^30 and we read a page whose PageLSN is 1/1000. From the half-epoch boundary information we know that all XIDs on the page fall between 3 and 4*2^30-1. Because 1/1000 is older than 2/0034, the begin-LSN of the previous range, we know that all of those XIDs are committed and visible to us without any CLOG lookups.
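The example can be worked through in code using the boundary table from earlier on this page. This is a sketch under assumptions: XIDs are treated as a 64-bit counter without wraparound, and parse_lsn, xid_bounds_for_page, and all_xids_known_committed are hypothetical helper names rather than PostgreSQL functions.

```python
import bisect


def parse_lsn(text):
    """Convert an LSN written as 'hi/lo' (hexadecimal) into an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


# (first XID of the range, LSN at which that XID was first used)
BOUNDARIES = [
    (3,         parse_lsn("0/0012")),
    (2 * 2**30, parse_lsn("1/0023")),
    (4 * 2**30, parse_lsn("2/0034")),
    (6 * 2**30, parse_lsn("3/0045")),
]


def xid_bounds_for_page(page_lsn, boundaries=BOUNDARIES):
    """Return (lowest, highest) XID that can appear on a page with this LSN.

    The upper bound is None when the page was written in the newest range.
    """
    lsns = [lsn for _, lsn in boundaries]
    i = bisect.bisect_right(lsns, page_lsn) - 1  # range the page was written in
    if i < 0:
        raise ValueError("page LSN predates the first recorded boundary")
    lowest = boundaries[max(i - 1, 0)][0]  # a page spans at most two ranges
    highest = boundaries[i + 1][0] - 1 if i + 1 < len(boundaries) else None
    return lowest, highest


def all_xids_known_committed(page_lsn, latest_xid, boundaries=BOUNDARIES):
    """True if the page is older than the previous range's begin-LSN."""
    first_xids = [xid for xid, _ in boundaries]
    current = bisect.bisect_right(first_xids, latest_xid) - 1
    if current < 1:
        return False  # there is no previous range to compare against
    prev_range_begin_lsn = boundaries[current - 1][1]
    return page_lsn < prev_range_begin_lsn


page_lsn = parse_lsn("1/1000")
bounds = xid_bounds_for_page(page_lsn)
visible = all_xids_known_committed(page_lsn, latest_xid=7 * 2**30)
```

Here bounds evaluates to (3, 4*2^30-1) and visible to True, matching the reasoning above. A page with PageLSN 2/5000 would not qualify, since it may contain XIDs from the previous range.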

Visibility Checks

By recording an LSN for every half-epoch, we can illustrate the relationship between LSNs and XIDs, and the LSN ranges in which the XIDs of each half-epoch can exist, as follows:

     0/0012      1/0023       2/0034     3/0045
       +---------------------------------------------------------> LSN
     3 | +----------+-----------+
       | |          |           |
       | |          |           |
       | |          |           |
2*2^30 | +----------+-----------+----------+
       |            |           |          |
       |            |           |          |
       |            |           |          |
4*2^30 |            +-----------+----------+----------
       |                        |          |          
       |                        |          |              
       |                        |          |           
6*2^30 |                        +----------+----------
       |                                   |
       |                                   |   * we're here
       |                                   |
8*2^30 |                                   +----------
       v
    XID

MultiXacts

VACUUM FREEZE

Todo: Does VACUUM FREEZE still freeze XIDs?

Parameters

Todo: Do we need to add, remove, or redesign the XID-wraparound-related parameters?

  • autovacuum_freeze_table_age
  • autovacuum_freeze_min_age
  • autovacuum_freeze_max_age

Links