Synchronous Transfer

From PostgreSQL wiki
Jump to navigationJump to search

Synchronous Transfer is responsible for maintaining file system level consistency between master and standby server. In Streaming Replication when crash happens, there is possibility that WAL files which are present on master may not reach to the standby server.So at the time of fail-over the fresh backup from new master needs to be taken onto new standby.

Taking fresh backup may take very long time for large databases. Such a long recovery time is not acceptable in many cases and could break the service line agreement of recovery.

By using synchronous_transfer parameter user can configure fail-back safe standby which can avoid such a situation.

  • Here is discussion about this.

Usage

Users Overview

synchronous_transfer specifies whether master will wait for data page modifications AND/OR for transaction commit before corresponding WAL record is replicated to standby server.

The valid values to synchronous_transfer parameter are 'all', 'data_flush' and 'commit':

  • commit
    • This value specifies that transaction commit wait for the transaction's WAL records to be replicated on the standby server. This setting is same as setting synchronous_commit to 'on'.
    • Set synchronous_transfer to this value if there is no need to guard against the WAL inconsistency.
    • Standby server will not behave as fail-back safe standby.
  • data_flush
    • This value specifies that master wait for data page modifications before corresponding WAL record is replicated to standby server but transaction commit will not wait for transaction's WAL records to be replicated on the standby server.
    • Standby server will not behave as Asynchronous fail-back safe standby.
  • all
    • This value specifies that master wait for data page modifications as well as transaction commit before corresponding WAL record is replicated to standby server.
    • Standby server will not behave as Synchronous fail-back safe standby.

Types of Fail-back safe standby

With the synchronous_transfer parameter now two new types of standby setups are possible: Asynchronous fail-back safe standby and Synchronous fail-back safe standby. Hence 4 different ways of configuring Standby servers are possible.

  • Synchronous fail-back safe standby
    • synchronous_standby_names = <server name>
    • synchronous_transfer = all
  • Asynchronous fail-back safe standby
    • synchronous_standby_names = <server name>
    • synchronous_transfer = data_flush
  • Pure synchronous standby
    • synchronous_standby_names = <server name>
    • synchronous_transfer = commit
  • Pure asynchronous standby
    • No setting required

How to Use

  • 2. Set synchronous_transfer parameter to 'all' on the master server
$ $EDITOER postgresql.conf

synchronous_transfer = all

(Above setting will configure synchronous fail-back safe standby,to configure Asynchronous fail-back safe standby set synchronous_transfer parameter to 'data_flush')

that's it! configuration of fail-back safe standby is complete.

You can test the setup of fail-back safe standby with following steps:

  • 3. Keep running transactions on master which generates lots of WAL and crash the master server.
#For example : Kill the postmaster process of the primary server which is 'AAA' server, whose pid is 16764.
$ ps x | grep postgres
16764 pts/4    S      0:00 /home/postgres/pgsql/AAA/bin/postgres -D data
16768 ?        Ss     0:00 postgres: checkpointer process
16769 ?        Ss     0:00 postgres: writer process
16770 ?        Ss     0:00 postgres: wal writer process
16771 ?        Ss     0:00 postgres: autovacuum launcher process
16772 ?        Ss     0:00 postgres: stats collector process
16780 pts/4    S      0:00 /home/postgres/pgsql/BBB/bin/postgres -D data
16783 ?        Ss     0:03 postgres: startup process   recovering 00000001000000000000000B
16784 ?        Ss     0:00 postgres: checkpointer process
16785 ?        Ss     0:00 postgres: writer process
16786 ?        Ss     0:00 postgres: stats collector process
16787 ?        Ss     0:00 postgres: wal receiver process   streaming 0/BD4C080
16788 ?        Ss     0:00 postgres: wal sender process postgres [local] streaming 0/BD4C080
$ kill -s SIGKILL 16764
  • 4. Promote the standby server to the primary server
$ pg_ctl promote
  • 5. Synchronizing the WAL file between primary and standby server using rsync command
$ rsync -r 192.168.1.100:/home/postgres/pgsql/BBB/pg_xlog/ /home/postgres/pgsql/AAA/pg_xlog
  • 6. Create recovery.conf for the old primary server
$ $EDITOR recovery.conf

recovery_target_timeline = 'latest'

standby_mode = on

primary_conninfo = 'application_name=AAA port=5432 user=postgres'
  • 7. Start the old primary server as new standby server
# The new standby server try to fetch WAL file which from latest checkpoint location, from pg_xlog directory at first.
# But we removed the all WAL file on the pg_xlog firectory of new standby server.
# So the new standby server will request to fetch WAL file from the new primary server.
$ pg_ctl start 

Restrictions

  • Currently only one fail-back safe standby with SYNC replication.
    • for example: if there is 2 standby servers which connects to master server (one is SYNC, another one is ASYNC) and synchronous_transfer is set 'all'. Then master server will wait only for SYNC standby server.

Impact for Performance

Todo

  • Currently this patch supports only one fail-back safe standby. It can either be synchronous or an asynchronous standby. More discussion is needed about support of multiple fail-safe standby's.