https://wiki.postgresql.org/api.php?action=feedcontributions&user=Joy&feedformat=atomPostgreSQL wiki - User contributions [en]2024-03-29T14:18:08ZUser contributionsMediaWiki 1.35.13https://wiki.postgresql.org/index.php?title=Streaming_Replication&diff=25909Streaming Replication2015-09-25T07:45:00Z<p>Joy: /* How to Use */</p>
<hr />
<div>'''Streaming Replication''' (SR) provides the capability to continuously ship and<br />
apply the [http://www.postgresql.org/docs/current/static/wal.html WAL XLOG] records to some number of standby servers in order to keep them current.<br />
<br />
This feature was added to PostgreSQL 9.0. The discussion below is developer-oriented and contains some out-of-date information. Users of this feature should consult the documentation or a setup tutorial instead:<br />
<br />
* [http://www.postgresql.org/docs/current/static/warm-standby.html PostgreSQL Streaming Replication Documentation] - this documentation is for the latest version, but provides links you can use to look up the docs for your version.<br />
<br />
* [[Binary Replication Tutorial]] provides an introduction to using this replication feature.<br />
<br />
* The related but independent [[Hot Standby]] feature.<br />
<br />
<br />
= Developer and historical details on the project =<br />
SR was developed for inclusion in PostgreSQL 9.0 by NTT OSS Center. The lead developer is [mailto:masao.fujii@gmail.com Masao Fujii]. [http://www.pgcon.org/2008/schedule/events/76.en.html Synchronous Log Shipping Replication Presentation] introduces the early design of the feature.<br />
<br />
= Usage =<br />
== Users Overview ==<br />
* '''Log-shipping'''<br />
** XLOG records generated in the primary are periodically shipped to the standby via the network.<br />
** In the existing warm standby, only records in a filled file are shipped, which is referred to as file-based log-shipping. In SR, XLOG records in a partially-filled XLOG file are shipped too, implementing record-based log-shipping. This means the window for data loss in SR is usually smaller than in warm standby, unless the warm standby was also configured for record-based shipping (which is complicated to set up).<br />
** The content of XLOG files written to the standby is exactly the same as on the primary. Shipped XLOG files can be used for a normal recovery and PITR.<br />
* '''Multiple standbys'''<br />
** More than one standby can establish a connection to the primary for SR. XLOG records are concurrently shipped to all these standbys. The delay/death of a standby does not harm log-shipping to other standbys.<br />
** The maximum number of standbys can be specified as a GUC variable.<br />
* '''Continuous recovery'''<br />
** The standby continuously replays XLOG records shipped without using pg_standby.<br />
** XLOG records shipped are replayed as soon as possible without waiting until XLOG file has been filled. The combination of [[Hot Standby]] and SR would make the latest data inserted into the primary visible in the standby almost immediately.<br />
** The standby periodically removes old XLOG files which are no longer needed for recovery, to prevent excessive disk usage.<br />
* '''Setup'''<br />
** The start of log-shipping does not interfere with any query processing on the primary.<br />
** The standby can be started in various conditions.<br />
*** If there are XLOG files in the archive directory and restore_command is supplied, those files are replayed first. Then the standby requests from the primary the XLOG records following the last applied one. This prevents XLOG files already present in the standby from being shipped again. Similarly, XLOG files in pg_xlog are also replayed before starting log-shipping.<br />
*** If there are no XLOG files on the standby, the standby requests the XLOG records following the starting XLOG location of recovery (the redo starting location).<br />
* '''Connection settings and authentication'''<br />
** A user can configure a connection for SR with the same settings as a normal connection (e.g., keepalive, pg_hba.conf).<br />
* '''Activation'''<br />
** The standby can keep waiting for activation as long as a user likes. This prevents the standby from being brought up automatically by a failure of recovery or a network outage.<br />
* '''Progress report'''<br />
** The primary and standby report the progress of log-shipping in PS display.<br />
* '''Graceful shutdown'''<br />
** When smart/fast shutdown is requested, the primary waits to exit until XLOG records have been sent to the standby, up to the shutdown checkpoint record.<br />
<br />
== Restrictions ==<br />
* '''Synchronous log-shipping'''<br />
** By default, SR operates in an asynchronous manner, so the commit command might return "success" to a client before the corresponding XLOG records are shipped to the standby. To enable synchronous replication, see [http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION Synchronous Replication]<br />
* '''Replication beyond timeline'''<br />
** A user has to take a fresh base backup whenever making an old standby catch up across a timeline switch (e.g., after failover).<br />
* '''Clustering'''<br />
** Postgres doesn't provide any clustering feature.<br />
<br />
== How to Use ==<br />
<br />
NB: there is overlap between this section and [[Binary Replication Tutorial]]<br />
<br />
* '''1.''' Install postgres on the primary and standby servers as usual. This requires only ''configure'', ''make'' and ''make install''.<br />
* '''2.''' Create the initial database cluster in the primary server as usual, using ''initdb''.<br />
* '''3.''' Create a user named ''replication'' with the REPLICATION privilege<br />
 $ psql -c "CREATE ROLE replication WITH REPLICATION PASSWORD 'password' LOGIN"<br />
* '''4.''' Set up connections and authentication on the primary so that the standby server can successfully connect to the ''replication'' pseudo-database on the primary.<br />
<br />
$ $EDITOR postgresql.conf<br />
<br />
listen_addresses = '192.168.0.10'<br />
<br />
$ $EDITOR pg_hba.conf<br />
<br />
# The standby server must connect with a user that has replication privileges.<br />
# TYPE DATABASE USER ADDRESS METHOD<br />
host replication replication 192.168.0.20/32 md5<br />
<br />
* '''5.''' Set up the streaming replication related parameters on the primary server.<br />
$ $EDITOR postgresql.conf<br />
<br />
# To enable read-only queries on a standby server, wal_level must be set to<br />
# "hot_standby". But you can choose "archive" if you never connect to the<br />
# server in standby mode.<br />
wal_level = hot_standby<br />
<br />
# Set the maximum number of concurrent connections from the standby servers.<br />
max_wal_senders = 5<br />
<br />
# To prevent the primary server from removing the WAL segments required for<br />
# the standby server before shipping them, set the minimum number of segments<br />
# retained in the pg_xlog directory. At least wal_keep_segments should be<br />
# larger than the number of segments generated between the beginning of<br />
# online-backup and the startup of streaming replication. If you enable WAL<br />
# archiving to an archive directory accessible from the standby, this may<br />
# not be necessary.<br />
wal_keep_segments = 32<br />
<br />
# Enable WAL archiving on the primary to an archive directory accessible from<br />
# the standby. If wal_keep_segments is a high enough number to retain the WAL<br />
# segments required for the standby server, this is not necessary.<br />
archive_mode = on<br />
archive_command = 'cp %p /path_to/archive/%f'<br />
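As a back-of-the-envelope check on wal_keep_segments, the arithmetic is simple (a sketch, assuming the default 16 MB WAL segment size; the helper name is made up for illustration):<br />

```python
# Rough arithmetic for wal_keep_segments (default segment size is 16 MB).
WAL_SEGMENT_BYTES = 16 * 1024 * 1024

def min_keep_segments(bytes_to_retain: int) -> int:
    """Smallest wal_keep_segments that pins at least the given amount
    of recent WAL in pg_xlog (ceiling division)."""
    return -(-bytes_to_retain // WAL_SEGMENT_BYTES)

# wal_keep_segments = 32 pins about 512 MB of WAL in pg_xlog:
pinned_mb = 32 * WAL_SEGMENT_BYTES // (1024 * 1024)
```

So the wal_keep_segments = 32 shown above reserves roughly half a gigabyte of disk for the standby's benefit.<br />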
* '''6.''' Start postgres on the primary server.<br />
* '''7.''' Make a base backup by copying the primary server's data directory to the standby server.<br />
<br />
** '''7.1.''' Do it with pg_(start|stop)_backup and rsync on the primary<br />
<br />
$ psql -c "SELECT pg_start_backup('label', true)"<br />
$ rsync -ac ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid<br />
$ psql -c "SELECT pg_stop_backup()"<br />
<br />
** '''7.2.''' Do it with pg_basebackup on the standby<br />
<br />
In version 9.1+, pg_basebackup can do the dirty work of fetching the entire data directory of your PostgreSQL installation from the primary and placing it onto the standby server.<br />
<br />
The prerequisite is that you make sure the standby's data directory is empty.<br />
<br />
Make sure to remove any tablespace directories as well. You can find those directories with:<br />
<br />
$ psql -c '\db'<br />
<br />
If you keep postgresql.conf and other config files in PGDATA, make a backup copy of postgresql.conf to restore after pg_basebackup runs.<br />
<br />
After you've cleared all the directories, you can use the following command to directly stream the data from the primary onto your standby server.<br />
Run it as the database superuser, typically 'postgres', to make sure the permissions are preserved (use su, sudo or whatever other tool to make sure you're not root).<br />
<br />
$ pg_basebackup -h 192.168.0.10 -D /srv/pgsql/standby -P -U replication --xlog-method=stream<br />
<br />
In version 9.3+, you can also add the -R option so it creates a minimal recovery command file for step 9 below.<br />
<br />
If you backed up postgresql.conf, now restore it.<br />
<br />
* '''8.''' Set up replication-related parameters, connections and authentication on the standby server as on the primary, so that the standby can work as a primary after failover.<br />
* '''9.''' Enable read-only queries on the standby server. But if wal_level is ''archive'' on the primary, leave hot_standby unchanged (i.e., off).<br />
$ $EDITOR postgresql.conf<br />
<br />
hot_standby = on<br />
* '''10.''' Create a recovery command file in the standby server; the following parameters are required for streaming replication.<br />
$ $EDITOR recovery.conf<br />
# Note that recovery.conf must be placed in the $PGDATA directory, even on<br />
# installations that keep postgresql.conf in a separate config directory.<br />
<br />
# Specifies whether to start the server as a standby. In streaming replication,<br />
# this parameter must be set to on.<br />
standby_mode = 'on'<br />
<br />
# Specifies a connection string which is used for the standby server to connect<br />
# with the primary.<br />
primary_conninfo = 'host=192.168.0.10 port=5432 user=replication password=password'<br />
<br />
# Specifies a trigger file whose presence should cause streaming replication to<br />
# end (i.e., failover).<br />
trigger_file = '/path_to/trigger'<br />
<br />
# Specifies a command to load archive segments from the WAL archive. If<br />
# wal_keep_segments is a high enough number to retain the WAL segments<br />
# required for the standby server, this may not be necessary. But<br />
# a large workload can cause segments to be recycled before the standby<br />
# is fully synchronized, requiring you to start again from a new base backup.<br />
restore_command = 'cp /path_to/archive/%f "%p"'<br />
* '''11.''' Start postgres in the standby server. It will start streaming replication.<br />
* '''12.''' You can calculate the replication lag by comparing the current WAL write location on the primary with the last WAL location received/replayed by the standby. These can be retrieved using ''pg_current_xlog_location'' on the primary and ''pg_last_xlog_receive_location''/''pg_last_xlog_replay_location'' on the standby, respectively.<br />
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)<br />
pg_current_xlog_location <br />
--------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_receive_location <br />
-------------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_replay_location <br />
------------------------------<br />
0/2000000<br />
(1 row)<br />
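The LSNs returned above can also be compared numerically: the part before the slash is the high 32 bits of the byte position, the part after it the low 32 bits, both in hexadecimal. A minimal sketch (function names are illustrative, not part of PostgreSQL):<br />

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/2000000' to an absolute byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def lag_bytes(primary_lsn: str, standby_lsn: str) -> int:
    """Replication lag in bytes: primary write position minus the
    standby's last received (or replayed) position."""
    return lsn_to_int(primary_lsn) - lsn_to_int(standby_lsn)

# With the sample output above, both servers report 0/2000000, so lag is 0.
```
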
* '''13.''' You can also check the progress of streaming replication by using the ''ps'' command.<br />
# The displayed LSNs indicate the byte position that the standby server has<br />
# written up to in the xlogs.<br />
[primary] $ ps -ef | grep sender<br />
postgres 6879 6831 0 10:31 ? 00:00:00 postgres: wal sender process postgres 127.0.0.1(44663) streaming 0/2000000<br />
<br />
[standby] $ ps -ef | grep receiver<br />
postgres 6878 6872 1 10:31 ? 00:00:01 postgres: wal receiver process streaming 0/2000000<br />
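If you want to scrape those ps lines in a monitoring script, a minimal sketch follows (the regex and function name are illustrative only; the exact ps display format may vary between versions):<br />

```python
import re

# Matches the trailing "streaming <LSN>" in a walsender/walreceiver
# process title, e.g. "... wal sender process ... streaming 0/2000000".
_PS_TAIL = re.compile(r"streaming\s+([0-9A-Fa-f]+/[0-9A-Fa-f]+)\s*$")

def streamed_lsn(ps_line: str):
    """Return the streamed LSN from a ps output line, or None."""
    m = _PS_TAIL.search(ps_line)
    return m.group(1) if m else None
```
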
* How to do failover<br />
** Create the trigger file in the standby after the primary fails.<br />
* How to stop the primary or the standby server<br />
** Shut it down as usual (''pg_ctl stop'').<br />
* How to restart streaming replication after failover<br />
** Repeat the operations from step '''6''': take a fresh backup, adjust the configuration, and start the original primary as the standby. The new primary server doesn't need to be stopped during these operations.<br />
* How to restart streaming replication after the standby fails<br />
** Restart postgres in the standby server after eliminating the cause of failure.<br />
* How to disconnect the standby from the primary<br />
** Create the trigger file in the standby while the primary is running. The standby will then be promoted.<br />
* How to re-synchronize the stand-alone standby after isolation<br />
** Shut down the standby as usual, then repeat the operations from step '''6'''.<br />
* If you have more than one slave, promoting one will break the others. Update their recovery.conf settings to point to the new master, set recovery_target_timeline to 'latest', scp/rsync the pg_xlog directory, and restart each slave.<br />
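For a surviving slave, the repointed recovery.conf might look like this (a sketch; 192.168.0.30 is a placeholder for the promoted master's address, not taken from the setup above):<br />

```
standby_mode = 'on'
# Point at the promoted master instead of the failed one.
primary_conninfo = 'host=192.168.0.30 port=5432 user=replication password=password'
# Follow the new timeline created by the promotion.
recovery_target_timeline = 'latest'
trigger_file = '/path_to/trigger'
```
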
<br />
= Todo =<br />
== v9.0 ==<br />
<br />
Moved to [[PostgreSQL_9.0_Open_Items]]<br />
<br />
=== Committed ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01455.php Retrying from archive and some refactoring around Read/FetchRecord().] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00395.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg02601.php SR wrongly treats the WAL-boundary.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00396.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01715.php Adjust SR for some later changes about wal-skipping.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00399.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00024.php VACUUM FULL unexpectedly writes an XLOG UNLOGGED record.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00038.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01754.php Add a message type header.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00037.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php Documentation: Add a new "Replication" chapter.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00115.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00350.php Failed assertion during recovery of partial WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00124.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00712.php A PANIC error might occur in the standby because of a partially-filled archived WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00137.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00330.php Improve the standby messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00140.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php pq_getbyte_if_available() is not working because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00198.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01488.php Walsender might emit unfit messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00239.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01236.php Streaming replication on win32, still broken.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00270.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00992.php Create new section for recovery.conf.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00295.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01824.php Assertion failure in walreceiver.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00356.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01717.php Forbid a startup of walsender during recovery, and emit a suitable message? Or allow walsender to be started also during recovery?] - [http://archives.postgresql.org/message-id/20100316090955.9A5107541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php How do we clean down the archive without using pg_standby?] - [http://archives.postgresql.org/message-id/20100318091718.BC14D7541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01510.php File-based log shipping without pg_standby doesn't replay the WAL files in pg_xlog.] - [http://archives.postgresql.org/pgsql-committers/2010-03/msg00356.php commit]<br />
<br />
== v9.1 ==<br />
=== Synchronization capability ===<br />
* Introduce the replication mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'' and ''fsync''.<br />
** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
** ''recv'' or ''fsync'' makes transaction commit wait for XLOG to be received or fsynced by the standby, respectively.<br />
** (''apply'' makes transaction commit wait for XLOG to be replayed by the standby. This mode will be supported in v9.2 or later)<br />
** The replication mode is specified in recovery.conf of the standby as well as other parameters for replication.<br />
*** The startup process reads the replication mode from recovery.conf and shares it with walreceiver via a new shared-memory variable.<br />
*** Walreceiver in turn shares it with walsender by using the replication handshake message (the existing protocol needs to be extended).<br />
** Based on the replication mode, walreceiver sends a reply to the primary indicating that replication is done up to the specified location.<br />
*** In async, walreceiver doesn't need to send any reply other than end-of-replication message.<br />
*** In recv or fsync, walreceiver sends the reply just after receiving or flushing XLOG, respectively.<br />
*** New message type for the reply needs to be defined. The reply is sent as CopyData message.<br />
** Walreceiver writes all the outstanding XLOG to disk before shutting down.<br />
** Walsender receives the reply from the standby, updates the location of the last record replicated, and announces completion of replication.<br />
*** New shared-memory variable to keep that location is required.<br />
** When processing the commit command, backend waits for XLOG to be replicated to only the standbys which are in the recv or fsync replication mode.<br />
*** Also smart shutdown waits for XLOG of shutdown checkpoint to be replicated.<br />
* Required optimization<br />
** Walsender should send outstanding XLOG without waiting for wal_sender_delay.<br />
*** When processing the commit command, backend signals walsender to send outstanding XLOG immediately.<br />
** Backend should exit the wait loop as soon as the reply arrives at the primary.<br />
*** When receiving the reply, walsender signals backends to wake from sleep and determine whether to exit the wait loop by checking the location of the last XLOG replicated.<br />
*** Only backends waiting for XLOG to be replicated up to the location contained in the reply are sent the signal.<br />
** Walsender waits for the signal from backends and the reply from the standby at the same time, by using select/poll.<br />
** Walsender reads XLOG from not only disk but also shared memory (wal buffers).<br />
** Walreceiver should flush XLOG file only when XLOG file is switched or the related page is flushed.<br />
*** When startup process or bgwriter flushes the buffer page, it checks whether the related XLOG has already been flushed via shared memory (location of the last XLOG flushed).<br />
*** It flushes the buffer page, if XLOG file has already been flushed.<br />
*** It signals walreceiver to flush XLOG file immediately and waits for the flush to complete, if XLOG file has not been flushed yet.<br />
** While the standby is catching up with the primary, those servers should ignore the replication mode and perform asynchronous replication.<br />
*** After the servers have nearly reached synchronization, they perform replication based on the specified replication mode.<br />
*** New replication states like 'catching-up', 'sync', etc need to be defined, and the state machine for them is required on both servers.<br />
*** Current replication state can be monitored on both servers via SQL.<br />
* Required timeout<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG is replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
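The per-mode reply behaviour described above can be sketched as a toy model (not PostgreSQL source; the names are invented for illustration):<br />

```python
from enum import Enum

class ReplMode(Enum):
    ASYNC = "async"   # commit never waits; no progress replies
    RECV = "recv"     # commit waits until WAL is received by the standby
    FSYNC = "fsync"   # commit waits until WAL is flushed by the standby

def sends_reply(mode: ReplMode, event: str) -> bool:
    """Would the walreceiver send a progress reply after this event?
    event is 'received' or 'flushed'."""
    if mode is ReplMode.ASYNC:
        return False  # only an end-of-replication message is sent
    if mode is ReplMode.RECV:
        return event == "received"
    return event == "flushed"  # FSYNC
```
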
<br />
== Future release ==<br />
* '''Synchronization capability'''<br />
** Introduce the synchronization mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'', ''fsync'' and ''apply''.<br />
*** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
*** ''recv'', ''fsync'' and ''apply'' make transaction commit wait for XLOG records to be received, fsynced and applied on the standby, respectively.<br />
** Change walsender to be able to read XLOG from not only the disk but also shared memory.<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG records are replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
* '''Monitoring'''<br />
** Provide the capability to check the progress and gap of streaming replication via one query. A collaboration of HS and SR is necessary to provide that capability on the standby side.<br />
** Provide the capability to check via a query whether the specified replication is in progress. More detailed status information might also be necessary, e.g., whether the standby is still catching up or has already gotten into sync.<br />
** Change the stats collector to collect the statistics information about replication, e.g., average delay of replication time.<br />
** Develop a tool to calculate the latest XLOG position from XLOG files. This is necessary to check the replication gap after the server fails.<br />
** Also develop a tool to extract user-readable contents from XLOG files. This is necessary to see the contents of the gap and manually restore them.<br />
* '''Easy to Use'''<br />
** Introduce the parameters like:<br />
*** replication_halt_timeout - replication will halt if no data has been sent for this much time.<br />
*** replication_halt_segments - replication will halt if the number of WAL files in pg_xlog exceeds this threshold.<br />
*** These parameters allow us to avoid disk overflow.<br />
** Add a new feature which also transfers the base backup via a direct connection between the primary and the standby.<br />
** Add new hooks like walsender_hook and walreceiver_hook to cooperate with the add-on program for compression like pglesslog.<br />
** Provide a graceful termination of replication via a query on the primary. On the standby, a trigger file mechanism already provides that capability.<br />
** Support replication beyond timeline. The timeline history files need to be shipped from the primary to the standby.<br />
* '''Robustness'''<br />
** Support keepalive in libpq. This is useful for a client and the standby to detect a failure of the primary immediately.<br />
* '''Miscellaneous'''<br />
** Standalone walreceiver tool, which connects to the primary, continuously receives and writes XLOG records, independently from postgres server.<br />
** Cascade streaming replication. Allow walsender to send XLOG to another standby during recovery.<br />
** WAL archiving during recovery.<br />
<br />
[[Category:Replication]]</div>Joyhttps://wiki.postgresql.org/index.php?title=Binary_Replication_Tutorial&diff=25908Binary Replication Tutorial2015-09-25T07:44:11Z<p>Joy: </p>
<hr />
<div>Welcome to the new PostgreSQL 9 replication and standby databases guide. This new set of features implements possibly the longest-awaited functionality in PostgreSQL's history. As a result, a lot of people are going to be trying to deploy standby databases for the first time, and find the process rather unintuitive. This guide is here to help.<br />
<br />
'''Work in progress: only 40% complete'''<br />
<br />
NB: there is some duplication with the page on [[Streaming Replication]]<br />
<br />
= 5 Minutes to Simple Replication =<br />
<br />
This is the easiest way to set up replication between a master and standby. It requires shutting down the master; other methods are detailed later in this guide.<br />
<br />
What we're going to do is shut down the master and copy the files we need over to the slave server, creating a cloned copy of the master. Because the master is shut down, we don't have to worry about changes being made to it.<br />
<br />
Note: Neither the '5 minutes' instructions nor the '10 minutes' version that follows deals with the complications that arise with a database that uses tablespaces, specifically what to do about the pg_tblspc directory and its contents.<br />
<br />
== Prerequisites ==<br />
<br />
You must have the right setup to make this work:<br />
<br />
* 2 servers with similar operating systems (e.g., both Linux 64-bit).<br />
* The same release of PostgreSQL 9.0 installed on both servers.<br />
* PostgreSQL superuser shell access on both servers.<br />
* Knowledge of how to start, stop and reload Postgres.<br />
* PostgreSQL 9.0 running on Server1.<br />
* A database created and loaded on Server1.<br />
* A postgres or root user with network access between the two servers.<br />
<br />
See the full documentation for more information:<br />
<br />
* [http://www.postgresql.org/docs/9.0/static/warm-standby.html 9.0 Replication Documentation]<br />
* [http://www.postgresql.org/docs/9.1/static/warm-standby.html 9.1 Replication Documentation]<br />
<br />
== Binary Replication in 7 Steps ==<br />
<br />
This 7-step guide, and all of the examples in this tutorial, assume that you have a master server at 192.168.0.1 and a standby server at 192.168.0.2 and that your database and its configuration files are installed at /var/lib/postgresql/data. Replace those with your actual server addresses and directories.<br />
<br />
1. Edit postgresql.conf on the master to turn on streaming replication. Change these settings:<br />
<br />
listen_addresses = '*'<br />
wal_level = hot_standby<br />
max_wal_senders = 3<br />
<br />
2. Edit pg_hba.conf on the master in order to let the standby connect. <br />
<br />
host replication all 192.168.0.2/32 trust<br />
<br />
3. Edit postgresql.conf on the standby to set up hot standby. Change this line:<br />
<br />
hot_standby = on<br />
<br />
4. Create or edit recovery.conf on the standby to set up replication and standby mode. Save a file in the standby's '''data directory''', called recovery.conf, with the following lines:<br />
<br />
standby_mode = 'on'<br />
primary_conninfo = 'host=192.168.0.1'<br />
<br />
5. Shut down both the master and standby, and copy the files. You want to copy most but not all files between the two servers, excluding the configuration files and the pg_xlog directory. An example rsync script would be:<br />
<br />
rsync -av --exclude pg_xlog --exclude postgresql.conf data/* 192.168.0.2:/var/lib/postgresql/data/<br />
<br />
6. Start the standby first, so that the two servers can't get out of sync. (Messages will be logged about not being able to connect to the primary server; that's OK.)<br />
<br />
7. Start the master.<br />
<br />
== Starting Replication with only a Quick Master Restart ==<br />
<br />
Is taking the master down long enough to copy the files too disruptive? Then you need the 10-minute version.<br />
<br />
What we're going to do this time is similar to what we did before: cloning the database by copying the files from the master to the slave server. However, because the database will only be shut down briefly (long enough to activate the changes in the configuration file), after copying the data files we will need to copy additional files so that the slave ends up an up-to-date copy of the master.<br />
<br />
So, we will tell the master we're running a backup, copy the data files (not quite the same set of files as before), tell the master the backup is complete, then copy the WAL files in the pg_xlog directory so that when the slave comes up it can make all the changes that were committed to the master database after the backup was started.<br />
<br />
First, start with the same prerequisites as above.<br />
<br />
1. Set the postgresql.conf variables the same in step (1) as above.<br />
<br />
2. Don't close the file yet. You'll need to set two other variables which control the size of your write-ahead log (WAL): wal_keep_segments and checkpoint_segments. Unless you've already done so, you're going to need to increase these, which is usually a good idea for performance anyway. You want the WAL to be big enough not to get used up in 15 or 20 minutes. If you don't have a clear idea of that, here are some reasonable values based on how busy and how large your database is. A database with large blob objects may require a much larger setting. Remember, these logs will take up disk space, so make sure that you have enough available; space requirements are below.<br />
<br />
checkpoint_segments = 8 <br />
wal_keep_segments = 8 <br />
# light load 500MB<br />
<br />
checkpoint_segments = 16<br />
wal_keep_segments = 32<br />
# moderately busy 1.5GB <br />
<br />
checkpoint_segments = 64<br />
wal_keep_segments = 128<br />
# busy server 5GB<br />
<br />
You don't ''have'' to increase checkpoint_segments in order to increase wal_keep_segments, but it's generally a good idea. Now save the file. <br />
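To sanity-check the disk figures quoted above, a rough estimate can be computed (a sketch only; the formula approximates the server's WAL recycling behaviour rather than reproducing it exactly, and 16 MB is the default segment size):<br />

```python
SEGMENT_MB = 16  # default WAL segment size in MB

def rough_pg_xlog_mb(checkpoint_segments: int, wal_keep_segments: int) -> int:
    """Loose upper bound on pg_xlog size: segments kept between
    checkpoints (about 2x checkpoint_segments + 1) plus segments
    pinned for standbys by wal_keep_segments."""
    return (2 * checkpoint_segments + 1 + wal_keep_segments) * SEGMENT_MB

# The sample settings land in the same ballpark as the quoted sizes:
# (8, 8) -> 400 MB, (16, 32) -> 1040 MB, (64, 128) -> 4112 MB
```
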
<br />
3. Edit pg_hba.conf as in step (2) of "Binary Replication in 7 Steps" above.<br />
<br />
4. Now you need to restart the master. Given the interruption in service, you should probably plan this ahead. <br />
<br />
5. Edit postgresql.conf and recovery.conf on the standby as in (3) above.<br />
<br />
6. Now we need to copy the files from the master and start the standby. Unlike in the 7-step version, this needs to be done quickly or the standby will fail to sync and you'll need to try again. First, tell the master you're starting a backup (see below for a more detailed explanation of this). Log in to psql as the database superuser.<br />
<br />
psql -U postgres<br />
# select pg_start_backup('clone',true);<br />
<br />
Note that the string you use as a backup label doesn't matter; use any string you want.<br />
<br />
7. Now, quickly copy all the database files. This rsync is slightly different from the 6-step version:<br />
<br />
rsync -av --exclude pg_xlog --exclude postgresql.conf --exclude postmaster.pid \ <br />
data/* 192.168.0.2:/var/lib/postgresql/data/<br />
<br />
8. As soon as that's done you need to stop the backup on the master:<br />
<br />
# select pg_stop_backup();<br />
<br />
9. As soon as that completes, you need to quickly copy the WAL files from the master to the standby.<br />
<br />
rsync -av data/pg_xlog 192.168.0.2:/var/lib/postgresql/data/<br />
<br />
10. Now, start the standby.<br />
<br />
If you've done this quickly enough, then the standby should catch up with the master and you should be replicating. If not, you'll get this message:<br />
<br />
(Future Revisions note: Message needs to go here)<br />
<br />
... which means you need to try again, possibly with checkpoint_segments and wal_keep_segments higher. If that still doesn't work, you're going to need to use the even more complex archiving method described below.<br />
<br />
Now, the rest of the guide will explain how to deal with more complex situations, such as archive logs, handling security, and maintaining availability, failover and standby promotion.<br />
<br />
= Introduction to Binary Replication =<br />
<br />
Binary replication is also called "Hot Standby" and "Streaming Replication", which are two separate, but complementary, features of PostgreSQL 9.0 and later. Here's some general information about how they work and what they are for.<br />
<br />
== What Can You Do With Binary Replication? ==<br />
<br />
* Have a simple and complete replica of your production database, preventing all but a couple seconds of data loss even under catastrophic circumstances.<br />
* Load-balance between your read/write master server and multiple read-only slave servers. (Note: queries that are not read-only cannot be run on a slave server. A common misconception concerns finding the 'current' value of a sequence on a slave server; that is not possible.)<br />
* Run reporting or other long-running queries on a replica server, taking them off your main transaction-processing server.<br />
* Replicate all DDL, including table and index changes, and even creating new databases.<br />
* Replicate a hosted multi-tenant database, making no specific requirements for primary keys or database changes of your users.<br />
<br />
== What Can't You Do With Binary Replication? ==<br />
<br />
* Replicate a specific table, schema, or database. Binary replication is the entire Postgres instance (or "cluster").<br />
* Multi-master replication. Multi-master binary replication is probably technically impossible.<br />
* Replicate between different versions of PostgreSQL, or between different platforms.<br />
* Set up replication without administration rights on the server. Sorry, working on it.<br />
* Replicate data synchronously, guaranteeing zero data loss. Actually, this ''is'' possible since the release of PostgreSQL 9.1!<br />
<br />
For the reasons above, we expect that Slony-I, Londiste, Bucardo, pgPool2 and other systems will continue to be used.<br />
<br />
== Transaction Logs and Log Shipping ==<br />
<br />
Users who are already familiar with the PostgreSQL transaction log and warm standby can skip this section.<br />
<br />
An individual "instance", "server", or (confusingly) "cluster" of PostgreSQL (hereafter Server) consists of a single postmaster server process connected to a single initialized PostgreSQL data directory (PGDATA), which in turn contains several databases. Each running Server has a transaction log, located in the PGDATA/pg_xlog directory. This transaction log consists of binary snapshots of data, written to record synchronously each change to all databases' data, in case of unexpected shutdown of the database server (such as in a power failure). This ensures that data is not corrupted and no completed transaction is lost.<br />
<br />
You can also use this log to allow a copy of the original database to replicate changes made to a master database. This was first implemented with the PITR feature in PostgreSQL 8.0, and is known as "log shipping". Log shipping is required for most forms of binary replication.<br />
<br />
This log consists of 16MB segments full of new data pages (8K pages) of the database, and not of SQL statements. For this reason no before-and-after auditing is possible via this log, as you cannot know exactly what has changed. Also, the log is treated as a buffer, being deleted as it is no longer needed for crash recovery. More importantly, the data page format of the log means that log segments can only be applied to a database which is binary-identical to the database which created the log. <br />
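To make those sizes concrete, here is the arithmetic with the default values (illustrative only; real WAL records are variable-length, so this is just an upper bound on full pages per segment):<br />

```python
SEGMENT_BYTES = 16 * 1024 * 1024   # default WAL segment size (16MB)
PAGE_BYTES = 8 * 1024              # default data page size (8K)

# Upper bound on how many full data pages fit in one WAL segment.
pages_per_segment = SEGMENT_BYTES // PAGE_BYTES
print(pages_per_segment)  # -> 2048
```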
<br />
== PITR, Warm Standby, Hot Standby, and Streaming Replication ==<br />
<br />
For the rest of this tutorial, we will refer to the active read-write instance of the Server which generates transaction logs as the "Master" and the passive, read-only or offline instance (or instances) of the Server which receives transaction logs as the "Standby" (or "Standbys"). The term Master/Standby is equivalent to other terminology which may be used in the database industry, such as Master/Slave, Primary/Secondary or Primary/Replica.<br />
<br />
=== PITR ===<br />
<br />
In Point-In-Time Recovery (PITR), transaction logs are copied and saved to storage until needed. Then, when needed, the Standby server can be "brought up" (made active) and transaction logs applied, either stopping when they run out or at a prior point indicated by the administrator. PITR has been available since PostgreSQL version 8.0, and as such will not be documented here.<br />
<br />
PITR is primarily used for database forensics and recovery. It is also useful when you need to back up a very large database, as it effectively supports incremental backups, which pg_dump does not.<br />
<br />
=== Warm Standby ===<br />
<br />
In Warm Standby, transaction logs are copied from the Master and applied to the Standby immediately after they are received, or at a short delay. The Standby is offline (in "recovery mode") and not available for any query workload. This allows the Standby to be brought up to full operation very quickly. Warm Standby has been available since version 8.3, and will not be fully documented here.<br />
<br />
Warm Standby requires Log Shipping. It is primarily used for database failover.<br />
<br />
=== Hot Standby ===<br />
<br />
Hot Standby is identical to Warm Standby, except that the Standby is available to run read-only queries. This offers all of the advantages of Warm Standby, plus the ability to distribute some business workload to the Standby server(s). Hot Standby by itself requires Log Shipping.<br />
<br />
Hot Standby is used both for database failover, and can also be used for load-balancing. In contrast to Streaming Replication, it places no load on the master (except for disk space requirements) and is thus theoretically infinitely scalable. A WAL archive could be distributed to dozens or hundreds of servers via network storage. The WAL files could also easily be copied over a poor quality network connection, or by SFTP.<br />
<br />
However, since Hot Standby replicates by shipping 16MB logs, it is at best minutes behind and sometimes more than that. This can be problematic both from a failover and a load-balancing perspective.<br />
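That lag can be estimated from your WAL write rate: under file-based log shipping a segment is only shipped once all 16MB of it have been written, so at a given generation rate the standby trails by roughly the time it takes to fill one segment (the rate below is a made-up example):<br />

```python
SEGMENT_MB = 16  # default WAL segment size

def worst_case_lag_seconds(wal_mb_per_minute):
    # Time to fill one 16MB segment at the given WAL write rate;
    # until a segment fills, nothing is shipped under file-based shipping.
    return SEGMENT_MB / wal_mb_per_minute * 60

print(round(worst_case_lag_seconds(4)))   # 4 MB/min of WAL -> 240 seconds behind
```

A quiet server generating WAL even more slowly can be far further behind, which is the motivation for streaming replication below.<br />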
<br />
=== Streaming Replication ===<br />
<br />
Streaming Replication improves either Warm Standby or Hot Standby by opening a network connection between the Standby and the Master database, instead of copying 16MB log files. This allows data changes to be copied over the network almost immediately on completion on the Master. <br />
<br />
In Streaming Replication, the master and the standby have special processes called the walsender and walreceiver which transmit modified data pages over a network port. This requires one fairly busy connection per standby, imposing an incremental load on the master for each additional standby. Still, the load is quite low and a single master should be able to support multiple standbys easily.<br />
<br />
Streaming replication does not require log shipping in normal operation. It may, however, require log shipping to start replication, and can utilize log shipping in order to catch up standbys which fall behind.<br />
<br />
= How to Replicate =<br />
<br />
== Cloning a Live Database ==<br />
<br />
If your workload doesn't allow you to take the master down (and whose does?), things get a bit more complicated. You need to somehow take a "coherent snapshot" of the master, so that you don't have an inconsistent or corrupt database on the standby. Now, in some cases this can be done via filesystem snapshotting tools or similar tricks, but as that approach is tricky and platform-dependent, we're not going to cover it here.<br />
<br />
Instead, we're going to cover the built-in method, which involves keeping a log of all changes applied to the database which happen during the copying process. The steps are essentially the same, regardless of whether you're planning to use just hot standby, streaming replication, or both. There are two parts:<br />
<br />
* Cloning the database files<br />
* Copying the archive logs<br />
<br />
Unintuitive as it is, the latter needs to be set up first, so we're going to start with that.<br />
<br />
== Setting Up Archiving On The Master ==<br />
<br />
Archiving is the process of making an extra copy of each WAL file as it is completed. These log files then need to somehow be accessed by the standby. There are three basic ways to handle this, and you should decide in advance what method you're going to use:<br />
<br />
# Manually<br />
# Automatic file copying from master to standby using rsync or similar<br />
# Writing them to a common shared network file location<br />
<br />
The first method is only appropriate if you're archiving logs only to jump-start streaming replication, and you have a fairly low-traffic database or the ability to stop all writes. The third method is probably the easiest to manage if you have an appropriate network share; it can even be used to support multiple standbys with some extra thought and scripting. All of these methods will be explained below.<br />
<br />
This needs to be turned on on the master, which if it's never been done before may require a restart (sorry, working on it), and will certainly require a reload. You'll need to set the following parameters:<br />
<br />
wal_level = hot_standby<br />
archive_mode = on<br />
archive_command = 'some command'<br />
<br />
What archive command you use depends on which archiving approach you are taking, of course. Here are three examples of commands you might use. Note that you will need to create the "archive" directories.<br />
<br />
# Manual: cp -f %p /var/lib/postgresql/data/archive/%f </dev/null<br />
# Automatic Copy: rsync -a %p 192.168.0.2:/var/lib/pgsql/data/archive/%f<br />
# Network Share: cp -f %p /shares/walarchive/archive/%f </dev/null<br />
<br />
In these commands, %p is replaced by postgres at invocation time with the full path and name of the WAL file, and %f with the name of the file alone. There are more escapes and parameters dealing with WAL archiving which will be detailed later in the tutorial. Note that, in real production, you are unlikely to want to use any commands as simple as the above. In general, you will want to have archive_command call an executable script which traps errors and can be disabled. Examples of such scripts are available in this tutorial.<br />
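What the server does with %p and %f can be simulated in a few lines (this is an illustration, not PostgreSQL source code; the segment name is made up):<br />

```python
import os

def expand_archive_command(template, wal_path):
    # %p -> path of the file to archive, %f -> the file name alone,
    # %% -> a literal %; this mirrors the substitutions described above.
    return (template.replace('%%', '\x00')
                    .replace('%p', wal_path)
                    .replace('%f', os.path.basename(wal_path))
                    .replace('\x00', '%'))

cmd = expand_archive_command('cp -f %p /shares/walarchive/archive/%f',
                             'pg_xlog/000000010000000000000042')
print(cmd)
```

The printed command is what the server actually executes for each completed segment; a real archive script would also check exit status, since a nonzero exit tells the server to retry.<br />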
<br />
Now, if archive_mode was originally "off" or if you had to change wal_level, you're going to need to restart the master (sorry, this will be fixed in a later version). If you just needed to change the archive_command, however, only a reload is required.<br />
<br />
Once you've restarted or reloaded, check the master's logs to make sure archiving is working. If it's failing, the master will complain extensively. You might also check that archive log files are being created; run the command "SELECT pg_switch_xlog();" as the superuser to force a new log to be written.<br />
<br />
== Setting Up Archiving on the Standby ==<br />
<br />
The standby needs to be configured to consume logs. This is simpler than the master's setup, and doesn't really change no matter what archive copying strategy you're using.<br />
<br />
== Recovery.conf ==<br />
<br />
On the standby, replication configuration is controlled through a file called, for historical reasons, recovery.conf. If this file is present in PostgreSQL's data directory when PostgreSQL is started, that server will assume it is a standby and attempt to obey it. Generally, there is an example file installed with the other PostgreSQL shared docs. However, that example file covers all of the various replication options at once, so it's often simpler to write your own file, from scratch. Any change to recovery.conf requires a restart of the standby.<br />
<br />
In recovery.conf, you need to add a command to copy the archived WAL files to the standby's own pg_xlog directory. This is the mirror image of the archive_command on the master. Generally, a simple cp command is sufficient:<br />
<br />
restore_command = 'cp -f /var/lib/postgresql/data/archive/%f %p </dev/null'<br />
restore_command = 'cp -f /shares/walarchive/%f %p </dev/null'<br />
<br />
Again, you might want to use a simple shell script which traps error messages, and, importantly, deletes archive files which are no longer needed. If you will be doing only hot standby and not using streaming replication, you probably want to compile the pg_standby binary provided in PostgreSQL's additional modules or "contrib", and use it instead:<br />
<br />
restore_command = 'pg_standby /shares/walarchive %f %p %r'<br />
<br />
More detail on pg_standby is in its [http://www.postgresql.org/docs/9.1/static/pgstandby.html documentation].<br />
<br />
== Cloning a Snapshot of the Master ==<br />
<br />
Once you have archiving working, you're ready to clone the master database. At this point, it's a simple process:<br />
<br />
# As superuser, issue the command "SELECT pg_start_backup('backup');" on the master.<br />
# Copy all of the database files to the standby.<br />
# Start the standby database.<br />
# Issue the command "SELECT pg_stop_backup();" on the master.<br />
<br />
Of course, each of those steps deserves a little more elaboration. pg_start_backup and pg_stop_backup are special commands you issue on the master in order to create, hold open, and close, a "snapshot" which is how we make sure your copy of the database is not inconsistent. They also write special files to the archive log which tell the standby when it has a complete snapshot.<br />
<br />
If you are using the "manual" method of syncing the archive logs, immediately after step 4 you need to do one last rsync or copy of the archive logs to the standby.<br />
<br />
When you're done with the cloning, you should see output similar to the below:<br />
<br />
This means that you're up and replicating, and should now be able to run queries on the standby.<br />
<br />
== Failing Over To The Standby ==<br />
<br />
Of course, one of the major reasons to have a standby is in case something (planned or unplanned) causes the master server to shut down. Then you want to "fail over", or stop replication and change the standby to a full read-write master.<br />
<br />
The recommended method is the same regardless of the type of replication or standby: via "trigger file". First, you need to set a configuration option in recovery.conf on the standby:<br />
<br />
trigger_file = '/var/lib/postgresql/data/failover'<br />
<br />
Then, when it's time to fail over, you just create an empty file with that name, such as by using the "touch" command. The standby will notice the file, attempt to apply any remaining WAL records or files it has received, and then switch to read-write or "master" mode. When this happens, you will see a message like this in the Postgres log:<br />
<br />
PostgreSQL will also rename the recovery.conf file to recovery.done in order to prevent having the new master fail on restart. For this reason, the recovery.conf file should be owned by the same user which the server runs as (usually "postgres").<br />
<br />
The alternative to using a trigger file is to failover manually, by deleting or renaming the recovery.conf file and restarting the standby. This method is inferior because it requires a restart which would interrupt any read-only connections to the standby currently in use.<br />
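The trigger-file mechanism above can be pictured as a simple existence check (a sketch, not PostgreSQL source; the path is hypothetical and the demo uses a scratch directory):<br />

```python
import os, tempfile

def failover_requested(trigger_path):
    # The standby's check amounts to polling for the file's existence;
    # creating an empty file (the `touch` described above) is the
    # entire failover signal.
    return os.path.exists(trigger_path)

# hypothetical trigger location, using a fresh scratch directory for the demo
trigger = os.path.join(tempfile.mkdtemp(), 'failover')
print(failover_requested(trigger))   # False: still replicating
open(trigger, 'w').close()           # what `touch .../failover` does
print(failover_requested(trigger))   # True: the standby would now promote
```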
<br />
In a high-availability system, the above activity should be managed automatically in order to avoid downtime. PostgreSQL itself supplies no tools to do this, but numerous third-party utilities such as "Linux heartbeat" are compatible with PostgreSQL replication.<br />
<br />
It's important to prevent the original master from restarting after failover, lest you end up with a "split brain" problem and data loss. There is a substantial body of literature on this, and third-party tools, so we won't discuss them here at this time.<br />
<br />
== Load Balancing ==<br />
<br />
== Managing Archive Logs ==<br />
<br />
== Tuning and Configuration of Binary Replication ==<br />
<br />
== Monitoring Replication ==<br />
<br />
<br />
[[Category:Replication]]<br />
[[Category:Howto]]</div>Joyhttps://wiki.postgresql.org/index.php?title=User:Joy&diff=25831User:Joy2015-09-15T13:00:19Z<p>Joy: Created page with "'''Joy''' is Josip Rodin."</p>
<hr />
<div>'''Joy''' is Josip Rodin.</div>Joyhttps://wiki.postgresql.org/index.php?title=Streaming_Replication&diff=25830Streaming Replication2015-09-15T10:50:51Z<p>Joy: /* How to Use */ clarify the host where this is being run</p>
<hr />
<div>'''Streaming Replication''' (SR) provides the capability to continuously ship and<br />
apply the [http://www.postgresql.org/docs/current/static/wal.html WAL XLOG] records to some number of standby servers in order to keep them current.<br />
<br />
This feature was added to PostgreSQL 9.0. The discussion below is a developer oriented one that contains some out of date information. Users of this feature should use the documentation for the feature or a setup tutorial instead:<br />
<br />
* [http://www.postgresql.org/docs/current/static/warm-standby.html PostgreSQL Streaming Replication Documentation] - this documentation is for the latest version, but provides links you can use to look up the docs for your version.<br />
<br />
* [[Binary Replication Tutorial]] provides an introduction to using this replication feature.<br />
<br />
* The related but independent [[Hot Standby]] feature.<br />
<br />
<br />
= Developer and historical details on the project =<br />
SR was developed for inclusion in PostgreSQL 9.0 by NTT OSS Center. The lead developer is [mailto:masao.fujii@gmail.com Masao Fujii]. [http://www.pgcon.org/2008/schedule/events/76.en.html Synchronous Log Shipping Replication Presentation] introduces the early design of the feature.<br />
<br />
= Usage =<br />
== Users Overview ==<br />
* '''Log-shipping'''<br />
** XLOG records generated in the primary are periodically shipped to the standby via the network.<br />
** In the existing warm standby, only records in a filled file are shipped, what's referred to as file-based log-shipping. In SR, XLOG records in partially-filled XLOG file are shipped too, implementing record-based log-shipping. This means the window for data loss in SR is usually smaller than in warm standby, unless the warm standby was also configured for record-based shipping (which is complicated to setup).<br />
** The content of XLOG files written to the standby are exactly the same as those on the primary. XLOG files shipped can be used for a normal recovery and PITR.<br />
* '''Multiple standbys'''<br />
** More than one standby can establish a connection to the primary for SR. XLOG records are concurrently shipped to all these standbys. The delay/death of a standby does not harm log-shipping to other standbys.<br />
** The maximum number of standbys can be specified as a GUC variable.<br />
* '''Continuous recovery'''<br />
** The standby continuously replays XLOG records shipped without using pg_standby.<br />
** XLOG records shipped are replayed as soon as possible without waiting until XLOG file has been filled. The combination of [[Hot Standby]] and SR would make the latest data inserted into the primary visible in the standby almost immediately.<br />
** The standby periodically removes old XLOG files which are no longer needed for recovery, to prevent excessive disk usage.<br />
* '''Setup'''<br />
** The start of log-shipping does not interfere with any query processing on the primary.<br />
** The standby can be started in various conditions.<br />
*** If there are XLOG files in archive directory and restore_command is supplied, at first those files are replayed. Then the standby requests XLOG records following the last applied one to the primary. This prevents XLOG files already present in the standby from being shipped again. Similarly, XLOG files in pg_xlog are also replayed before starting log-shipping.<br />
*** If there are no XLOG files on the standby, the standby requests XLOG records following the starting XLOG location of recovery (the redo starting location).<br />
* '''Connection settings and authentication'''<br />
** A user can configure the same settings as a normal connection to a connection for SR (e.g., keepalive, pg_hba.conf).<br />
* '''Activation'''<br />
** The standby can keep waiting for activation as long as a user likes. This prevents the standby from being automatically brought up by failure of recovery or network outage.<br />
* '''Progress report'''<br />
** The primary and standby report the progress of log-shipping in PS display.<br />
* '''Graceful shutdown'''<br />
** When smart/fast shutdown is requested, the primary waits to exit until XLOG records have been sent to the standby, up to the shutdown checkpoint record.<br />
<br />
== Restrictions ==<br />
* '''Synchronous log-shipping'''<br />
** By default, SR operates in an asynchronous manner, so the commit command might return a "success" to a client before the corresponding XLOG records are shipped to the standby. To enable synchronous replication, see [http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION Synchronous Replication]<br />
* '''Replication beyond timeline'''<br />
** A user has to get a fresh backup whenever making the old standby catch up.<br />
* '''Clustering'''<br />
** Postgres doesn't provide any clustering feature.<br />
<br />
== How to Use ==<br />
* '''1.''' Install postgres in the primary and standby server as usual. This requires only ''configure'', ''make'' and ''make install''.<br />
* '''2.''' Create the initial database cluster in the primary server as usual, using ''initdb''.<br />
* '''3.''' Create a user named ''replication'' with REPLICATION privileges<br />
 $ CREATE ROLE replication WITH REPLICATION PASSWORD 'password' LOGIN;<br />
* '''4.''' Set up connections and authentication on the primary so that the standby server can successfully connect to the ''replication'' pseudo-database on the primary.<br />
<br />
$ $EDITOR postgresql.conf<br />
<br />
listen_addresses = '192.168.0.10'<br />
<br />
$ $EDITOR pg_hba.conf<br />
<br />
# The standby server must connect with a user that has replication privileges.<br />
# TYPE DATABASE USER ADDRESS METHOD<br />
host replication replication 192.168.0.20/22 md5<br />
<br />
* '''5.''' Set up the streaming replication related parameters on the primary server.<br />
$ $EDITOR postgresql.conf<br />
<br />
# To enable read-only queries on a standby server, wal_level must be set to<br />
# "hot_standby". But you can choose "archive" if you never connect to the<br />
# server in standby mode.<br />
wal_level = hot_standby<br />
<br />
# Set the maximum number of concurrent connections from the standby servers.<br />
max_wal_senders = 5<br />
<br />
# To prevent the primary server from removing the WAL segments required for<br />
# the standby server before shipping them, set the minimum number of segments<br />
# retained in the pg_xlog directory. At least wal_keep_segments should be<br />
# larger than the number of segments generated between the beginning of<br />
# online-backup and the startup of streaming replication. If you enable WAL<br />
# archiving to an archive directory accessible from the standby, this may<br />
# not be necessary.<br />
wal_keep_segments = 32<br />
<br />
# Enable WAL archiving on the primary to an archive directory accessible from<br />
# the standby. If wal_keep_segments is a high enough number to retain the WAL<br />
# segments required for the standby server, this is not necessary.<br />
archive_mode = on<br />
archive_command = 'cp %p /path_to/archive/%f'<br />
* '''6.''' Start postgres on the primary server.<br />
* '''7.''' Make a base backup by copying the primary server's data directory to the standby server.<br />
<br />
** '''7.1.''' Do it with pg_(start|stop)_backup and rsync on the primary<br />
<br />
$ psql -c "SELECT pg_start_backup('label', true)"<br />
$ rsync -ac ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid<br />
$ psql -c "SELECT pg_stop_backup()"<br />
<br />
** '''7.2.''' Do it with pg_basebackup on the standby<br />
<br />
In version 9.1+, pg_basebackup can do the dirty work of fetching the entire data directory of your PostgreSQL installation from the primary and placing it onto the standby server.<br />
<br />
The prerequisite is that you make sure the standby's data directory is empty.<br />
<br />
Make sure to remove any tablespace directories as well. You can find those directories with:<br />
<br />
$ psql -c '\db'<br />
<br />
If you keep your postgresql.conf and other config files in PGDATA, you need a backup of postgresql.conf, to restore after pg_basebackup.<br />
<br />
After you've cleared all the directories, you can use the following command to directly stream the data from the primary onto your standby server.<br />
Run it as the database superuser, typically 'postgres', to make sure the permissions are preserved (use su, sudo or whatever other tool to make sure you're not root).<br />
<br />
$ pg_basebackup -h 192.168.0.10 -D /srv/pgsql/standby -P -U replication --xlog-method=stream<br />
<br />
In version 9.3+, you can also add the -R option so it creates a minimal recovery command file for step 9 below.<br />
<br />
If you backed up postgresql.conf, now restore it.<br />
<br />
* '''8.''' Set up replication-related parameters, connections and authentication in the standby server like the primary, so that the standby might work as a primary after failover.<br />
* '''9.''' Enable read-only queries on the standby server. But if wal_level is ''archive'' on the primary, leave hot_standby unchanged (i.e., off).<br />
$ $EDITOR postgresql.conf<br />
<br />
hot_standby = on<br />
* '''10.''' Create a recovery command file in the standby server; the following parameters are required for streaming replication.<br />
$ $EDITOR recovery.conf<br />
# Note that recovery.conf must be in $PGDATA directory.<br />
# It should NOT be located in the same directory as postgresql.conf<br />
<br />
# Specifies whether to start the server as a standby. In streaming replication,<br />
# this parameter must be set to on.<br />
standby_mode = 'on'<br />
<br />
# Specifies a connection string which is used for the standby server to connect<br />
# with the primary.<br />
primary_conninfo = 'host=192.168.0.10 port=5432 user=replication password=password'<br />
<br />
# Specifies a trigger file whose presence should cause streaming replication to<br />
# end (i.e., failover).<br />
trigger_file = '/path_to/trigger'<br />
<br />
# Specifies a command to load archive segments from the WAL archive. If<br />
# wal_keep_segments is a high enough number to retain the WAL segments<br />
# required for the standby server, this may not be necessary. But<br />
# a large workload can cause segments to be recycled before the standby<br />
# is fully synchronized, requiring you to start again from a new base backup.<br />
restore_command = 'cp /path_to/archive/%f "%p"'<br />
* '''11.''' Start postgres in the standby server. It will start streaming replication.<br />
* '''12.''' You can calculate the replication lag by comparing the current WAL write location on the primary with the last WAL location received/replayed by the standby. They can be retrieved using ''pg_current_xlog_location'' on the primary and the ''pg_last_xlog_receive_location''/''pg_last_xlog_replay_location'' on the standby, respectively.<br />
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)<br />
pg_current_xlog_location <br />
--------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_receive_location <br />
-------------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_replay_location <br />
------------------------------<br />
0/2000000<br />
(1 row)<br />
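The lag calculation in step 12 can be sketched as follows: an xlog location like 0/2000000 is a 64-bit byte position written as two hexadecimal halves, so the lag is a simple subtraction (the sample locations below are made up):<br />

```python
def lsn_to_bytes(lsn):
    # 'X/Y' -> (X << 32) + Y, where both halves are hexadecimal
    hi, lo = lsn.split('/')
    return (int(hi, 16) << 32) + int(lo, 16)

# hypothetical values, as returned by pg_current_xlog_location() on the
# primary and pg_last_xlog_replay_location() on the standby
primary = lsn_to_bytes('0/3000120')
standby = lsn_to_bytes('0/2000000')
print(primary - standby, 'bytes of replication lag')
```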
* '''13.''' You can also check the progress of streaming replication by using ''ps'' command.<br />
# The displayed LSNs indicate the byte position that the standby server has<br />
# written up to in the xlogs.<br />
[primary] $ ps -ef | grep sender<br />
postgres 6879 6831 0 10:31 ? 00:00:00 postgres: wal sender process postgres 127.0.0.1(44663) streaming 0/2000000<br />
<br />
[standby] $ ps -ef | grep receiver<br />
postgres 6878 6872 1 10:31 ? 00:00:01 postgres: wal receiver process streaming 0/2000000<br />
* How to do failover<br />
** Create the trigger file in the standby after the primary fails.<br />
* How to stop the primary or the standby server<br />
** Shut down it as usual (''pg_ctl stop'').<br />
* How to restart streaming replication after failover<br />
** Repeat the operations from step '''6''' onward: take a fresh backup, adjust the configuration, and start the original primary as the standby. The new primary server doesn't need to be stopped during these operations.<br />
* How to restart streaming replication after the standby fails<br />
** Restart postgres in the standby server after eliminating the cause of failure.<br />
* How to disconnect the standby from the primary<br />
** Create the trigger file in the standby while the primary is running. Then the standby would be brought up.<br />
* How to re-synchronize the stand-alone standby after isolation<br />
** Shut down the standby as usual, and repeat the operations from step '''6'''.<br />
* If you have more than one slave, promoting one will break the other(s). Update their recovery.conf settings to point to the new master, set recovery_target_timeline to 'latest', scp/rsync the pg_xlog directory, and restart the slave.<br />
<br />
= Todo =<br />
== v9.0 ==<br />
<br />
Moved to [[PostgreSQL_9.0_Open_Items]]<br />
<br />
=== Committed ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01455.php Retrying from archive and some refactoring around Read/FetchRecord().] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00395.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg02601.php SR wrongly treats the WAL-boundary.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00396.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01715.php Adjust SR for some later changes about wal-skipping.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00399.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00024.php VACUUM FULL unexpectedly writes an XLOG UNLOGGED record.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00038.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01754.php Add a message type header.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00037.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php Documentation: Add a new "Replication" chapter.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00115.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00350.php Failed assertion during recovery of partial WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00124.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00712.php A PANIC error might occur in the standby because of a partially-filled archived WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00137.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00330.php Improve the standby messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00140.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php pq_getbyte_if_available() is not working because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00198.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01488.php Walsender might emit unfit messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00239.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01236.php Streaming replication on win32, still broken.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00270.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00992.php Create new section for recovery.conf.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00295.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01824.php Assertion failure in walreceiver.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00356.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01717.php Forbid a startup of walsender during recovery, and emit a suitable message? Or allow walsender to be started also during recovery?] - [http://archives.postgresql.org/message-id/20100316090955.9A5107541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php How do we clean down the archive without using pg_standby?] - [http://archives.postgresql.org/message-id/20100318091718.BC14D7541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01510.php File-based log shipping without pg_standby doesn't replay the WAL files in pg_xlog.] - [http://archives.postgresql.org/pgsql-committers/2010-03/msg00356.php commit]<br />
<br />
== v9.1 ==<br />
=== Synchronization capability ===<br />
* Introduce the replication mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'' and ''fsync''.<br />
** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
** ''recv'' or ''fsync'' makes transaction commit wait for XLOG to be received or fsynced by the standby, respectively.<br />
** (''apply'' makes transaction commit wait for XLOG to be replayed by the standby. This mode will be supported in v9.2 or later)<br />
** The replication mode is specified in recovery.conf on the standby, alongside the other replication parameters.<br />
*** The startup process reads the replication mode from recovery.conf and shares it with walreceiver via a new shared-memory variable.<br />
*** Walreceiver in turn shares it with walsender using the replication handshake message (the existing protocol needs to be extended).<br />
** Based on the replication mode, walreceiver sends the primary a reply indicating that replication is complete up to the specified location.<br />
*** In async, walreceiver doesn't need to send any reply other than the end-of-replication message.<br />
*** In recv or fsync, walreceiver sends the reply just after receiving or flushing XLOG, respectively.<br />
*** A new message type for the reply needs to be defined. The reply is sent as a CopyData message.<br />
** Walreceiver writes all the outstanding XLOG to disk before shutting down.<br />
** Walsender receives the reply from the standby, updates the location of the last record replicated, and announces completion of replication.<br />
*** New shared-memory variable to keep that location is required.<br />
** When processing the commit command, the backend waits for XLOG to be replicated only to the standbys that are in recv or fsync replication mode.<br />
*** A smart shutdown likewise waits for the XLOG of the shutdown checkpoint to be replicated.<br />
* Required optimization<br />
** Walsender should send outstanding XLOG without waiting for wal_sender_delay.<br />
*** When processing the commit command, the backend signals walsender to send outstanding XLOG immediately.<br />
** The backend should exit the wait loop as soon as the reply arrives at the primary.<br />
*** When the reply arrives, walsender signals the sleeping backends to wake up and determine whether to exit the wait loop by checking the location of the last XLOG replicated.<br />
*** Only backends waiting for XLOG to be replicated up to the location contained in the reply are sent the signal.<br />
** Walsender waits for the signal from backends and the reply from the standby at the same time, by using select/poll.<br />
** Walsender reads XLOG from not only disk but also shared memory (wal buffers).<br />
** Walreceiver should flush the XLOG file only when the file is switched or a related data page is flushed.<br />
*** When the startup process or bgwriter flushes a buffer page, it checks via shared memory (the location of the last XLOG flushed) whether the related XLOG has already been flushed.<br />
*** If the XLOG has already been flushed, it flushes the buffer page.<br />
*** If not, it signals walreceiver to flush the XLOG file immediately and waits for the flush to complete.<br />
** While the standby is catching up with the primary, both servers should ignore the replication mode and perform asynchronous replication.<br />
*** Once the servers are nearly synchronized, they switch to replication based on the specified replication mode.<br />
*** New replication states like 'catching-up', 'sync', etc need to be defined, and the state machine for them is required on both servers.<br />
*** Current replication state can be monitored on both servers via SQL.<br />
* Required timeout<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG is replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
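The commit-wait rule sketched above can be illustrated with a toy model (hypothetical code, not the actual patch): a backend keeps waiting only for standbys in ''recv'' or ''fsync'' mode, and only until the standby's acknowledged location reaches the commit record.<br />

```python
def commit_must_wait(mode: str, commit_lsn: int, acked_lsn: int) -> bool:
    """Toy model of the proposed wait rule for one standby.

    mode       -- replication mode from the standby's recovery.conf
    commit_lsn -- location of the backend's commit record
    acked_lsn  -- location the standby has acknowledged (received or fsynced,
                  depending on mode) in its reply to walsender
    """
    if mode == "async":
        return False  # async replication never blocks commit
    if mode in ("recv", "fsync"):
        return acked_lsn < commit_lsn  # wait until the reply covers the commit
    raise ValueError("unknown replication mode: %s" % mode)

# A backend's COMMIT returns once no recv/fsync standby still requires a wait.
standbys = [("async", 0), ("recv", 0x3000000)]
print(any(commit_must_wait(m, 0x2000000, a) for m, a in standbys))  # False
```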
<br />
== Future release ==<br />
* '''Synchronization capability'''<br />
** Introduce the synchronization mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'', ''fsync'' and ''apply''.<br />
*** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
*** ''recv'', ''fsync'' and ''apply'' make transaction commit wait for XLOG records to be received, fsynced and applied on the standby, respectively.<br />
** Change walsender to be able to read XLOG from not only the disk but also shared memory.<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG records are replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
* '''Monitoring'''<br />
** Provide the capability to check the progress and gap of streaming replication via one query. A collaboration of HS and SR is necessary to provide that capability on the standby side.<br />
** Provide the capability to check via a query whether the specified replication is in progress. More detailed status information might also be necessary, e.g., whether the standby is still catching up or has already gotten into sync.<br />
** Change the stats collector to gather statistics about replication, e.g., the average replication delay.<br />
** Develop a tool to calculate the latest XLOG position from XLOG files. This is necessary to check the replication gap after the server fails.<br />
** Also develop a tool to extract the user-readable contents from XLOG files. This is necessary to see the contents of the gap and manually restore them.<br />
* '''Easy to Use'''<br />
** Introduce the parameters like:<br />
*** replication_halt_timeout - replication will halt if no data has been sent for this much time.<br />
*** replication_halt_segments - replication will halt if the number of WAL files in pg_xlog exceeds this threshold.<br />
*** These parameters allow us to prevent the disk from filling up.<br />
** Add a new feature that also transfers the base backup over a direct connection between the primary and the standby.<br />
** Add new hooks such as walsender_hook and walreceiver_hook to cooperate with add-on compression programs like pglesslog.<br />
** Provide a graceful termination of replication via a query on the primary. On the standby, a trigger file mechanism already provides that capability.<br />
** Support replication beyond timeline. The timeline history files need to be shipped from the primary to the standby.<br />
* '''Robustness'''<br />
** Support keepalive in libpq. This is useful for a client and the standby to detect a failure of the primary immediately.<br />
* '''Miscellaneous'''<br />
** A standalone walreceiver tool that connects to the primary and continuously receives and writes XLOG records, independently of the postgres server.<br />
** Cascade streaming replication. Allow walsender to send XLOG to another standby during recovery.<br />
** WAL archiving during recovery.<br />
<br />
[[Category:Replication]]</div>Joyhttps://wiki.postgresql.org/index.php?title=Streaming_Replication&diff=25829Streaming Replication2015-09-15T10:48:48Z<p>Joy: /* How to Use */ improve mediawiki syntax (was thinking of moinmoin for some reason)</p>
<hr />
<div>'''Streaming Replication''' (SR) provides the capability to continuously ship and<br />
apply the [http://www.postgresql.org/docs/current/static/wal.html WAL XLOG] records to some number of standby servers in order to keep them current.<br />
<br />
This feature was added to PostgreSQL 9.0. The discussion below is a developer oriented one that contains some out of date information. Users of this feature should use the documentation for the feature or a setup tutorial instead:<br />
<br />
* [http://www.postgresql.org/docs/current/static/warm-standby.html PostgreSQL Streaming Replication Documentation] - this documentation is for the latest version, but provides links you can use to look up the docs for your version.<br />
<br />
* [[Binary Replication Tutorial]] provides an introduction to using this replication feature.<br />
<br />
* The related but independent [[Hot Standby]] feature.<br />
<br />
<br />
= Developer and historical details on the project =<br />
SR was developed for inclusion in PostgreSQL 9.0 by NTT OSS Center. The lead developer is [mailto:masao.fujii@gmail.com Masao Fujii]. [http://www.pgcon.org/2008/schedule/events/76.en.html Synchronous Log Shipping Replication Presentation] introduces the early design of the feature.<br />
<br />
= Usage =<br />
== Users Overview ==<br />
* '''Log-shipping'''<br />
** XLOG records generated in the primary are periodically shipped to the standby via the network.<br />
** In the existing warm standby, only records in a filled file are shipped, what's referred to as file-based log-shipping. In SR, XLOG records in partially-filled XLOG file are shipped too, implementing record-based log-shipping. This means the window for data loss in SR is usually smaller than in warm standby, unless the warm standby was also configured for record-based shipping (which is complicated to setup).<br />
** The content of XLOG files written to the standby are exactly the same as those on the primary. XLOG files shipped can be used for a normal recovery and PITR.<br />
* '''Multiple standbys'''<br />
** More than one standby can establish a connection to the primary for SR. XLOG records are concurrently shipped to all these standbys. The delay/death of a standby does not harm log-shipping to other standbys.<br />
** The maximum number of standbys can be specified as a GUC variable.<br />
* '''Continuous recovery'''<br />
** The standby continuously replays XLOG records shipped without using pg_standby.<br />
** XLOG records shipped are replayed as soon as possible without waiting until XLOG file has been filled. The combination of [[Hot Standby]] and SR would make the latest data inserted into the primary visible in the standby almost immediately.<br />
** The standby periodically removes old XLOG files which are no longer needed for recovery, to prevent excessive disk usage.<br />
* '''Setup'''<br />
** The start of log-shipping does not interfere with any query processing on the primary.<br />
** The standby can be started in various conditions.<br />
*** If there are XLOG files in archive directory and restore_command is supplied, at first those files are replayed. Then the standby requests XLOG records following the last applied one to the primary. This prevents XLOG files already present in the standby from being shipped again. Similarly, XLOG files in pg_xlog are also replayed before starting log-shipping.<br />
*** If there is no XLOG files on the standby, the standby requests XLOG records following the starting XLOG location of recovery (the redo starting location).<br />
* '''Connection settings and authentication'''<br />
** A user can configure the same settings as a normal connection to a connection for SR (e.g., keepalive, pg_hba.conf).<br />
* '''Activation'''<br />
** The standby can keep waiting for activation as long as a user likes. This prevents the standby from being automatically brought up by failure of recovery or network outage.<br />
* '''Progress report'''<br />
** The primary and standby report the progress of log-shipping in PS display.<br />
* '''Graceful shutdown'''<br />
** When smart/fast shutdown is requested, the primary waits to exit until XLOG records have been sent to the standby, up to the shutdown checkpoint record.<br />
<br />
== Restrictions ==<br />
* '''Synchronous log-shipping'''<br />
** By default, SR supports operates in asynchronous manner, so the commit command might return a "success" to a client before the corresponding XLOG records are shipped to the standby. To enable synchronous replication, see [http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION Synchronous Replication]<br />
* '''Replication beyond timeline'''<br />
** A user has to get a fresh backup whenever making the old standby catch up.<br />
* '''Clustering'''<br />
** Postgres doesn't provide any clustering feature.<br />
<br />
== How to Use ==<br />
* '''1.''' Install postgres in the primary and standby server as usual. This requires only ''configure'', ''make'' and ''make install''.<br />
* '''2.''' Create the initial database cluster in the primary server as usual, using ''initdb''.<br />
* '''3.''' Create an user named replication with REPLICATION privelages<br />
$ CREATE ROLE replication WITH REPLICATION PASSWORD 'password' LOGIN<br />
* '''4.''' Set up connections and authentication so that the standby server can successfully connect to the ''replication'' pseudo-database on the primary.<br />
<br />
$ $EDITOR postgresql.conf<br />
<br />
listen_addresses = '192.168.0.10'<br />
<br />
$ $EDITOR pg_hba.conf<br />
<br />
# The standby server must connect with a user that has replication privileges.<br />
# TYPE DATABASE USER ADDRESS METHOD<br />
host replication replication 192.168.0.20/22 md5<br />
<br />
* '''5.''' Set up the streaming replication related parameters on the primary server.<br />
$ $EDITOR postgresql.conf<br />
<br />
# To enable read-only queries on a standby server, wal_level must be set to<br />
# "hot_standby". But you can choose "archive" if you never connect to the<br />
# server in standby mode.<br />
wal_level = hot_standby<br />
<br />
# Set the maximum number of concurrent connections from the standby servers.<br />
max_wal_senders = 5<br />
<br />
# To prevent the primary server from removing the WAL segments required for<br />
# the standby server before shipping them, set the minimum number of segments<br />
# retained in the pg_xlog directory. At least wal_keep_segments should be<br />
# larger than the number of segments generated between the beginning of<br />
# online-backup and the startup of streaming replication. If you enable WAL<br />
# archiving to an archive directory accessible from the standby, this may<br />
# not be necessary.<br />
wal_keep_segments = 32<br />
<br />
# Enable WAL archiving on the primary to an archive directory accessible from<br />
# the standby. If wal_keep_segments is a high enough number to retain the WAL<br />
# segments required for the standby server, this is not necessary.<br />
archive_mode = on<br />
archive_command = 'cp %p /path_to/archive/%f'<br />
* '''6.''' Start postgres on the primary server.<br />
* '''7.''' Make a base backup by copying the primary server's data directory to the standby server.<br />
<br />
** '''7.1.''' Do it with pg_(start|stop)_backup and rsync<br />
<br />
$ psql -c "SELECT pg_start_backup('label', true)"<br />
$ rsync -ac ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid<br />
$ psql -c "SELECT pg_stop_backup()"<br />
<br />
** '''7.2.''' Do it with pg_basebackup<br />
<br />
In version 9.1+, pg_basebackup can do the dirty work of fetching the entire data directory of your PostgreSQL installation from the primary and placing it onto the standby server.<br />
<br />
The prerequisite is that you make sure the standby's data directory is empty.<br />
<br />
Make sure to remove any tablespace directories as well. You can find those directories with:<br />
<br />
$ psql -c '\db'<br />
<br />
If you keep your postgresql.conf and other config files in PGDATA, you need a backup of postgresql.conf, to restore after pg_basebackup.<br />
<br />
After you've cleared all the directories, you can use the following command to directly stream the data from the primary onto your standby server.<br />
Run it as the database superuser, typically 'postgres', to make sure the permissions are preserved (use su, sudo or whatever other tool to make sure you're not root).<br />
<br />
$ pg_basebackup -h 192.168.0.10 -D /srv/pgsql/standby -P -U replication --xlog-method=stream<br />
<br />
In version 9.3+, you can also add the -R option so it creates a minimal recovery command file for step 9 below.<br />
<br />
If you backed up postgresql.conf, now restore it.<br />
<br />
* '''8.''' Set up replication-related parameters, connections and authentication in the standby server like the primary, so that the standby might work as a primary after failover.<br />
* '''9.''' Enable read-only queries on the standby server. But if wal_level is ''archive'' on the primary, leave hot_standby unchanged (i.e., off).<br />
$ $EDITOR postgresql.conf<br />
<br />
hot_standby = on<br />
* '''10.''' Create a recovery command file in the standby server; the following parameters are required for streaming replication.<br />
$ $EDITOR recovery.conf<br />
# Note that recovery.conf must be in $PGDATA directory.<br />
# It should NOT be located in the same directory as postgresql.conf<br />
<br />
# Specifies whether to start the server as a standby. In streaming replication,<br />
# this parameter must to be set to on.<br />
standby_mode = 'on'<br />
<br />
# Specifies a connection string which is used for the standby server to connect<br />
# with the primary.<br />
primary_conninfo = 'host=192.168.0.10 port=5432 user=replication password=password'<br />
<br />
# Specifies a trigger file whose presence should cause streaming replication to<br />
# end (i.e., failover).<br />
trigger_file = '/path_to/trigger'<br />
<br />
# Specifies a command to load archive segments from the WAL archive. If<br />
# wal_keep_segments is a high enough number to retain the WAL segments<br />
# required for the standby server, this may not be necessary. But<br />
# a large workload can cause segments to be recycled before the standby<br />
# is fully synchronized, requiring you to start again from a new base backup.<br />
restore_command = 'cp /path_to/archive/%f "%p"'<br />
* '''11.''' Start postgres in the standby server. It will start streaming replication.<br />
* '''12.''' You can calculate the replication lag by comparing the current WAL write location on the primary with the last WAL location received/replayed by the standby. They can be retrieved using ''pg_current_xlog_location'' on the primary and the ''pg_last_xlog_receive_location''/''pg_last_xlog_replay_location'' on the standby, respectively.<br />
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)<br />
pg_current_xlog_location <br />
--------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_receive_location <br />
-------------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_replay_location <br />
------------------------------<br />
0/2000000<br />
(1 row)<br />
* '''13.''' You can also check the progress of streaming replication by using ''ps'' command.<br />
# The displayed LSNs indicate the byte position that the standby server has<br />
# written up to in the xlogs.<br />
[primary] $ ps -ef | grep sender<br />
postgres 6879 6831 0 10:31 ? 00:00:00 postgres: wal sender process postgres 127.0.0.1(44663) streaming 0/2000000<br />
<br />
[standby] $ ps -ef | grep receiver<br />
postgres 6878 6872 1 10:31 ? 00:00:01 postgres: wal receiver process streaming 0/2000000<br />
* How to do failover<br />
** Create the trigger file in the standby after the primary fails.<br />
* How to stop the primary or the standby server<br />
** Shut down it as usual (''pg_ctl stop'').<br />
* How to restart streaming replication after failover<br />
** Repeat the operations from '''6th'''; making a fresh backup, some configurations and starting the original primary as the standby. The primary server doesn't need to be stopped during these operations.<br />
* How to restart streaming replication after the standby fails<br />
** Restart postgres in the standby server after eliminating the cause of failure.<br />
* How to disconnect the standby from the primary<br />
** Create the trigger file in the standby while the primary is running. Then the standby would be brought up.<br />
* How to re-synchronize the stand-alone standby after isolation<br />
** Shut down the standby as usual. And repeat the operations from '''6th'''.<br />
* If you have more than one slave, promoting one will break the other(s). Update their recovery.conf settings to point to the new master, set recovery_target_timeline to 'latest', scp/rsync the pg_xlog directory, and restart the slave.<br />
<br />
= Todo =<br />
== v9.0 ==<br />
<br />
Moved to [[PostgreSQL_9.0_Open_Items]]<br />
<br />
=== Committed ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01455.php Retrying from archive and some refactoring around Read/FetchRecord().] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00395.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg02601.php SR wrongly treats the WAL-boundary.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00396.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01715.php Adjust SR for some later changes about wal-skipping.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00399.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00024.php VACUUM FULL unexpectedly writes an XLOG UNLOGGED record.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00038.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01754.php Add a message type header.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00037.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php Documentation: Add a new "Replication" chapter.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00115.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00350.php Failed assertion during recovery of partial WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00124.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00712.php A PANIC error might occur in the standby because of a partially-filled archived WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00137.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00330.php Improve the standby messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00140.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php pq_getbyte_if_available() is not working because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00198.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01488.php Walsender might emit unfit messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00239.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01236.php Streaming replication on win32, still broken.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00270.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00992.php Create new section for recovery.conf.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00295.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01824.php Assertion failure in walreceiver.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00356.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01717.php Forbid a startup of walsender during recovery, and emit a suitable message? Or allow walsender to be started also during recovery?] - [http://archives.postgresql.org/message-id/20100316090955.9A5107541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php How do we clean down the archive without using pg_standby?] - [http://archives.postgresql.org/message-id/20100318091718.BC14D7541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01510.php File-based log shipping without pg_standby doesn't replay the WAL files in pg_xlog.] - [http://archives.postgresql.org/pgsql-committers/2010-03/msg00356.php commit]<br />
<br />
== v9.1 ==<br />
=== Synchronization capability ===<br />
* Introduce the replication mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'' and ''fsync''.<br />
** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
** ''recv'' or ''fsync'' makes transaction commit wait for XLOG to be received or fsynced by the standby, respectively.<br />
** (''apply'' makes transaction commit wait for XLOG to be replayed by the standby. This mode will be supported in v9.2 or later)<br />
** The replication mode is specified in recovery.conf of the standby as well as other parameters for replication.<br />
*** The startup process reads the replication mode from recovery.conf and shares it to walreceiver via new shared-memory variable.<br />
*** Walreceiver also shares it to walsender by using the replication handshake message (existing protocol needs to be extended).<br />
** Based on the replication mode, walreceiver sends the reply meaning that replication is done up to the specified location to the primary.<br />
*** In async, walreceiver doesn't need to send any reply other than end-of-replication message.<br />
*** In recv or fsync, walreceiver sends the reply just after receiving or flushing XLOG, respectively.<br />
*** New message type for the reply needs to be defined. The reply is sent as CopyData message.<br />
** Walreceiver writes all the outstanding XLOG to disk before shutting down.<br />
** Walsender receives the reply from the standby, updates the location of the last record replicated, and announces completion of replication.<br />
*** New shared-memory variable to keep that location is required.<br />
** When processing the commit command, backend waits for XLOG to be replicated to only the standbys which are in the recv or fsync replication mode.<br />
*** Also smart shutdown waits for XLOG of shutdown checkpoint to be replicated.<br />
* Required optimization<br />
** Walsender should send outstanding XLOG without waiting wal_sender_delay.<br />
*** When processing the commit command, backend signals walsender to send outstanding XLOG immediately.<br />
** Backend should exit the wait loop as soon as the reply arrives at the primary.<br />
*** When receiving the reply, walsender signals backends to get up from the sleep and determine whether to exit the wait loop by checking the location of the last XLOG replicated.<br />
*** Only backends waiting for XLOG to be replicated up to the location contained in the reply are sent the signal.<br />
** Walsender waits for the signal from backends and the reply from the standby at the same time, by using select/poll.<br />
** Walsender reads XLOG from not only disk but also shared memory (wal buffers).<br />
** Walreceiver should flush XLOG file only when XLOG file is switched or the related page is flushed.<br />
*** When startup process or bgwriter flushes the buffer page, it checks whether the related XLOG has already been flushed via shared memory (location of the last XLOG flushed).<br />
*** It flushes the buffer page, if XLOG file has already been flushed.<br />
*** It signals walreceiver to flush XLOG file immediately and waits for the flush to complete, if XLOG file has not been flushed yet.<br />
** While the standby is catching up with the primary, those servers should ignore the replication mode and perform asynchronous replication.<br />
*** After those servers have almost gotten into synchronization, they perform replication based on the specified replication mode.<br />
*** New replication states like 'catching-up', 'sync', etc need to be defined, and the state machine for them is required on both servers.<br />
*** Current replication state can be monitored on both servers via SQL.<br />
* Required timeout<br />
** Add a new parameter, replication_timeout, which is the maximum time to wait for XLOG to be replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add a new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
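A minimal sketch of how a backend's commit-wait loop might honor these two parameters. The semantics, and the 'standalone' action name, are assumptions based on the bullet points above, not actual PostgreSQL behavior.<br />

```python
# Sketch of a backend's commit-wait loop with the proposed
# replication_timeout / replication_timeout_action parameters.
# Parameter semantics and the 'standalone' action are assumptions.

def wait_for_replication(commit_lsn, get_replicated_lsn, wait_step,
                         replication_timeout, replication_timeout_action):
    """Wait until the standby has replicated up to commit_lsn.

    get_replicated_lsn() stands in for the shared-memory location of the
    last XLOG replicated; wait_step() blocks until the next walsender
    wakeup and returns the seconds spent waiting.
    """
    waited = 0.0
    while get_replicated_lsn() < commit_lsn:
        if waited >= replication_timeout:
            if replication_timeout_action == "standalone":
                return "standalone"  # give up waiting, commit locally
            raise TimeoutError("replication_timeout exceeded")
        waited += wait_step()
    return "replicated"
```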
<br />
== Future release ==<br />
* '''Synchronization capability'''<br />
** Introduce the synchronization mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'', ''fsync'' and ''apply''.<br />
*** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
*** ''recv'', ''fsync'' and ''apply'' make transaction commit wait for XLOG records to be received, fsynced and applied on the standby, respectively.<br />
** Change walsender to be able to read XLOG from not only the disk but also shared memory.<br />
** Add a new parameter, replication_timeout, which is the maximum time to wait for XLOG records to be replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add a new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
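The four modes imply an ordering of acknowledgement events (received before fsynced before applied). A sketch of the commit-return decision; the numeric levels are illustrative, not part of the proposal.<br />

```python
# Which acknowledgement a committing transaction must wait for before
# "success" is returned to the client, per synchronization mode.
# The numeric ordering is an assumption implied by the descriptions above.

WAIT_LEVEL = {"async": 0, "recv": 1, "fsync": 2, "apply": 3}

def commit_may_return(mode, events_acknowledged):
    """events_acknowledged: events the standby has confirmed for this
    commit's XLOG, e.g. {"recv", "fsync"}."""
    reached = max((WAIT_LEVEL[e] for e in events_acknowledged), default=0)
    return reached >= WAIT_LEVEL[mode]
```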
* '''Monitoring'''<br />
** Provide the capability to check the progress and gap of streaming replication via one query. Collaboration between HS and SR is necessary to provide that capability on the standby side.<br />
** Provide the capability to check via a query whether the specified replication is in progress. More detailed status information might also be necessary, e.g., whether the standby is still catching up or has already gotten into sync.<br />
** Change the stats collector to collect statistics about replication, e.g., the average replication delay.<br />
** Develop a tool to calculate the latest XLOG position from XLOG files. This is necessary to check the replication gap after the server fails.<br />
** Also develop a tool to extract user-readable contents from XLOG files. This is necessary to see the contents of the gap and restore them manually.<br />
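The gap between two XLOG positions can be computed directly from the textual form PostgreSQL reports (e.g. '0/2000000'): the part before the slash is the high 32 bits and the part after is the low 32 bits of a 64-bit location, both hexadecimal. A small sketch:<br />

```python
# Compute the replication gap in bytes from two textual XLOG positions.
# Plain 64-bit byte arithmetic; a simplification, since very old releases
# skipped some positions at logical-file boundaries.

def lsn_to_bytes(lsn):
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)

def replication_gap(primary_lsn, standby_lsn):
    """Gap in bytes between the primary's and the standby's positions."""
    return lsn_to_bytes(primary_lsn) - lsn_to_bytes(standby_lsn)
```

For example, replication_gap('0/3000000', '0/2000000') is 16777216 bytes (16 MB).<br />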
* '''Easy to Use'''<br />
** Introduce parameters like:<br />
*** replication_halt_timeout - replication will halt if no data has been sent for this much time.<br />
*** replication_halt_segments - replication will halt if the number of WAL files in pg_xlog exceeds this threshold.<br />
*** These parameters allow us to avoid disk overflow.<br />
** Add a feature that also transfers the base backup via a direct connection between the primary and the standby.<br />
** Add new hooks like walsender_hook and walreceiver_hook to cooperate with add-on compression programs like pglesslog.<br />
** Provide a graceful termination of replication via a query on the primary. On the standby, the trigger file mechanism already provides that capability.<br />
** Support replication beyond timelines. The timeline history files need to be shipped from the primary to the standby.<br />
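Together, the replication_halt_* parameters above amount to a disk-overflow guard on the primary. A sketch with illustrative defaults; both parameters are proposals from the list above, not existing PostgreSQL settings.<br />

```python
# Sketch of the disk-overflow guard the proposed replication_halt_*
# parameters would provide: halt (drop) replication rather than let
# retained WAL in pg_xlog grow without bound.

def should_halt_replication(seconds_since_last_send, wal_files_in_pg_xlog,
                            replication_halt_timeout=60,
                            replication_halt_segments=64):
    if seconds_since_last_send >= replication_halt_timeout:
        return True  # the standby has stopped consuming WAL
    if wal_files_in_pg_xlog >= replication_halt_segments:
        return True  # retained WAL is about to fill the disk
    return False
```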
* '''Robustness'''<br />
** Support keepalive in libpq. This is useful for a client and the standby to detect a failure of the primary immediately.<br />
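At the socket level, keepalive comes down to a few TCP options. A Python sketch of what such a libpq feature would set so that a dead peer is detected without waiting for an application-level timeout; the fine-grained option names shown are the standard Linux ones and may not exist on every platform.<br />

```python
# Enable TCP keepalive on a socket, the mechanism a libpq keepalive
# feature would rely on. TCP_KEEPIDLE/KEEPINTVL/KEEPCNT are Linux
# option names, hence the hasattr guards.
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock
```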
* '''Miscellaneous'''<br />
** Standalone walreceiver tool, which connects to the primary and continuously receives and writes XLOG records, independently of the postgres server.<br />
** Cascaded streaming replication: allow walsender to send XLOG to another standby during recovery.<br />
** WAL archiving during recovery.<br />
<br />
[[Category:Replication]]</div>
<hr />
<div>'''Streaming Replication''' (SR) provides the capability to continuously ship and<br />
apply the [http://www.postgresql.org/docs/current/static/wal.html WAL XLOG] records to some number of standby servers in order to keep them current.<br />
<br />
This feature was added to PostgreSQL 9.0. The discussion below is a developer oriented one that contains some out of date information. Users of this feature should use the documentation for the feature or a setup tutorial instead:<br />
<br />
* [http://www.postgresql.org/docs/current/static/warm-standby.html PostgreSQL Streaming Replication Documentation] - this documentation is for the latest version, but provides links you can use to look up the docs for your version.<br />
<br />
* [[Binary Replication Tutorial]] provides an introduction to using this replication feature.<br />
<br />
* The related but independent [[Hot Standby]] feature.<br />
<br />
<br />
= Developer and historical details on the project =<br />
SR was developed for inclusion in PostgreSQL 9.0 by NTT OSS Center. The lead developer is [mailto:masao.fujii@gmail.com Masao Fujii]. [http://www.pgcon.org/2008/schedule/events/76.en.html Synchronous Log Shipping Replication Presentation] introduces the early design of the feature.<br />
<br />
= Usage =<br />
== Users Overview ==<br />
* '''Log-shipping'''<br />
** XLOG records generated in the primary are periodically shipped to the standby via the network.<br />
** In the existing warm standby, only records in a filled file are shipped, what's referred to as file-based log-shipping. In SR, XLOG records in partially-filled XLOG file are shipped too, implementing record-based log-shipping. This means the window for data loss in SR is usually smaller than in warm standby, unless the warm standby was also configured for record-based shipping (which is complicated to setup).<br />
** The content of XLOG files written to the standby are exactly the same as those on the primary. XLOG files shipped can be used for a normal recovery and PITR.<br />
* '''Multiple standbys'''<br />
** More than one standby can establish a connection to the primary for SR. XLOG records are concurrently shipped to all these standbys. The delay/death of a standby does not harm log-shipping to other standbys.<br />
** The maximum number of standbys can be specified as a GUC variable.<br />
* '''Continuous recovery'''<br />
** The standby continuously replays XLOG records shipped without using pg_standby.<br />
** XLOG records shipped are replayed as soon as possible without waiting until XLOG file has been filled. The combination of [[Hot Standby]] and SR would make the latest data inserted into the primary visible in the standby almost immediately.<br />
** The standby periodically removes old XLOG files which are no longer needed for recovery, to prevent excessive disk usage.<br />
* '''Setup'''<br />
** The start of log-shipping does not interfere with any query processing on the primary.<br />
** The standby can be started in various conditions.<br />
*** If there are XLOG files in archive directory and restore_command is supplied, at first those files are replayed. Then the standby requests XLOG records following the last applied one to the primary. This prevents XLOG files already present in the standby from being shipped again. Similarly, XLOG files in pg_xlog are also replayed before starting log-shipping.<br />
*** If there is no XLOG files on the standby, the standby requests XLOG records following the starting XLOG location of recovery (the redo starting location).<br />
* '''Connection settings and authentication'''<br />
** A connection for SR accepts the same settings as a normal connection (e.g., keepalive, pg_hba.conf).<br />
* '''Activation'''<br />
** The standby can keep waiting for activation as long as a user likes. This prevents the standby from being automatically brought up by failure of recovery or network outage.<br />
* '''Progress report'''<br />
** The primary and standby report the progress of log-shipping in PS display.<br />
* '''Graceful shutdown'''<br />
** When smart/fast shutdown is requested, the primary waits to exit until XLOG records have been sent to the standby, up to the shutdown checkpoint record.<br />
<br />
== Restrictions ==<br />
* '''Synchronous log-shipping'''<br />
** By default, SR operates in an asynchronous manner, so the commit command might return a "success" to a client before the corresponding XLOG records are shipped to the standby. To enable synchronous replication, see [http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION Synchronous Replication].<br />
* '''Replication beyond timeline'''<br />
** A user has to take a fresh backup whenever making the old standby catch up.<br />
* '''Clustering'''<br />
** Postgres doesn't provide any clustering feature.<br />
<br />
== How to Use ==<br />
* '''1.''' Install postgres in the primary and standby server as usual. This requires only ''configure'', ''make'' and ''make install''.<br />
* '''2.''' Create the initial database cluster in the primary server as usual, using ''initdb''.<br />
* '''3.''' Create a user named ''replication'' with the REPLICATION privilege.<br />
 $ psql -c "CREATE ROLE replication WITH REPLICATION PASSWORD 'password' LOGIN"<br />
* '''4.''' Set up connections and authentication so that the standby server can successfully connect to the ''replication'' pseudo-database on the primary.<br />
$ $EDITOR postgresql.conf<br />
<br />
listen_addresses = '192.168.0.10'<br />
<br />
$ $EDITOR pg_hba.conf<br />
<br />
# The standby server must connect with a user that has replication privileges.<br />
# TYPE DATABASE USER ADDRESS METHOD<br />
host replication replication 192.168.0.20/22 md5<br />
<br />
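The ADDRESS column in pg_hba.conf takes a CIDR range, and host bits beyond the mask are ignored when matching. A quick way to check which standby addresses the 192.168.0.20/22 entry above actually admits (a sketch using Python's standard library; the addresses are the example values from this page):<br />

```python
import ipaddress

# pg_hba.conf entry: host replication replication 192.168.0.20/22 md5
# strict=False masks off the host bits, so this entry admits the whole
# 192.168.0.0/22 range, not just the single standby host.
entry = ipaddress.ip_network("192.168.0.20/22", strict=False)

print(entry)                                          # 192.168.0.0/22
print(ipaddress.ip_address("192.168.0.20") in entry)  # True
print(ipaddress.ip_address("192.168.4.1") in entry)   # False
```

If only the one standby should be able to connect, a /32 mask is the tighter choice.<br />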
* '''5.''' Set up the streaming replication related parameters on the primary server.<br />
$ $EDITOR postgresql.conf<br />
<br />
# To enable read-only queries on a standby server, wal_level must be set to<br />
# "hot_standby". But you can choose "archive" if you never connect to the<br />
# server in standby mode.<br />
wal_level = hot_standby<br />
<br />
# Set the maximum number of concurrent connections from the standby servers.<br />
max_wal_senders = 5<br />
<br />
# To prevent the primary server from removing the WAL segments required for<br />
# the standby server before shipping them, set the minimum number of segments<br />
# retained in the pg_xlog directory. At least wal_keep_segments should be<br />
# larger than the number of segments generated between the beginning of<br />
# online-backup and the startup of streaming replication. If you enable WAL<br />
# archiving to an archive directory accessible from the standby, this may<br />
# not be necessary.<br />
wal_keep_segments = 32<br />
<br />
# Enable WAL archiving on the primary to an archive directory accessible from<br />
# the standby. If wal_keep_segments is a high enough number to retain the WAL<br />
# segments required for the standby server, this is not necessary.<br />
archive_mode = on<br />
archive_command = 'cp %p /path_to/archive/%f'<br />
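Since each WAL segment is 16 MB by default, wal_keep_segments translates directly into disk space that pg_xlog will retain. A back-of-the-envelope check (a sketch; 16 MB assumes the server was not built with a non-default --with-wal-segsize):<br />

```python
WAL_SEGMENT_MB = 16      # default WAL segment size
wal_keep_segments = 32   # value from the postgresql.conf example above

# Disk the primary keeps in pg_xlog just so standbys can catch up.
retained_mb = wal_keep_segments * WAL_SEGMENT_MB
print(retained_mb, "MB")  # 512 MB
```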
* '''6.''' Start postgres on the primary server.<br />
* '''7.''' Make a base backup by copying the primary server's data directory to the standby server.<br />
$ psql -c "SELECT pg_start_backup('label', true)"<br />
$ rsync -ac ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid<br />
$ psql -c "SELECT pg_stop_backup()"<br />
<br />
or in version 9.1+, on the standby server, delete the data directory for your PostgreSQL installation, and make sure to delete any tablespace directories as well. You can find tablespace directories with:<br />
<br />
$ psql -c '\db'<br />
<br />
If you keep your postgresql.conf and other config files in PGDATA, you need to make a backup of postgresql.conf to restore after the next command.<br />
<br />
After you've cleared all the directories, you can use the following command to stream the data directly from the primary onto your standby server. This assumes your superuser is named 'postgres' and the default data directory of /var/lib/pgsql/<VERSION>/data. If you do not run the command as the superuser, the permissions on all of the copied directories will be wrong.<br />
<br />
$ su - postgres -s /bin/bash --command='/usr/bin/pg_basebackup -h <primary_server> -D /var/lib/pgsql/9.2/data -P -U replication --xlog-method=stream'<br />
<br />
If you backed up postgresql.conf, restore it after the base backup completes.<br />
<br />
The -R option (version 9.3+) will create a minimal recovery command file for you, covering step 10 below.<br />
<br />
* '''8.''' Set up replication-related parameters, connections and authentication in the standby server like the primary, so that the standby might work as a primary after failover.<br />
* '''9.''' Enable read-only queries on the standby server. But if wal_level is ''archive'' on the primary, leave hot_standby unchanged (i.e., off).<br />
$ $EDITOR postgresql.conf<br />
<br />
hot_standby = on<br />
* '''10.''' Create a recovery command file in the standby server; the following parameters are required for streaming replication.<br />
$ $EDITOR recovery.conf<br />
# Note that recovery.conf must be in the $PGDATA directory, even if you keep<br />
# postgresql.conf and the other configuration files elsewhere (as Debian and<br />
# Ubuntu installations do, for example).<br />
<br />
# Specifies whether to start the server as a standby. In streaming replication,<br />
# this parameter must be set to on.<br />
standby_mode = 'on'<br />
<br />
# Specifies a connection string which is used for the standby server to connect<br />
# with the primary.<br />
primary_conninfo = 'host=192.168.0.10 port=5432 user=replication password=password'<br />
<br />
# Specifies a trigger file whose presence should cause streaming replication to<br />
# end (i.e., failover).<br />
trigger_file = '/path_to/trigger'<br />
<br />
# Specifies a command to load archive segments from the WAL archive. If<br />
# wal_keep_segments is a high enough number to retain the WAL segments<br />
# required for the standby server, this may not be necessary. But<br />
# a large workload can cause segments to be recycled before the standby<br />
# is fully synchronized, requiring you to start again from a new base backup.<br />
restore_command = 'cp /path_to/archive/%f "%p"'<br />
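When provisioning several standbys, it can help to generate recovery.conf from a template rather than editing each copy by hand. A minimal sketch (the parameter values are the examples from this page; recovery.conf values are single-quoted, and any embedded single quotes would need doubling):<br />

```python
params = {
    "standby_mode": "on",
    "primary_conninfo": "host=192.168.0.10 port=5432 user=replication password=password",
    "trigger_file": "/path_to/trigger",
    "restore_command": 'cp /path_to/archive/%f "%p"',
}

# Render each parameter as key = 'value', one per line.
conf = "\n".join("{} = '{}'".format(key, value) for key, value in params.items())
print(conf)
```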
* '''11.''' Start postgres in the standby server. It will start streaming replication.<br />
* '''12.''' You can calculate the replication lag by comparing the current WAL write location on the primary with the last WAL location received/replayed by the standby. They can be retrieved using ''pg_current_xlog_location'' on the primary and the ''pg_last_xlog_receive_location''/''pg_last_xlog_replay_location'' on the standby, respectively.<br />
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)<br />
pg_current_xlog_location <br />
--------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_receive_location <br />
-------------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_replay_location <br />
------------------------------<br />
0/2000000<br />
(1 row)<br />
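The values returned above are LSNs of the form hi/lo (two hexadecimal numbers). From 9.2 on, pg_xlog_location_diff() computes the byte difference server-side; the arithmetic can also be sketched by hand (this simple form ignores a pre-9.3 quirk in which the last 16 MB of each 4 GB 'xlogid' was skipped):<br />

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN like '0/2000000' into an absolute byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)

# Replication lag in bytes: primary write location minus standby receive location.
primary_lsn = parse_lsn("0/3000020")   # hypothetical pg_current_xlog_location()
standby_lsn = parse_lsn("0/2000000")   # hypothetical pg_last_xlog_receive_location()
print(primary_lsn - standby_lsn)       # 16777248
```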
* '''13.''' You can also check the progress of streaming replication by using the ''ps'' command.<br />
# The displayed LSNs indicate the byte position that the standby server has<br />
# written up to in the xlogs.<br />
[primary] $ ps -ef | grep sender<br />
postgres 6879 6831 0 10:31 ? 00:00:00 postgres: wal sender process postgres 127.0.0.1(44663) streaming 0/2000000<br />
<br />
[standby] $ ps -ef | grep receiver<br />
postgres 6878 6872 1 10:31 ? 00:00:01 postgres: wal receiver process streaming 0/2000000<br />
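For ad-hoc monitoring scripts, the streaming position can be scraped out of the ps output shown above. A sketch (the sample line is copied from this page; the regex is an assumption about the process-title format, which varies across versions):<br />

```python
import re

ps_line = ("postgres  6878  6872  1 10:31 ?  00:00:01 "
           "postgres: wal receiver process   streaming 0/2000000")

# Pull out the hi/lo LSN that follows the word "streaming".
match = re.search(r"streaming ([0-9A-Fa-f]+/[0-9A-Fa-f]+)", ps_line)
if match:
    print(match.group(1))   # 0/2000000
```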
* How to do failover<br />
** Create the trigger file in the standby after the primary fails.<br />
* How to stop the primary or the standby server<br />
** Shut it down as usual (''pg_ctl stop'').<br />
* How to restart streaming replication after failover<br />
** Repeat the operations from step '''6''': take a fresh backup, adjust the configuration, and start the original primary as the standby. The new primary doesn't need to be stopped during these operations.<br />
* How to restart streaming replication after the standby fails<br />
** Restart postgres in the standby server after eliminating the cause of failure.<br />
* How to disconnect the standby from the primary<br />
** Create the trigger file in the standby while the primary is running. The standby will then be brought up as a stand-alone server.<br />
* How to re-synchronize the stand-alone standby after isolation<br />
** Shut down the standby as usual, and repeat the operations from step '''6'''.<br />
* If you have more than one slave, promoting one will break the other(s). Update their recovery.conf settings to point to the new master, set recovery_target_timeline to 'latest', scp/rsync the pg_xlog directory, and restart each remaining slave.<br />
<br />
= Todo =<br />
== v9.0 ==<br />
<br />
Moved to [[PostgreSQL_9.0_Open_Items]]<br />
<br />
=== Committed ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01455.php Retrying from archive and some refactoring around Read/FetchRecord().] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00395.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg02601.php SR wrongly treats the WAL-boundary.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00396.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01715.php Adjust SR for some later changes about wal-skipping.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00399.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00024.php VACUUM FULL unexpectedly writes an XLOG UNLOGGED record.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00038.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01754.php Add a message type header.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00037.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php Documentation: Add a new "Replication" chapter.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00115.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00350.php Failed assertion during recovery of partial WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00124.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00712.php A PANIC error might occur in the standby because of a partially-filled archived WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00137.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00330.php Improve the standby messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00140.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php pq_getbyte_if_available() is not working because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00198.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01488.php Walsender might emit unfit messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00239.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01236.php Streaming replication on win32, still broken.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00270.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00992.php Create new section for recovery.conf.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00295.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01824.php Assertion failure in walreceiver.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00356.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01717.php Forbid a startup of walsender during recovery, and emit a suitable message? Or allow walsender to be started also during recovery?] - [http://archives.postgresql.org/message-id/20100316090955.9A5107541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php How do we clean down the archive without using pg_standby?] - [http://archives.postgresql.org/message-id/20100318091718.BC14D7541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01510.php File-based log shipping without pg_standby doesn't replay the WAL files in pg_xlog.] - [http://archives.postgresql.org/pgsql-committers/2010-03/msg00356.php commit]<br />
<br />
== v9.1 ==<br />
=== Synchronization capability ===<br />
* Introduce the replication mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'' and ''fsync''.<br />
** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
** ''recv'' or ''fsync'' makes transaction commit wait for XLOG to be received or fsynced by the standby, respectively.<br />
** (''apply'' makes transaction commit wait for XLOG to be replayed by the standby. This mode will be supported in v9.2 or later)<br />
** The replication mode is specified in recovery.conf of the standby as well as other parameters for replication.<br />
*** The startup process reads the replication mode from recovery.conf and shares it to walreceiver via new shared-memory variable.<br />
*** Walreceiver also shares it to walsender by using the replication handshake message (existing protocol needs to be extended).<br />
** Based on the replication mode, walreceiver sends the reply meaning that replication is done up to the specified location to the primary.<br />
*** In async, walreceiver doesn't need to send any reply other than end-of-replication message.<br />
*** In recv or fsync, walreceiver sends the reply just after receiving or flushing XLOG, respectively.<br />
*** New message type for the reply needs to be defined. The reply is sent as CopyData message.<br />
** Walreceiver writes all the outstanding XLOG to disk before shutting down.<br />
** Walsender receives the reply from the standby, updates the location of the last record replicated, and announces completion of replication.<br />
*** New shared-memory variable to keep that location is required.<br />
** When processing the commit command, backend waits for XLOG to be replicated to only the standbys which are in the recv or fsync replication mode.<br />
*** Also smart shutdown waits for XLOG of shutdown checkpoint to be replicated.<br />
* Required optimization<br />
** Walsender should send outstanding XLOG without waiting wal_sender_delay.<br />
*** When processing the commit command, backend signals walsender to send outstanding XLOG immediately.<br />
** Backend should exit the wait loop as soon as the reply arrives at the primary.<br />
*** When receiving the reply, walsender signals backends to get up from the sleep and determine whether to exit the wait loop by checking the location of the last XLOG replicated.<br />
*** Only backends waiting for XLOG to be replicated up to the location contained in the reply are sent the signal.<br />
** Walsender waits for the signal from backends and the reply from the standby at the same time, by using select/poll.<br />
** Walsender reads XLOG from not only disk but also shared memory (wal buffers).<br />
** Walreceiver should flush XLOG file only when XLOG file is switched or the related page is flushed.<br />
*** When startup process or bgwriter flushes the buffer page, it checks whether the related XLOG has already been flushed via shared memory (location of the last XLOG flushed).<br />
*** It flushes the buffer page, if XLOG file has already been flushed.<br />
*** It signals walreceiver to flush XLOG file immediately and waits for the flush to complete, if XLOG file has not been flushed yet.<br />
** While the standby is catching up with the primary, those servers should ignore the replication mode and perform asynchronous replication.<br />
*** After those servers have almost gotten into synchronization, they perform replication based on the specified replication mode.<br />
*** New replication states like 'catching-up', 'sync', etc need to be defined, and the state machine for them is required on both servers.<br />
*** Current replication state can be monitored on both servers via SQL.<br />
* Required timeout<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG is replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
<br />
== Future release ==<br />
* '''Synchronization capability'''<br />
** Introduce the synchronization mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'', ''fsync'' and ''apply''.<br />
*** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
*** ''recv'', ''fsync'' and ''apply'' makes transaction commit wait for XLOG records to be received, fsynced and applied on the standby, respectively.<br />
** Change walsender to be able to read XLOG from not only the disk but also shared memory.<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG records are replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
* '''Monitoring'''<br />
** Provide the capability to check the progress and gap of streaming replication via one query. A collaboration of HS and SR is necessary to provide that capability on the standby side.<br />
** Provide the capability to check if the specified repliation is in progress via a query. Also more detailed status information might be necessary, e.g, the standby is catching up now, has already gotten into sync, and so on.<br />
** Change the stats collector to collect the statistics information about replication, e.g., average delay of replication time.<br />
** Develop the tool to calculate the latest XLOG position from XLOG files. This is necessary to check the gap of replication after the server fails.<br />
** Also develop the tool to extract the user-readable contents from XLOG files. This is necessary to see the contents of the gap, and manually restore them.<br />
* '''Easy to Use'''<br />
** Introduce the parameters like:<br />
*** replication_halt_timeout - replication will halt if no data has been sent for this much time.<br />
*** replication_halt_segments - replication will halt if number of WAL files in pg_xlog exceeds this threshold.<br />
*** These parameters allow us to avoid disk overflow.<br />
** Add new feature which transfers also base backup via the direct connection between the primary and the standby.<br />
** Add new hooks like walsender_hook and walreceiver_hook to cooperate with the add-on program for compression like pglesslog.<br />
** Provide a graceful termination of replication via a query on the primary. On the standby, a trigger file mechanism already provides that capability.<br />
** Support replication beyond timeline. The timeline history files need to be shipped from the primary to the standby.<br />
* '''Robustness'''<br />
** Support keepalive in libpq. This is useful for a client and the standby to detect a failure of the primary immediately.<br />
* '''Miscellaneous'''<br />
** Standalone walreceiver tool, which connects to the primary, continuously receives and writes XLOG records, independently from postgres server.<br />
** Cascade streaming replication. Allow walsender to send XLOG to another standby during recovery.<br />
** WAL archiving during recovery.<br />
<br />
[[Category:Replication]]</div>Joyhttps://wiki.postgresql.org/index.php?title=Streaming_Replication&diff=25826Streaming Replication2015-09-15T10:10:12Z<p>Joy: /* How to Use */ clarify the last edit - not everyone keeps cluster config in PGDATA, esp. not on Debian and Ubuntu</p>
<hr />
<div>'''Streaming Replication''' (SR) provides the capability to continuously ship and<br />
apply the [http://www.postgresql.org/docs/current/static/wal.html WAL XLOG] records to some number of standby servers in order to keep them current.<br />
<br />
This feature was added to PostgreSQL 9.0. The discussion below is a developer oriented one that contains some out of date information. Users of this feature should use the documentation for the feature or a setup tutorial instead:<br />
<br />
* [http://www.postgresql.org/docs/current/static/warm-standby.html PostgreSQL Streaming Replication Documentation] - this documentation is for the latest version, but provides links you can use to look up the docs for your version.<br />
<br />
* [[Binary Replication Tutorial]] provides an introduction to using this replication feature.<br />
<br />
* The related but independent [[Hot Standby]] feature.<br />
<br />
<br />
= Developer and historical details on the project =<br />
SR was developed for inclusion in PostgreSQL 9.0 by NTT OSS Center. The lead developer is [mailto:masao.fujii@gmail.com Masao Fujii]. [http://www.pgcon.org/2008/schedule/events/76.en.html Synchronous Log Shipping Replication Presentation] introduces the early design of the feature.<br />
<br />
= Usage =<br />
== Users Overview ==<br />
* '''Log-shipping'''<br />
** XLOG records generated in the primary are periodically shipped to the standby via the network.<br />
** In the existing warm standby, only records in a filled file are shipped, what's referred to as file-based log-shipping. In SR, XLOG records in partially-filled XLOG file are shipped too, implementing record-based log-shipping. This means the window for data loss in SR is usually smaller than in warm standby, unless the warm standby was also configured for record-based shipping (which is complicated to setup).<br />
** The content of XLOG files written to the standby are exactly the same as those on the primary. XLOG files shipped can be used for a normal recovery and PITR.<br />
* '''Multiple standbys'''<br />
** More than one standby can establish a connection to the primary for SR. XLOG records are concurrently shipped to all these standbys. The delay/death of a standby does not harm log-shipping to other standbys.<br />
** The maximum number of standbys can be specified as a GUC variable.<br />
* '''Continuous recovery'''<br />
** The standby continuously replays XLOG records shipped without using pg_standby.<br />
** XLOG records shipped are replayed as soon as possible without waiting until XLOG file has been filled. The combination of [[Hot Standby]] and SR would make the latest data inserted into the primary visible in the standby almost immediately.<br />
** The standby periodically removes old XLOG files which are no longer needed for recovery, to prevent excessive disk usage.<br />
* '''Setup'''<br />
** The start of log-shipping does not interfere with any query processing on the primary.<br />
** The standby can be started in various conditions.<br />
*** If there are XLOG files in archive directory and restore_command is supplied, at first those files are replayed. Then the standby requests XLOG records following the last applied one to the primary. This prevents XLOG files already present in the standby from being shipped again. Similarly, XLOG files in pg_xlog are also replayed before starting log-shipping.<br />
*** If there is no XLOG files on the standby, the standby requests XLOG records following the starting XLOG location of recovery (the redo starting location).<br />
* '''Connection settings and authentication'''<br />
** A user can configure the same settings as a normal connection to a connection for SR (e.g., keepalive, pg_hba.conf).<br />
* '''Activation'''<br />
** The standby can keep waiting for activation as long as a user likes. This prevents the standby from being automatically brought up by failure of recovery or network outage.<br />
* '''Progress report'''<br />
** The primary and standby report the progress of log-shipping in PS display.<br />
* '''Graceful shutdown'''<br />
** When smart/fast shutdown is requested, the primary waits to exit until XLOG records have been sent to the standby, up to the shutdown checkpoint record.<br />
<br />
== Restrictions ==<br />
* '''Synchronous log-shipping'''<br />
** By default, SR supports operates in asynchronous manner, so the commit command might return a "success" to a client before the corresponding XLOG records are shipped to the standby. To enable synchronous replication, see [http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION Synchronous Replication]<br />
* '''Replication beyond timeline'''<br />
** A user has to get a fresh backup whenever making the old standby catch up.<br />
* '''Clustering'''<br />
** Postgres doesn't provide any clustering feature.<br />
<br />
== How to Use ==<br />
* '''1.''' Install postgres in the primary and standby server as usual. This requires only ''configure'', ''make'' and ''make install''.<br />
* '''2.''' Create the initial database cluster in the primary server as usual, using ''initdb''.<br />
* '''3.''' Create an user named replication with REPLICATION privelages<br />
$ CREATE ROLE replication WITH REPLICATION PASSWORD 'password' LOGIN<br />
* '''4.''' Set up connections and authentication so that the standby server can successfully connect to the ''replication'' pseudo-database on the primary.<br />
$ $EDITOR postgresql.conf<br />
<br />
listen_addresses = '192.168.0.10'<br />
<br />
$ $EDITOR pg_hba.conf<br />
<br />
# The standby server must connect with a user that has replication privileges.<br />
# TYPE DATABASE USER ADDRESS METHOD<br />
host replication replication 192.168.0.20/22 md5<br />
<br />
* '''5.''' Set up the streaming replication related parameters on the primary server.<br />
$ $EDITOR postgresql.conf<br />
<br />
# To enable read-only queries on a standby server, wal_level must be set to<br />
# "hot_standby". But you can choose "archive" if you never connect to the<br />
# server in standby mode.<br />
wal_level = hot_standby<br />
<br />
# Set the maximum number of concurrent connections from the standby servers.<br />
max_wal_senders = 5<br />
<br />
# To prevent the primary server from removing the WAL segments required for<br />
# the standby server before shipping them, set the minimum number of segments<br />
# retained in the pg_xlog directory. At least wal_keep_segments should be<br />
# larger than the number of segments generated between the beginning of<br />
# online-backup and the startup of streaming replication. If you enable WAL<br />
# archiving to an archive directory accessible from the standby, this may<br />
# not be necessary.<br />
wal_keep_segments = 32<br />
<br />
# Enable WAL archiving on the primary to an archive directory accessible from<br />
# the standby. If wal_keep_segments is a high enough number to retain the WAL<br />
# segments required for the standby server, this is not necessary.<br />
archive_mode = on<br />
archive_command = 'cp %p /path_to/archive/%f'<br />
* '''6.''' Start postgres on the primary server.<br />
* '''7.''' Make a base backup by copying the primary server's data directory to the standby server.<br />
$ psql -c "SELECT pg_start_backup('label', true)"<br />
$ rsync -ac ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid<br />
$ psql -c "SELECT pg_stop_backup()"<br />
<br />
or in version 9.1+, on the standby server run delete the data directory for your postgresql installation and make sure to any tablespace directories as well. You can find table space directories with:<br />
<br />
$ psql -c '\db'<br />
<br />
If you keep your postgresql.conf and other config files in PGDATA, you need to make a backup of postgresql.conf to restore after the next command.<br />
<br />
After you've cleared all the directories, you can use the following command to directly stream the data from the primary onto your standby server. This assumes your superuser is named 'postgres' and the default data directory of /var/lib/psql/<VERSION>/data. If you do not run the command as the superuser the permissions on all of the directories that are copied will be wrong. <br />
<br />
$ su - postgres -s /bin/bash --command='/usr/bin/pg_basebackup -h <primary_server> -D /var/lib/pgsql/9.2/data -P -U replication --xlog-method=stream'<br />
<br />
If you backed up postgresql.conf, after you do the basebackup, restore it.<br />
<br />
The -R option (version 9.3+) will create a minimal recovery command file for step 9 below.<br />
<br />
* '''8.''' Set up replication-related parameters, connections and authentication in the standby server like the primary, so that the standby might work as a primary after failover.<br />
* '''9.''' Enable read-only queries on the standby server. But if wal_level is ''archive'' on the primary, leave hot_standby unchanged (i.e., off).<br />
$ $EDITOR postgresql.conf<br />
<br />
hot_standby = on<br />
* '''10.''' Create a recovery command file in the standby server; the following parameters are required for streaming replication.<br />
$ $EDITOR recovery.conf<br />
# Note that recovery.conf must be in $PGDATA directory.<br />
# It should NOT be located in the same directory as postgresql.conf<br />
<br />
# Specifies whether to start the server as a standby. In streaming replication,<br />
# this parameter must to be set to on.<br />
standby_mode = 'on'<br />
<br />
# Specifies a connection string which is used for the standby server to connect<br />
# with the primary.<br />
primary_conninfo = 'host=192.168.0.10 port=5432 user=replication password=password'<br />
<br />
# Specifies a trigger file whose presence should cause streaming replication to<br />
# end (i.e., failover).<br />
trigger_file = '/path_to/trigger'<br />
<br />
# Specifies a command to load archive segments from the WAL archive. If<br />
# wal_keep_segments is a high enough number to retain the WAL segments<br />
# required for the standby server, this may not be necessary. But<br />
# a large workload can cause segments to be recycled before the standby<br />
# is fully synchronized, requiring you to start again from a new base backup.<br />
restore_command = 'cp /path_to/archive/%f "%p"'<br />
* '''11.''' Start postgres in the standby server. It will start streaming replication.<br />
* '''12.''' You can calculate the replication lag by comparing the current WAL write location on the primary with the last WAL location received/replayed by the standby. They can be retrieved using ''pg_current_xlog_location'' on the primary and the ''pg_last_xlog_receive_location''/''pg_last_xlog_replay_location'' on the standby, respectively.<br />
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)<br />
pg_current_xlog_location <br />
--------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_receive_location <br />
-------------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_replay_location <br />
------------------------------<br />
0/2000000<br />
(1 row)<br />
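The arithmetic behind this comparison can be sketched in Python. An LSN of the form ''X/Y'' is a pair of hexadecimal numbers; converting both positions to absolute byte offsets and subtracting gives the lag in bytes. (A sketch only: the multiplier below assumes each xlogid spans a full 2^32 bytes, which holds for 9.3 and later; earlier releases skipped the last 16MB segment of each xlogid.)<br />

```python
def lsn_to_bytes(lsn, bytes_per_xlogid=1 << 32):
    """Convert an LSN string like '0/2000000' to an absolute byte position."""
    hi, lo = lsn.split('/')
    return int(hi, 16) * bytes_per_xlogid + int(lo, 16)

def replication_lag(primary_lsn, standby_lsn):
    """Lag in bytes between the primary's write location and the standby's."""
    return lsn_to_bytes(primary_lsn) - lsn_to_bytes(standby_lsn)

# One 16MB WAL segment of lag:
print(replication_lag('0/3000000', '0/2000000'))  # 16777216
```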
* '''13.''' You can also check the progress of streaming replication by using the ''ps'' command.<br />
# The displayed LSNs indicate the byte position that the standby server has<br />
# written up to in the xlogs.<br />
[primary] $ ps -ef | grep sender<br />
postgres 6879 6831 0 10:31 ? 00:00:00 postgres: wal sender process postgres 127.0.0.1(44663) streaming 0/2000000<br />
<br />
[standby] $ ps -ef | grep receiver<br />
postgres 6878 6872 1 10:31 ? 00:00:01 postgres: wal receiver process streaming 0/2000000<br />
* How to do failover<br />
** Create the trigger file in the standby after the primary fails.<br />
* How to stop the primary or the standby server<br />
** Shut it down as usual (''pg_ctl stop'').<br />
* How to restart streaming replication after failover<br />
** Repeat the operations from step '''6''': take a fresh base backup, apply the configuration changes, and start the original primary as the new standby. The new primary doesn't need to be stopped during these operations.<br />
* How to restart streaming replication after the standby fails<br />
** Restart postgres in the standby server after eliminating the cause of failure.<br />
* How to disconnect the standby from the primary<br />
** Create the trigger file in the standby while the primary is running. The standby will then be promoted to a normal, stand-alone server.<br />
* How to re-synchronize the stand-alone standby after isolation<br />
** Shut down the standby as usual, then repeat the operations from step '''6'''.<br />
* If you have more than one slave, promoting one will break the other(s). Update their recovery.conf settings to point to the new master, set recovery_target_timeline to 'latest', scp/rsync the pg_xlog directory, and restart each slave.<br />
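For example, a remaining standby could be repointed at the promoted server (here assumed to be 192.168.0.20, the standby host from the examples above) with a recovery.conf along these lines:<br />

```
standby_mode = 'on'
# Point at the newly promoted master instead of the failed one.
primary_conninfo = 'host=192.168.0.20 port=5432 user=replication password=password'
# Follow the timeline switch performed at promotion.
recovery_target_timeline = 'latest'
```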
<br />
= Todo =<br />
== v9.0 ==<br />
<br />
Moved to [[PostgreSQL_9.0_Open_Items]]<br />
<br />
=== Committed ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01455.php Retrying from archive and some refactoring around Read/FetchRecord().] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00395.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg02601.php SR wrongly treats the WAL-boundary.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00396.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01715.php Adjust SR for some later changes about wal-skipping.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00399.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00024.php VACUUM FULL unexpectedly writes an XLOG UNLOGGED record.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00038.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01754.php Add a message type header.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00037.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php Documentation: Add a new "Replication" chapter.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00115.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00350.php Failed assertion during recovery of partial WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00124.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00712.php A PANIC error might occur in the standby because of a partially-filled archived WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00137.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00330.php Improve the standby messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00140.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php pq_getbyte_if_available() is not working because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00198.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01488.php Walsender might emit unfit messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00239.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01236.php Streaming replication on win32, still broken.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00270.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00992.php Create new section for recovery.conf.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00295.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01824.php Assertion failure in walreceiver.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00356.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01717.php Forbid a startup of walsender during recovery, and emit a suitable message? Or allow walsender to be started also during recovery?] - [http://archives.postgresql.org/message-id/20100316090955.9A5107541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php How do we clean down the archive without using pg_standby?] - [http://archives.postgresql.org/message-id/20100318091718.BC14D7541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01510.php File-based log shipping without pg_standby doesn't replay the WAL files in pg_xlog.] - [http://archives.postgresql.org/pgsql-committers/2010-03/msg00356.php commit]<br />
<br />
== v9.1 ==<br />
=== Synchronization capability ===<br />
* Introduce the replication mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'' and ''fsync''.<br />
** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
** ''recv'' or ''fsync'' makes transaction commit wait for XLOG to be received or fsynced by the standby, respectively.<br />
** (''apply'' makes transaction commit wait for XLOG to be replayed by the standby. This mode will be supported in v9.2 or later)<br />
** The replication mode is specified in recovery.conf of the standby as well as other parameters for replication.<br />
*** The startup process reads the replication mode from recovery.conf and shares it with walreceiver via a new shared-memory variable.<br />
*** Walreceiver in turn shares it with walsender by using the replication handshake message (the existing protocol needs to be extended).<br />
** Based on the replication mode, walreceiver sends the primary a reply indicating that replication has completed up to the specified location.<br />
*** In async, walreceiver doesn't need to send any reply other than end-of-replication message.<br />
*** In recv or fsync, walreceiver sends the reply just after receiving or flushing XLOG, respectively.<br />
*** New message type for the reply needs to be defined. The reply is sent as CopyData message.<br />
** Walreceiver writes all the outstanding XLOG to disk before shutting down.<br />
** Walsender receives the reply from the standby, updates the location of the last record replicated, and announces completion of replication.<br />
*** New shared-memory variable to keep that location is required.<br />
** When processing the commit command, backend waits for XLOG to be replicated to only the standbys which are in the recv or fsync replication mode.<br />
*** Also smart shutdown waits for XLOG of shutdown checkpoint to be replicated.<br />
* Required optimization<br />
** Walsender should send outstanding XLOG without waiting for wal_sender_delay.<br />
*** When processing the commit command, backend signals walsender to send outstanding XLOG immediately.<br />
** Backend should exit the wait loop as soon as the reply arrives at the primary.<br />
*** When receiving the reply, walsender signals backends to wake from sleep and determine whether to exit the wait loop by checking the location of the last XLOG replicated.<br />
*** Only backends waiting for XLOG to be replicated up to the location contained in the reply are sent the signal.<br />
** Walsender waits for the signal from backends and the reply from the standby at the same time, by using select/poll.<br />
** Walsender reads XLOG from not only disk but also shared memory (wal buffers).<br />
** Walreceiver should flush XLOG file only when XLOG file is switched or the related page is flushed.<br />
*** When startup process or bgwriter flushes the buffer page, it checks whether the related XLOG has already been flushed via shared memory (location of the last XLOG flushed).<br />
*** It flushes the buffer page, if XLOG file has already been flushed.<br />
*** It signals walreceiver to flush XLOG file immediately and waits for the flush to complete, if XLOG file has not been flushed yet.<br />
** While the standby is catching up with the primary, those servers should ignore the replication mode and perform asynchronous replication.<br />
*** After those servers have almost gotten into synchronization, they perform replication based on the specified replication mode.<br />
*** New replication states like 'catching-up', 'sync', etc need to be defined, and the state machine for them is required on both servers.<br />
*** Current replication state can be monitored on both servers via SQL.<br />
* Required timeout<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG is replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
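The commit-wait protocol described above can be sketched with an ordinary condition variable. This is a simplified single-process model with hypothetical names, not the actual walsender code; the real design uses a shared-memory variable and signals between processes.<br />

```python
import threading

class WalSenderState:
    """Toy model of the shared state between walsender and committing backends."""

    def __init__(self):
        self.cond = threading.Condition()
        self.last_replicated_lsn = 0  # a shared-memory variable in the real design

    def report_reply(self, lsn):
        """Walsender received a reply from the standby: record progress, wake backends."""
        with self.cond:
            self.last_replicated_lsn = max(self.last_replicated_lsn, lsn)
            self.cond.notify_all()

    def wait_for_replication(self, commit_lsn, timeout=None):
        """A committing backend sleeps until the standby has confirmed commit_lsn.

        Returns False if the wait times out (the replication_timeout case)."""
        with self.cond:
            while self.last_replicated_lsn < commit_lsn:
                if not self.cond.wait(timeout):
                    return False
            return True
```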
<br />
== Future release ==<br />
* '''Synchronization capability'''<br />
** Introduce the synchronization mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'', ''fsync'' and ''apply''.<br />
*** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
*** ''recv'', ''fsync'' and ''apply'' make transaction commit wait for XLOG records to be received, fsynced and applied on the standby, respectively.<br />
** Change walsender to be able to read XLOG from not only the disk but also shared memory.<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG records are replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
* '''Monitoring'''<br />
** Provide the capability to check the progress and gap of streaming replication via one query. A collaboration of HS and SR is necessary to provide that capability on the standby side.<br />
** Provide the capability to check via a query whether the specified replication is in progress. More detailed status information might also be necessary, e.g., whether the standby is still catching up or has already gotten into sync.<br />
** Change the stats collector to collect the statistics information about replication, e.g., average delay of replication time.<br />
** Develop the tool to calculate the latest XLOG position from XLOG files. This is necessary to check the gap of replication after the server fails.<br />
** Also develop the tool to extract the user-readable contents from XLOG files. This is necessary to see the contents of the gap, and manually restore them.<br />
* '''Easy to Use'''<br />
** Introduce the parameters like:<br />
*** replication_halt_timeout - replication will halt if no data has been sent for this much time.<br />
*** replication_halt_segments - replication will halt if number of WAL files in pg_xlog exceeds this threshold.<br />
*** These parameters allow us to avoid disk overflow.<br />
** Add a new feature which also transfers the base backup via a direct connection between the primary and the standby.<br />
** Add new hooks like walsender_hook and walreceiver_hook to cooperate with add-on compression programs such as pglesslog.<br />
** Provide a graceful termination of replication via a query on the primary. On the standby, a trigger file mechanism already provides that capability.<br />
** Support replication beyond timeline. The timeline history files need to be shipped from the primary to the standby.<br />
* '''Robustness'''<br />
** Support keepalive in libpq. This is useful for a client and the standby to detect a failure of the primary immediately.<br />
* '''Miscellaneous'''<br />
** Standalone walreceiver tool, which connects to the primary and continuously receives and writes XLOG records, independently of the postgres server.<br />
** Cascade streaming replication. Allow walsender to send XLOG to another standby during recovery.<br />
** WAL archiving during recovery.<br />
<br />
[[Category:Replication]]</div>Joy