BDR User Guide
From PostgreSQL wiki
This page is the users and administrators guide for BDR. If you're looking for technical details on the project plan and implementation, see BDR Project. For detailed parameters, etc, see BDR Reference.
BDR (Bi-Directional Replication) is a feature being progressively added to PostgreSQL core that provides greatly enhanced replication capabilities. It is available for immediate end-user deployment as a small patch on top of PostgreSQL 9.4 plus an extension module.
BDR allows users to create a geographically distributed asynchronous multi-master database using Logical Log Streaming Replication (LLSR) transport based on the changeset extraction feature introduced in PostgreSQL 9.4. It is designed to provide both high availability and geographically distributed disaster recovery capabilities.
BDR is not “clustering” as some vendors use the term, in that it doesn't have a distributed lock manager, global transaction co-ordinator, etc. Each member server is separate yet connected, with design choices that allow separation between nodes that would not be possible with global transaction coordination. Each node has a local copy of the data on all the other nodes and queries run locally on individual nodes. Each node is internally consistent at all times; the group of servers as a whole is eventually-consistent.
Guidance on getting a testing setup established is in Initial setup. Please read the full documentation if you intend to put BDR into production.
More detail on the implementation of BDR, its limitations, and advantages/disadvantages can be found in the Logical Log Streaming Replication section.
BDR Quick Start
To set up BDR you'll need to:
- Install a patched copy of PostgreSQL that can support BDR;
- initdb a new data directory or upgrade your current one to support BDR;
- Configure one or more logical senders and receivers.
Usually BDR is used to connect multiple separate PostgreSQL server instances. This guide will illustrate a simpler configuration that replicates between two databases on the same server instance so you don't have to juggle running multiple servers. It'll also assume you want to start a new server instance rather than BDR-enable an existing one. Those topics and more are covered in the main user guide.
Installing the patched PostgreSQL binaries
BDR is distributed as code in the 2ndquadrant_bdr repository on git.postgresql.org. Source and binary packages are in progress.
PostgreSQL 9.3 and below do not support BDR, and 9.4 requires patches, so this guide will not work for you if you are trying to use a normal install of PostgreSQL. (It is expected that PostgreSQL 9.5 will support the BDR extension without additional patches.)
BDR only supports Linux and Mac OS X. You cannot use BDR on Windows yet. There are no fundamental technical barriers to supporting Windows, but it has not been a priority of the project.
Compiling PostgreSQL with BDR
A script to download and compile BDR is provided for your convenience. For those who prefer to do it by hand, see installing BDR from source.
To run the script:
curl "http://git.postgresql.org/gitweb/?p=2ndquadrant_bdr.git;a=blob_plain;f=contrib/bdr/scripts/bdr_quickstart.sh;hb=refs/heads/bdr-next" | bash
(As an aside: that script could have contained almost anything. It is safer to download a script like this, read it, and then run the downloaded copy.)
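One cautious way to follow that advice is to save the script to a file, look it over, and only then run it. The sketch below reuses the URL from the command above. The helper name `fetch_quickstart` and the local file name `bdr_quickstart.sh` are assumptions made for illustration, not part of BDR.

```shell
# URL of the quickstart script, copied from the command above.
BDR_QUICKSTART_URL="http://git.postgresql.org/gitweb/?p=2ndquadrant_bdr.git;a=blob_plain;f=contrib/bdr/scripts/bdr_quickstart.sh;hb=refs/heads/bdr-next"

# Hypothetical helper: download the script to a local file instead of piping it to bash.
# -f makes curl fail on HTTP errors; -sS hides the progress meter but still shows errors.
fetch_quickstart() {
    curl -fsS -o "${1:-bdr_quickstart.sh}" "$BDR_QUICKSTART_URL"
}

# Typical use (run by hand, one step at a time):
#   fetch_quickstart bdr_quickstart.sh
#   less bdr_quickstart.sh        # read it before trusting it
#   bash bdr_quickstart.sh
```

Nothing is downloaded until `fetch_quickstart` is called, so the review step cannot be skipped by accident.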
When it finishes, the script will print:
---------------------------
BDR compiled and installed.
Sources at /home/myuser/2ndquadrant_bdr/bdr-src
Installed to /home/myuser/2ndquadrant_bdr/bdr
Now add it to your PATH:
export PATH=/home/myuser/2ndquadrant_bdr/bdr/bin:$PATH
and carry on with the quickstart at https://wiki.postgresql.org/wiki/BDR_User_Guide
---------------------------
Adjusting your environment
To actually use these new binaries you will need to do as the quickstart script suggested and add them to your PATH:
export PATH=$HOME/2ndquadrant_bdr/bdr/bin:$PATH
This only affects the terminal you ran it in and makes no permanent changes. For how to apply the change permanently see adjusting your environment.
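If you re-run the PATH step in the same terminal, the directory gets added again each time. A minimal sketch of an idempotent version follows. It assumes the install location printed by the quickstart script (`$HOME/2ndquadrant_bdr/bdr/bin`), and the helper name `prepend_path` is hypothetical.

```shell
# Directory the quickstart script installs into (taken from its output; adjust if yours differs).
BDR_BIN="$HOME/2ndquadrant_bdr/bdr/bin"

# Hypothetical helper: put a directory at the front of PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present somewhere in PATH, leave it alone
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

prepend_path "$BDR_BIN"
```

Note that if the directory is already listed later in PATH, this sketch leaves it there, so another PostgreSQL install could still be found first.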
Now check that you're using the BDR binaries by running:
psql --version
It should print something like:
psql (PostgreSQL) 9.4_bdr0601
that mentions BDR.
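The same check can be scripted. The sketch below assumes, based on the sample output above, that a BDR build includes "bdr" in the string printed by `psql --version`. The helper name `is_bdr_build` is hypothetical.

```shell
# Hypothetical helper: succeed if a version string mentions BDR (case-insensitive).
is_bdr_build() {
    printf '%s\n' "$1" | grep -qi 'bdr'
}

# Check whichever psql comes first on PATH.
if command -v psql >/dev/null 2>&1 && is_bdr_build "$(psql --version)"; then
    echo "psql on PATH is a BDR build"
else
    echo "psql on PATH is missing or is not a BDR build; check your PATH" >&2
fi
```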
Creating a BDR-enabled PostgreSQL instance
Since we're creating a new PostgreSQL instance for this example, run:
initdb -D $HOME/2ndquadrant_bdr/bdr-db -A trust -U postgres
The -A trust option tells PostgreSQL to turn off user authentication. This should never be used in production; it just keeps this quickstart simpler. Securely configuring BDR is covered in BDR Administration.
Start the server
export PGPORT=5599
pg_ctl -l $HOME/2ndquadrant_bdr/bdr-db.log -D $HOME/2ndquadrant_bdr/bdr-db -w start
The server will start up, printing:
waiting for server to start.... done
server started
If you instead get:
waiting for server to start........ stopped waiting
pg_ctl: could not start server
Examine the log output.
then take a look at $HOME/2ndquadrant_bdr/bdr-db.log to see what happened. Most likely you already have a server running on port 5599 or you're repeating a step and the BDR postgres server is already running on that port.
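A minimal sketch of that diagnosis follows, using the data directory and log paths from this quickstart. The helper name `show_start_failure` is hypothetical. `pg_ctl status` is a standard pg_ctl mode that reports whether a server is running from a given data directory.

```shell
# Paths used throughout this quickstart.
BDR_DATA="$HOME/2ndquadrant_bdr/bdr-db"
BDR_LOG="$HOME/2ndquadrant_bdr/bdr-db.log"

# Hypothetical helper: show the tail of the server log and, if pg_ctl is available,
# report whether a server is already running from the data directory.
# Arguments: data directory, log file, optional number of log lines (default 20).
show_start_failure() {
    tail -n "${3:-20}" "$2"
    if command -v pg_ctl >/dev/null 2>&1; then
        pg_ctl -D "$1" status || true   # a non-zero exit here is informational only
    fi
}

# Typical use after a failed start:
#   show_start_failure "$BDR_DATA" "$BDR_LOG"
```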
Create the databases
We need two (or more) databases to test BDR with, since we're going to be running it between two databases within one PostgreSQL install. So run:
createdb -U postgres bdr1
createdb -U postgres bdr2
to create the two databases.
It is important that you leave bdr2 empty, but if you like you can now make a few tables within bdr1, add some rows, etc.
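If you want some data in bdr1 to watch being copied, something like the following sketch would do. The table and column names are purely illustrative, and the helper name `sample_sql` is hypothetical. Nothing here touches bdr2.

```shell
# Hypothetical helper: print some sample SQL. Table and column names are illustrative only.
sample_sql() {
    cat <<'EOF'
CREATE TABLE demo_items (
    id   integer PRIMARY KEY,
    name text NOT NULL
);
INSERT INTO demo_items (id, name) VALUES (1, 'first'), (2, 'second');
EOF
}

# Typical use, against the server started above on port 5599:
#   sample_sql | psql -U postgres -p 5599 -d bdr1
#   psql -U postgres -p 5599 -d bdr1 -c 'SELECT * FROM demo_items'
```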
Enable the BDR extension
You now have a running PostgreSQL server. It behaves like any ordinary PostgreSQL server at this point, but it's time to change that.
Add the following lines to the end of $HOME/2ndquadrant_bdr/bdr-db/postgresql.conf:
# Generic settings required for BDR
#----------------------------------
# Allow two other peer nodes, plus one for init_replica
max_replication_slots = 3
# Two peer nodes, plus two slots for pg_basebackup
max_wal_senders = 4
# Record data for logical replication
wal_level = 'logical'
track_commit_timestamp = on
# Load BDR
shared_preload_libraries = 'bdr'

# BDR connection configuration
#-----------------------------
bdr.connections = 'nodeA, nodeB'

bdr.nodeA_dsn = 'dbname=bdr2 user=postgres port=5599'
bdr.nodeA_local_dbname = 'bdr1'

bdr.nodeB_dsn = 'dbname=bdr1 user=postgres port=5599'
bdr.nodeB_local_dbname = 'bdr2'
bdr.nodeB_init_replica = on
bdr.nodeB_replica_local_dsn = 'dbname=bdr2 user=postgres port=5599'
The first part covers the generic settings required to use a two-node BDR configuration. They're discussed in more detail in the BDR Parameter Reference.
The second part specifies a two-node BDR configuration where the node named nodeA runs from database bdr1 and connects to database bdr2. The node named nodeB runs from the other database bdr2 and connects back to database bdr1. init_replica means that when BDR first starts, nodeB's contents are automatically copied from nodeA's database.
This probably seems pretty confusing. That's because we're using two databases on the same PostgreSQL instance to save you the hassle of running multiple servers, which would be even more confusing. In a normal BDR deployment you would almost always run separate PostgreSQL servers, each on its own computer, where every BDR node has the same database name on a different PostgreSQL server. In that case the local_dbname parameters are no longer required.
If you don't fully understand this section, don't worry; there are more examples in the admin guide.
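Before restarting, it can help to confirm that the settings actually landed in postgresql.conf as uncommented lines. The sketch below is only a rough text check. The helper names `conf_has` and `check_bdr_conf` are hypothetical, and the check does not validate values the way the server would.

```shell
# Hypothetical helper: succeed if the file has an uncommented line for a setting
# whose value (before any trailing comment) matches the given pattern.
# Arguments: file, setting name (as a regex), value pattern.
conf_has() {
    grep -Eq "^[[:space:]]*$2[[:space:]]*=[[:space:]]*'?[^#]*$3" "$1"
}

# Check the generic settings listed above, plus the presence of bdr.connections.
check_bdr_conf() {
    conf_has "$1" wal_level logical &&
    conf_has "$1" shared_preload_libraries bdr &&
    conf_has "$1" track_commit_timestamp on &&
    conf_has "$1" 'bdr\.connections' '[[:alnum:]]'
}

# Typical use:
#   check_bdr_conf "$HOME/2ndquadrant_bdr/bdr-db/postgresql.conf" && echo "settings found"
```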
Add a pg_hba.conf entry to allow replication
PostgreSQL requires that you explicitly enable replication in the host-based access control file pg_hba.conf. So edit $HOME/2ndquadrant_bdr/bdr-db/pg_hba.conf and add the following lines (or uncomment the ones already there):
local replication postgres trust
host replication postgres 127.0.0.1/32 trust
host replication postgres ::1/128 trust
To learn more about this, see the docs on pg_hba.conf.
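If you script this step, appending blindly will duplicate the entries every time the script runs. A minimal sketch of an append-if-missing helper follows. The name `ensure_line` is hypothetical, and the match is on the exact line text, so an existing entry with different spacing would not be detected.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already there.
# Arguments: file, line.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Typical use for the replication entries in this section:
#   HBA="$HOME/2ndquadrant_bdr/bdr-db/pg_hba.conf"
#   ensure_line "$HBA" "local replication postgres trust"
#   ensure_line "$HBA" "host replication postgres 127.0.0.1/32 trust"
#   ensure_line "$HBA" "host replication postgres ::1/128 trust"
```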
Restart the server
Now that you've created the databases to use and configured BDR, it's time to restart the server so BDR will get turned on. Use the same pg_ctl command, but with restart instead of start:
pg_ctl -l $HOME/2ndquadrant_bdr/bdr-db.log -D $HOME/2ndquadrant_bdr/bdr-db -w restart
(Optional) Check the logs
Take a look at $HOME/2ndquadrant_bdr/bdr-db.log:
You should see a few lines like:
Dumping remote database "dbname=bdr1 user=postgres port=5599" with 1 concurrent workers to "/tmp/postgres-bdr-0000076D-1.11322"
Restoring dump to local DB "dbname=bdr2 user=postgres port=5599" with 1 concurrent workers from "/tmp/postgres-bdr-0000076D-1.11322"
indicating that bdr1 has been copied to bdr2, then
LOG: registering background worker "bdr (6028730235379497978,1,16385,): nodeb: apply"
LOG: starting background worker process "bdr (6028730235379497978,1,16385,): nodeb: apply"
DETAIL: streaming transactions committing after 0/19D1588, reading WAL from 0/19D01D0
LOG: starting background worker process "bdr (6028730235379497978,1,16384,): nodea: apply"
LOG: starting logical decoding for slot bdr_16385_6028730235379497978_1_16384__
DETAIL: streaming transactions committing after 0/19D01D0, reading WAL from 0/19D0198
They won't be exactly the same, but this indicates normal startup.
At the time of writing it is normal to see a CONFLICT message, a FATAL: Role "myusername" does not exist message, and a few others. These are cosmetic and will be removed.
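To pick the startup lines out of a noisy log, a simple grep is usually enough. The sketch below counts lines that report a BDR apply worker starting. Its pattern is taken from the sample log lines above and may not match other BDR versions, and the helper name `count_apply_workers` is hypothetical.

```shell
# Hypothetical helper: count log lines that report a BDR apply worker starting.
# The pattern is based on the sample log lines shown in this guide.
count_apply_workers() {
    grep -c 'starting background worker process "bdr .*: apply"' "$1"
}

# Typical use:
#   count_apply_workers "$HOME/2ndquadrant_bdr/bdr-db.log"
```

With the two-node configuration in this guide, one such line per node would be expected after the first restart. Later restarts append further lines to the same log, so the count grows over time.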