BDR User Guide

From PostgreSQL wiki

Revision as of 05:17, 25 June 2014 by Ringerc (Talk | contribs)

This page is the user and administrator guide for BDR. If you're looking for technical details on the project plan and implementation, see BDR Project. For detailed parameter descriptions, see BDR Reference.


About BDR

BDR (Bi-Directional Replication) is a feature being progressively added to PostgreSQL core that provides greatly enhanced replication capabilities. It is available for immediate end-user deployment as a small patch on top of PostgreSQL 9.4 plus an extension module.

BDR allows users to create a geographically distributed asynchronous multi-master database using Logical Log Streaming Replication (LLSR) transport based on the changeset extraction feature introduced in PostgreSQL 9.4. It is designed to provide both high availability and geographically distributed disaster recovery capabilities.

BDR is not “clustering” as some vendors use the term, in that it doesn't have a distributed lock manager, global transaction coordinator, etc. Each member server is separate yet connected, with design choices that allow separation between nodes that would not be possible with global transaction coordination. Each node has a local copy of the data on all the other nodes, and queries run locally on individual nodes. Each node is internally consistent at all times; the group of servers as a whole is eventually consistent.

Some cross-database co-ordination features are provided in the form of distributed global sequences, synchronisation functions and conflict handlers.

Guidance on getting a testing setup established is in Initial setup. Please read the full documentation if you intend to put BDR into production.

More detail on the implementation of BDR, its limitations, and advantages/disadvantages can be found in the Logical Log Streaming Replication section.

BDR Quick Start

To set up BDR you'll need to:

  • Install a patched copy of PostgreSQL that can support BDR;
  • Run initdb to create a new data directory, or upgrade your current one to support BDR;
  • Configure one or more logical senders and receivers.

Usually BDR is used to connect multiple separate PostgreSQL server instances. This guide will illustrate a simpler configuration that replicates between two databases on the same server instance so you don't have to juggle running multiple servers. It'll also assume you want to start a new server instance rather than BDR-enable an existing one. Those topics and more are covered in the main user guide.

Installing the patched PostgreSQL binaries

BDR is distributed as code in the 2ndquadrant_bdr repository on git.postgresql.org. Source and binary packages are in progress.

PostgreSQL 9.3 and below do not support BDR, and 9.4 requires patches, so this guide will not work for you if you are trying to use a normal install of PostgreSQL. (It is expected that PostgreSQL 9.5 will support the BDR extension without additional patches).

BDR only supports Linux and Mac OS X. You cannot use BDR on Windows yet. There are no fundamental technical barriers to supporting Windows, but it has not been a priority of the project.

Compiling PostgreSQL with BDR

A script to download and compile BDR is provided for your convenience. For those who prefer to do it by hand, see installing BDR from source.

To run the script:

   curl "http://git.postgresql.org/gitweb/?p=2ndquadrant_bdr.git;a=blob_plain;f=contrib/bdr/scripts/bdr_quickstart.sh;hb=refs/heads/bdr-next" | bash

(As an aside, that script could have been almost anything. It's safer to download a script like that, read it, and then run the downloaded copy.)
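
The safer approach described above can be sketched as a small helper that saves the script to a file instead of piping it to bash. This is only an illustration: the function name fetch_for_review and the destination filename are made up for this example, and only standard curl options are used.

```shell
# Hypothetical helper (not part of BDR): download a script to a file
# so it can be read before it is run.
fetch_for_review() {
    local url=$1 dest=$2
    # -f: fail on HTTP errors, -sS: quiet but still report errors,
    # -L: follow redirects, -o: write to the named file
    curl -fsSL -o "$dest" "$url" || return 1
    echo "Saved to $dest. Read it (for example: less $dest), then run: bash $dest"
}

# Usage:
#   fetch_for_review "http://git.postgresql.org/gitweb/?p=2ndquadrant_bdr.git;a=blob_plain;f=contrib/bdr/scripts/bdr_quickstart.sh;hb=refs/heads/bdr-next" bdr_quickstart.sh
```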

When it finishes, the script will print:

---------------------------
BDR compiled and installed.

Sources at /home/myuser/2ndquadrant_bdr/bdr-src
Installed to /home/myuser/2ndquadrant_bdr/bdr

Now add it to your PATH:
    export PATH=/home/myuser/2ndquadrant_bdr/bdr/bin:$PATH
and carry on with the quickstart at https://wiki.postgresql.org/wiki/BDR_User_Guide
---------------------------


Adjusting your environment

To use the new binaries, add them to your PATH as the quickstart script suggested:

export PATH=$HOME/2ndquadrant_bdr/bdr/bin:$PATH

This only affects the terminal you ran it in and makes no permanent changes. For how to apply the change permanently see adjusting your environment.
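
If you later put the export into a shell startup file (which file applies depends on your shell), the line will run in every new shell. A minimal sketch of a guard that avoids adding the directory more than once follows; path_prepend is an illustrative name, not something BDR provides.

```shell
# Hypothetical helper: prepend a directory to PATH only if it is not already there.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;     # otherwise put it first
    esac
    export PATH
}

path_prepend "$HOME/2ndquadrant_bdr/bdr/bin"
```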

Now check that you're using the BDR binaries by running:

   psql --version

It should print a version string that mentions BDR, something like:

   psql (PostgreSQL) 9.4_bdr0601
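
If you want to script this check, a rough sketch is shown here. It assumes only that a BDR build includes "bdr" somewhere in its version string, as in the sample output; is_bdr_build is a made-up name.

```shell
# Hypothetical check: succeed only if a version string mentions BDR.
is_bdr_build() {
    case "$1" in
        *bdr*|*BDR*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage, assuming psql is on your PATH:
#   is_bdr_build "$(psql --version)" || echo "psql on PATH is not a BDR build" >&2
```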

Creating a BDR-enabled PostgreSQL instance

Since we're creating a new PostgreSQL instance for this example, run:

initdb -D $HOME/2ndquadrant_bdr/bdr-db -A trust -U postgres

The -A trust option tells PostgreSQL to turn off user authentication. This should never be used in production; it just keeps this quickstart simpler. Securely configuring BDR is covered in BDR Administration.

Start the server

Run:

export PGPORT=5599
pg_ctl -l $HOME/2ndquadrant_bdr/bdr-db.log -D $HOME/2ndquadrant_bdr/bdr-db -w start

The server will start up, printing:

waiting for server to start.... done
server started

If you instead get:

waiting for server to start........ stopped waiting
pg_ctl: could not start server
Examine the log output.

then take a look at $HOME/2ndquadrant_bdr/bdr-db.log to see what happened. Most likely you already have a server running on port 5599 or you're repeating a step and the BDR postgres server is already running on that port.
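
To tell those two cases apart, you can ask pg_ctl whether a server is already running in the BDR data directory. The sketch below wraps pg_ctl's status mode, which exits with a non-zero status when no server is running there. The helper name server_state is made up for this example.

```shell
# Hypothetical helper: report whether a server is running in a data directory.
server_state() {
    local datadir=$1
    if pg_ctl -D "$datadir" status >/dev/null 2>&1; then
        echo "running"
    else
        echo "not running"
    fi
}

# Usage:
#   server_state "$HOME/2ndquadrant_bdr/bdr-db"
```

If this reports "not running" and the start still fails, some other process is probably using port 5599.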

Create the databases

We need two (or more) databases to test BDR with, since we're going to be running it between two databases within one PostgreSQL install. So run:

createdb -U postgres bdr1
createdb -U postgres bdr2

to create the two databases.

It is important that you leave bdr2 empty, but if you like you can now make a few tables within bdr1, add some rows, etc.
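
As an illustration of adding some test data to bdr1, here is a minimal sketch. The table name quickstart_demo and its columns are invented for this example, and the primary key is included as a cautious default rather than a documented requirement of this guide.

```shell
# Hypothetical helper: create a small sample table in bdr1 and add two rows.
# The table name and columns are made up for this example.
seed_bdr1() {
    psql -U postgres -v ON_ERROR_STOP=1 bdr1 <<'SQL'
CREATE TABLE quickstart_demo (
    id   integer PRIMARY KEY,
    note text
);
INSERT INTO quickstart_demo (id, note) VALUES (1, 'first row'), (2, 'second row');
SQL
}

# Usage (PGPORT=5599 must still be set in this terminal):
#   seed_bdr1
```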

Enable the BDR extension

You now have a running PostgreSQL server. It behaves like any ordinary PostgreSQL server at this point, but it's time to change that.

Add the following lines to the end of $HOME/2ndquadrant_bdr/bdr-db/postgresql.conf:

# Generic settings required for BDR
#----------------------------------

# Allow two other peer nodes, plus one for init_replica
max_replication_slots = 3

# Two peer nodes, plus two slots for pg_basebackup
max_wal_senders = 4 

# Record data for logical replication
wal_level = 'logical'
track_commit_timestamp = on

# Load BDR
shared_preload_libraries = 'bdr'

# BDR connection configuration
#-----------------------------

bdr.connections = 'nodeA, nodeB'

bdr.nodeA_dsn = 'dbname=bdr2 user=postgres port=5599'
bdr.nodeA_local_dbname = 'bdr1'

bdr.nodeB_dsn = 'dbname=bdr1 user=postgres  port=5599'
bdr.nodeB_local_dbname = 'bdr2'
bdr.nodeB_init_replica = on
bdr.nodeB_replica_local_dsn = 'dbname=bdr2 user=postgres port=5599'

The first part covers the generic settings required to use a two-node BDR configuration. They're discussed in more detail in the BDR Parameter Reference.

The second part specifies a two-node BDR configuration where the node named nodeA runs from database bdr1 and connects to database bdr2. The node named nodeB runs from the other database, bdr2, and connects back to database bdr1. Setting bdr.nodeB_init_replica = on means that when BDR first starts, nodeB's database (bdr2) is automatically populated with a copy of nodeA's database (bdr1), which is why bdr2 had to be left empty.

This probably seems pretty confusing. That's because we're using two databases on the same PostgreSQL instance to save you the hassle of running multiple servers, which would be even more confusing. In a normal BDR deployment you would almost always run separate PostgreSQL servers, each on its own computer, where every BDR node has the same database name on a different PostgreSQL server. In that case the local_dbname parameters are no longer required.

If you don't fully understand this section, don't worry; there are more examples in the admin guide.
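
Before restarting, it can help to confirm that the settings listed above are actually present and uncommented in postgresql.conf. The following is a rough sketch only: check_bdr_conf is a made-up helper, and it checks that each setting is assigned, not that its value is correct.

```shell
# Hypothetical pre-restart check: confirm the settings from this guide are
# assigned (not commented out) in a postgresql.conf file.
check_bdr_conf() {
    local conf=$1 missing=0 name
    for name in max_replication_slots max_wal_senders wal_level \
                track_commit_timestamp shared_preload_libraries bdr.connections
    do
        # An active setting starts the line (optionally indented) and is followed by "=".
        if ! grep -Eq "^[[:space:]]*${name}[[:space:]]*=" "$conf"; then
            echo "not set: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_bdr_conf "$HOME/2ndquadrant_bdr/bdr-db/postgresql.conf" && echo "settings present"
```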

Add a pg_hba.conf entry to allow replication

PostgreSQL requires that you explicitly enable replication in the host-based access control file pg_hba.conf. So edit $HOME/2ndquadrant_bdr/bdr-db/pg_hba.conf and add the following lines (or uncomment the ones already there):

local replication postgres              trust
host  replication postgres 127.0.0.1/32 trust
host  replication postgres ::1/128      trust

To learn more about this, see the docs on pg_hba.conf [1].
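
To confirm which replication entries are active after editing, you could list the uncommented ones. This is a small sketch; the helper name active_replication_rules is made up for this example.

```shell
# Hypothetical helper: print the active (uncommented) replication entries
# in a pg_hba.conf file.
active_replication_rules() {
    grep -E '^[[:space:]]*(local|host)[[:space:]]+replication[[:space:]]' "$1"
}

# Usage:
#   active_replication_rules "$HOME/2ndquadrant_bdr/bdr-db/pg_hba.conf"
```

With the three lines above in place, this should print those three entries. Lines that are still commented out with # are not shown.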

Restart the server

Now that you've created the databases to use and configured BDR, it's time to restart the server so BDR will get turned on. Use the same pg_ctl command, but with restart instead of start:

pg_ctl -l $HOME/2ndquadrant_bdr/bdr-db.log -D $HOME/2ndquadrant_bdr/bdr-db -w restart

(Optional) Check the logs

Take a look at $HOME/2ndquadrant_bdr/bdr-db.log:

less $HOME/2ndquadrant_bdr/bdr-db.log

You should see a few lines like:

Dumping remote database "dbname=bdr1 user=postgres  port=5599" with 1 concurrent workers to "/tmp/postgres-bdr-0000076D-1.11322"
Restoring dump to local DB "dbname=bdr2 user=postgres port=5599" with 1 concurrent workers from "/tmp/postgres-bdr-0000076D-1.11322"

indicating that bdr1 has been copied to bdr2, then

LOG:  registering background worker "bdr (6028730235379497978,1,16385,): nodeb: apply"
LOG:  starting background worker process "bdr (6028730235379497978,1,16385,): nodeb: apply"
DETAIL:  streaming transactions committing after 0/19D1588, reading WAL from 0/19D01D0

...

LOG:  starting background worker process "bdr (6028730235379497978,1,16384,): nodea: apply"
LOG:  starting logical decoding for slot bdr_16385_6028730235379497978_1_16384__
DETAIL:  streaming transactions committing after 0/19D01D0, reading WAL from 0/19D0198

They won't be exactly the same, but this indicates normal startup.
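
Rather than reading the whole log, you can count the apply worker start-up lines. The sketch below assumes the log wording matches the sample lines shown above, which may differ between builds; count_apply_workers is a made-up name.

```shell
# Hypothetical helper: count log lines that record a BDR apply worker starting.
# The pattern is taken from the sample log lines in this guide.
count_apply_workers() {
    grep -c 'starting background worker process "bdr .*: apply"' "$1"
}

# Usage:
#   count_apply_workers "$HOME/2ndquadrant_bdr/bdr-db.log"
```

For the two-node configuration in this guide you would expect at least one such line for each of nodea and nodeb.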

At the time of writing, it's normal to see a CONFLICT message, a FATAL: Role "myusername" does not exist message, and a few others. These are cosmetic and will be removed.
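
Once startup looks normal, a simple way to see replication working is to write a row in one database and wait for it to appear in the other. Because BDR is asynchronous, the row may take a moment to arrive, so this sketch polls rather than checking once. The helper wait_for_value and the table quickstart_demo are made up for this example; it assumes such a table already exists in both databases.

```shell
# Hypothetical helper: poll a database until a query returns the expected value,
# giving up after a number of one-second attempts (default 10).
wait_for_value() {
    local db=$1 query=$2 expected=$3 tries=${4:-10} actual
    while [ "$tries" -gt 0 ]; do
        # -A: unaligned output, -t: rows only, so a single value prints bare
        actual=$(psql -U postgres -At -c "$query" "$db" 2>/dev/null)
        [ "$actual" = "$expected" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage, assuming a table named quickstart_demo (made up here) exists in both databases:
#   psql -U postgres -c "INSERT INTO quickstart_demo (id, note) VALUES (3, 'from bdr1')" bdr1
#   wait_for_value bdr2 "SELECT count(*) FROM quickstart_demo WHERE id = 3" 1 && echo "replicated"
```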
