From PostgreSQL wiki
This document is obsolete. See the official BDR documentation for current information.
BDR administration tasks
Tasks a BDR admin might need to do.
This section of the BDR documentation is in ongoing development and has incomplete portions. Please feel free to contribute.
BDR is distributed as two components: An extension and a patched version of PostgreSQL 9.4 that's required for the extension to work. You cannot run BDR on unmodified PostgreSQL because BDR requires features like commit timestamps that do not exist in stock PostgreSQL 9.4. (UDR, a feature-reduced version of BDR, can be used on stock 9.4).
To install BDR from source you must first install the patched PostgreSQL, then install the extension into the patched PostgreSQL. It's easier than it sounds.
Clone the source for the patched PostgreSQL from git, then compile and install PostgreSQL like normal.
Assuming that you want to install it in your home directory for testing purposes, you would install the development tools and PostgreSQL dependencies for your operating system/distro, then run:
- git clone -b bdr-pg/REL9_4_STABLE git://git.postgresql.org/git/2ndquadrant_bdr.git bdr-pg
- pushd bdr-pg
- ./configure --prefix=$HOME/bdr --with-openssl
- make
- make install
- make -C contrib install
Now clone and install the BDR extension. This is maintained in another branch of the same repository:
- git clone -b bdr-plugin/stable git://git.postgresql.org/git/2ndquadrant_bdr.git bdr-extension
- pushd bdr-extension
- PATH=$HOME/bdr/bin:$PATH ./configure --enable-bdr=yes
- make install
Adjusting your environment
Once you have BDR installed from packages or source you will want to make sure that the BDR versions of PostgreSQL's binaries like pg_ctl, postgres and initdb are on your PATH.
You can do this per-command by writing (eg):
PATH=/path/to/bdr/bin:$PATH pg_ctl ...
but it's generally more convenient to add the path to your .bash_profile. BDR's psql is unchanged from that in regular 9.4, as is the network protocol, so there won't be compatibility issues.
Just edit $HOME/.bash_profile and append:
export PATH=/path/to/bdr/bin:$PATH
Replace /path/to/bdr with the path you specified as --prefix to configure when you compiled BDR.
Creating the first node
You can make your first BDR node by creating a new data directory (a new PostgreSQL "cluster") or upgrading an existing 9.4 install for use with BDR.
New data directory
On RPM distros like RHEL/CentOS/Fedora you will need to run service postgresql-9.4bdr initdb to actually create the data directory. Your data directory will be /var/lib/pgsql/9.4bdr/data. You can skip to the next step after running that command.
On Ubuntu/Debian your data directory will be /var/lib/postgresql/9.4bdr/main/data. It has already been created for you, so you can skip to the next step.
If you installed from source you need to initialize a new PostgreSQL data directory yourself. Run:
initdb -D $HOME/bdr-data -U postgres --auth-host=md5 --auth-local=peer
which will make the data directory.
Upgrading a 9.4 data directory
BDR will come with scripts to upgrade a 9.4 data directory to 9.4bdr or convert a 9.4bdr data directory back to plain unpatched 9.4.
Adding a node
Bringing up a new node involves:
- Doing preliminary setup to ensure the new node will be able to connect to existing nodes
- Configuring BDR on the new node and initializing the new node with data
- Adding connections from all existing nodes to the new node
- Restarting all existing nodes so they replicate from the new node
Ensuring the new node will be able to connect
The new node needs to be able to talk to all the existing nodes. This may require changes to their configuration.
Add pg_hba.conf entries for the new node
The new node will have to be able to connect to all the other nodes. So before you add the new node, make sure that appropriate pg_hba.conf entries exist on all current nodes to permit the new node to connect.
A hostssl entry for the bdr target database(s) with an unrestricted 0.0.0.0/0 IP/CIDR filter can be a good idea when combined with a strong md5 password and/or client certificate authentication, so you don't have to change each node's pg_hba.conf for every new node addition or IP change.
Remember that you need an entry that matches the database, and another with the pseudo-database replication.
For example, if you're using the user postgres as the BDR user and you have SSL enabled:
hostssl all         postgres 0.0.0.0/0 md5
hostssl replication postgres 0.0.0.0/0 md5
(In production, consider using a dedicated user that isn't a superuser for replication; see the security concerns section).
While it is possible to use a non-SSL connection, it's generally unwise unless you're on a completely trusted network. Use SSL.
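The entries above can be appended with a short script; a minimal sketch, where the PGDATA path and the postgres user name are assumptions you should adjust for your own install:

```shell
#!/bin/sh
# Append BDR entries to pg_hba.conf. The PGDATA default below is a
# demo path; point it at your real data directory.
PGDATA=${PGDATA:-/tmp/bdr-demo-data}
mkdir -p "$PGDATA"
cat >> "$PGDATA/pg_hba.conf" <<'EOF'
# BDR: any node may connect as postgres over SSL with an md5 password
hostssl all         postgres 0.0.0.0/0 md5
hostssl replication postgres 0.0.0.0/0 md5
EOF
# A running server must re-read the file afterwards, e.g.:
#   pg_ctl -D "$PGDATA" reload
```

Note that pg_hba.conf is re-read on reload (SIGHUP), so a full restart is not needed for this change on a live node.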
All nodes must have their listen_addresses parameter set so that the new node can connect, and the new node must have listen_addresses set so the other nodes can connect.
Generally you just set:
listen_addresses = '*'
on all nodes then use firewall rules, host-based and user-based authentication to limit access.
Add firewall rules
If there's a firewall in place on the existing nodes, make sure the new node can talk to them.
Set up .pgpass
If you're using md5 auth, you'll want a .pgpass file in the home directory of the user you're running BDR as. That's usually the postgres user.
The PostgreSQL server will read this file when making connections to other PostgreSQL servers using BDR.
Remember that .pgpass must be mode 0600.
It can be convenient to keep a single .pgpass that you just copy from node to node. Leave the host matching part as a wildcard so you don't have to update it for new nodes.
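A sketch of creating such a file, assuming the BDR database user is postgres; the password and the demo directory are placeholders (in practice the file lives in the server OS user's $HOME):

```shell
#!/bin/sh
# Demo location; in production this is $HOME/.pgpass of the OS user
# PostgreSQL runs as (usually postgres).
DEMO_HOME=${DEMO_HOME:-/tmp/bdr-demo-home}
mkdir -p "$DEMO_HOME"
# Format: hostname:port:database:username:password
# Wildcard host/port/database so one file works unchanged on every node.
cat > "$DEMO_HOME/.pgpass" <<'EOF'
*:*:*:postgres:change-this-password
EOF
chmod 0600 "$DEMO_HOME/.pgpass"   # libpq ignores the file otherwise
```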
Set postgresql.conf parameters
Enabling BDR on the new database requires certain parameters to be set in postgresql.conf:
- shared_preload_libraries = 'bdr'
- track_commit_timestamp = on
- wal_level = 'logical'
- max_replication_slots = 3 (or more than 0)
- max_wal_senders = 4 (or more than 0)
For more details on these options see the BDR Parameter Reference.
Collect DSNs (connstrings)
You now need to make a list of connection strings that'll let your new node connect to the existing nodes. For most setups you can just copy them from an existing node, then add a new one for the node you copied it from.
An example real-world connection setup from a recent deployment (with some details redacted) is
bdr.connections = 'awsbarman'
bdr.awsbarman_dsn = 'dbname=thedb host=the-target-hostname user=bdr port=5433 sslmode=require'
You should test these DSNs with psql as the user BDR will run as. In this case, if BDR was running under the local user postgres:
sudo -u postgres -i psql 'dbname=thedb host=the-target-hostname user=bdr port=5433 sslmode=require'
The -i to sudo is important; it ensures that $HOME etc is set correctly, so the right .pgpass is found.
Populate the node data
Once initial setup is done you need to populate the new node with data from an existing node. There are two ways to do this:
- By physical clone; or
- By logical copy
A physical clone (based on pg_basebackup) copies the whole PostgreSQL instance with all databases on it. It's useful if you're copying an install with multiple databases involved in BDR, or an install with nothing else on it. Because the clone is taken with pg_basebackup, all instance-wide state, including users and roles, is copied.
Logical clone (based on pg_dump and pg_restore) copies only a single database. It does not copy global state like users, etc. It doesn't have to copy table bloat, indexes, WAL records, etc, but has the additional overhead of applying a dump on the local host.
Both can be good choices, depending on your needs, much like you might use either pg_basebackup or pg_dump + pg_restore for backups.
Using a physical clone
The pg_basebackup utility can be used to create an initial copy of the instance from one of the existing nodes. Use pg_basebackup -X stream so you don't need to configure WAL archiving.
After the initial copy has been created, the postgresql.conf needs to be updated for current node (see Parameter Reference and Bi-Directional Replication Use Cases for more info). The important part is to set bdr.<connection_name>_init_replica to true for the connection(s) pointing to the cluster which was used as origin for pg_basebackup, and adjust all the bdr.bdr_connections entries to be appropriate for the new node.
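For instance, if the clone was taken from a (hypothetical) node named node1, the new node's postgresql.conf might contain something like the following; the host names and DSNs here are illustrative, not from a real deployment:

```
bdr.connections = 'node1'
bdr.node1_dsn = 'host=node1.example.com dbname=thedb user=postgres sslmode=require'
bdr.node1_init_replica = on
# Only needed if BDR can't detect the local connection info itself:
bdr.node1_replica_local_dsn = 'dbname=thedb'
```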
If you get a message like:
Failed to detect local connection info. Please specify replica_local_dsn in the postgresql.conf.
then you should retry after configuring the bdr.[connname]_replica_local_dsn parameter for the same connection you have init_replica = on for. Generally dbname=whatever_db_you_use_for_bdr is sufficient.
The bdr_init_copy utility, which is provided as part of BDR, must be used to start the node for the first time; it will bring the node to a consistent state with the rest of the nodes.
Using a logical copy
To make a logical copy you need to initdb a new PostgreSQL instance if you don't have one already running that you intend to use.
This is just a normal initdb like with any other PostgreSQL install.
You can skip this step if you have an existing local PostgreSQL instance and want to add a new database that will be cloned from an existing remote BDR node.
Copy global state (users/groups/roles etc)
Now dump and restore any global state you need from other instances. BDR logical copy does not clone users/groups/roles, foreign servers, or other global state from the origin node, so you must copy this yourself. The simplest approach is to run pg_dumpall --globals-only against the origin node and pipe the result to psql on the local node, with something like:
pg_dumpall --globals-only "host=origin.host.name user=postgres" | psql
Create a new empty database
Then you must CREATE DATABASE thedbname TEMPLATE template0 to make the target database for BDR. It is important to use the empty template, template0.
Configure the new database for BDR
In postgresql.conf, bdr.bdr_connections holds a comma-separated list of all the other nodes this node must communicate with. For each nodename in bdr.bdr_connections, a bdr.nodename_dsn entry is supplied. For one of the entries, set bdr.nodename_init_replica = on. This entry must also have bdr.nodename_replica_local_dsn = '...superuser localhost dsn ...', i.e. a DSN that gets a superuser connection to the local target DB.
An example from a real world deployment, with some redaction, is:
bdr.connections = 'thedb'
# Password is in the .pgpass file
bdr.thedb_dsn = 'host=the-target-server dbname=thedb user=bdr port=5433 sslmode=require'
bdr.thedb_init_replica = on
bdr.thedb_replica_local_dsn = 'dbname=thedb user=bdr port=5433'
Test the replica_local_dsn
Note that you should test the dsn given in replica_local_dsn and set any required pg_hba.conf entries. Test with psql running as the same user BDR runs as, e.g.:
psql 'dbname=thedb user=bdr port=5433'
Adding connections to the new node
Once the new node is running, you need to add connection entries to all other nodes to point to it.
Remember to both add them to bdr.bdr_connections on each node *and* add the new bdr.conname_dsn entry for it to each node.
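For example, if an existing node already talks to node1 and the new node is named node3 (both names hypothetical), that node's postgresql.conf would grow from one connection to two:

```
# Before:  bdr.connections = 'node1'
bdr.connections = 'node1, node3'
bdr.node3_dsn = 'host=node3.example.com dbname=thedb user=postgres sslmode=require'
```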
Removing a node
The process to remove a node is:
- Remove mention of the node from all configuration files
- Stop all nodes
- Start all nodes (note this is not the same thing as restarting all nodes)
- Delete the node record from bdr_nodes
- Drop all slots that point to the removed node
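The last two steps are performed in SQL on each remaining node; a sketch, where the node name and slot name are hypothetical and should be looked up first (the exact column names in bdr.bdr_nodes may differ between BDR versions):

```sql
-- Run as a superuser in the BDR database on each remaining node.
-- Inspect first:
SELECT * FROM bdr.bdr_nodes;
SELECT slot_name, active FROM pg_replication_slots;
-- Then remove the departed node's record and drop its slot:
DELETE FROM bdr.bdr_nodes WHERE node_name = 'removed_node';
SELECT pg_drop_replication_slot('name_of_slot_for_removed_node');
```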
Syncing all nodes
Finding a lagging node
If a node doesn't keep up, its replication slots on the other nodes fall behind, and pg_xlog on those nodes can fill. Routine monitoring of slot lag by the admin should catch this before it becomes a problem.
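One way to spot a lagging node on 9.4 is to check how much WAL each slot is holding back, using the standard catalog view and WAL-location functions:

```sql
-- Bytes of WAL retained for each replication slot; a large and growing
-- value means the downstream node is not keeping up.
SELECT slot_name,
       pg_xlog_location_diff(pg_current_xlog_insert_location(),
                             restart_lsn) AS retained_bytes
FROM pg_replication_slots
ORDER BY retained_bytes DESC;
```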
Dealing with full pg_xlog
If a slot doesn't progress, the node holding it must retain WAL for it, so pg_xlog can fill up. The admin must either get the lagging node replicating again or remove it (and its slot) from the cluster.
Backups
BDR backups are no different to regular PostgreSQL - you can use pg_dump, pg_basebackup, pg_start_backup/rsync/pg_stop_backup with archiving, a file system level atomic snapshot, PgBarman, or whatever suits you.
Designing apps to work best with BDR
Most apps will run unmodified on BDR when they're targeting a single node at a time for writes. Switching to a new node is treated as failover for unmodified applications; they shouldn't just be reconfigured to point to a new node without checks.
Apps may need changes to support full multi-master operation because of BDR's asynchronous nature and the lack of inter-node data locking. It is not safe for most unmodified applications to just round-robin connections or automatically fail over to a new connection because this exposes them to greater risk of data conflicts. Design changes and/or user-defined conflict triggers are generally necessary to make writing to multiple masters or transparent failover safe.
Note that BDR also imposes command restrictions - some DDL is not supported, either because of implementation restrictions or because the architecture cannot safely support it. Applications that perform lots of dynamic DDL may not be well suited to deployment with BDR.
Any app that targets Pg may "fail over" to a new write master safely, so long as:
- To prevent conflicts, the BDR masters were in sync when the switch occurred, so there were no outstanding transactions waiting to be replicated. See syncing all nodes; and
- All sequences used by the application are defined as global sequences so there are no issues with duplication of generated keys.
Some apps may be able to set weaker requirements - for example, an app that uses append-only tables may not mind having rows replicate from the old master after failover. Conflicts are only an issue if apps insert rows with keys not generated from a global sequence, or if apps update or delete existing rows.