Multimaster



PostgreSQL supports unidirectional master-slave replication. It also supports hot standby mode, in which read-only queries can be executed at replicas. Replication can be either asynchronous or synchronous, but even with synchronous replication there is a time gap between master and replica, so a client executing a read-only query at a replica may not see changes it made in a previous transaction at the master. Also, prior to PostgreSQL 9.6 only one synchronous replica was supported, so it was practically impossible to use synchronous replication for load balancing. As a result, current streaming replication in PostgreSQL provides only fault tolerance (HA), not performance scaling.
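The time gap described above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the `Master` and `Replica` classes and their methods are hypothetical names invented for illustration, not PostgreSQL APIs, and replay is triggered by hand to make the lag visible.

```python
from collections import deque


class Master:
    """Toy master: applies writes locally and queues them for the replica."""

    def __init__(self):
        self.data = {}
        self.stream = deque()  # committed changes not yet replayed by the replica

    def commit(self, key, value):
        self.data[key] = value
        self.stream.append((key, value))


class Replica:
    """Toy hot standby: serves reads, applies changes only when replay() runs."""

    def __init__(self, master):
        self.master = master
        self.data = {}

    def replay(self):
        # Apply everything the master has queued so far, in commit order.
        while self.master.stream:
            key, value = self.master.stream.popleft()
            self.data[key] = value

    def read(self, key):
        return self.data.get(key)


master = Master()
replica = Replica(master)

master.commit("balance", 100)
stale = replica.read("balance")  # the replica has not replayed the change yet
replica.replay()
fresh = replica.read("balance")  # the change is visible only after replay
```

In this model the first read misses the client's own write, which is the situation that makes plain replicas awkward for load balancing.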

2ndQuadrant provides Bidirectional Replication for PostgreSQL (BDR). With BDR, updates can be performed at any node of the cluster and are then propagated to the other nodes. BDR is essentially asynchronous: changes are applied at other nodes some time after the transaction commits at the originating node, and various ways of resolving conflicts are offered. BDR is really fast (it provides almost the same speed as hot standby), but there is no global consistency in such a model.

BDR is based on a new PostgreSQL feature named logical decoding. Changes are extracted from the WAL and processed by a logical output plugin. It can then apply these changes to some other database, save them in a log, or do whatever else it likes. BDR uses logical replication to deliver changes to other nodes. Logical replication is now part of PostgreSQL 9.5. Our multimaster is based on the pglogical_output plugin provided by 2ndQuadrant. We have implemented the receiver part for this plugin, which is also partly based on BDR code. On the receiver side we have a pool of background workers which concurrently apply changes received from the remote walsender.
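The receiver-side idea of a worker pool applying decoded changes can be sketched in a few lines. This is a simplified illustration, not the multimaster implementation: the change format, the `apply_transaction` helper, and the in-memory `table` are all assumptions made up for the example, and threads stand in for PostgreSQL background workers.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

# Hypothetical decoded transactions, roughly as an output plugin might emit
# them: each transaction is a list of (operation, key, value) changes.
decoded_transactions = [
    [("insert", "a", 1), ("update", "a", 2)],
    [("insert", "b", 10)],
    [("insert", "c", 5), ("delete", "c", None)],
]

table = {}            # stands in for the local copy of a replicated table
table_lock = Lock()   # serializes access to the shared dict


def apply_transaction(changes):
    """Apply all changes of one transaction as a unit; return how many were applied."""
    with table_lock:
        for op, key, value in changes:
            if op in ("insert", "update"):
                table[key] = value
            elif op == "delete":
                table.pop(key, None)
    return len(changes)


# A pool of workers applies the received transactions concurrently.
with ThreadPoolExecutor(max_workers=3) as pool:
    applied = list(pool.map(apply_transaction, decoded_transactions))
```

The example transactions touch disjoint keys, so the final state does not depend on the order in which workers run. Transactions that conflict would need additional coordination, which this sketch does not attempt.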

Global consistency is enforced by the DTM (distributed transaction manager). From the client's point of view, it works with a set of identical PostgreSQL instances. It can log in and send queries to any of them, and it does not matter whether a query is read-only or an update. However, since updates have to be applied to all nodes, multimaster can provide scaling only for read-only queries.
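Why only reads scale can be made concrete with a back-of-the-envelope model. The sketch below assumes that every node has the same capacity, that a read and an applied update cost the same, that reads are spread evenly, and that coordination overhead is zero. These are simplifying assumptions for illustration, not measurements, and the function name `cluster_throughput` is hypothetical.

```python
def cluster_throughput(node_capacity, nodes, update_fraction):
    """Estimate total cluster throughput under an idealized cost model.

    Each node handles 1/nodes of the reads but must apply every update,
    so the work per client operation on one node is
        (1 - update_fraction) / nodes + update_fraction.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if not 0.0 <= update_fraction <= 1.0:
        raise ValueError("update_fraction must be between 0 and 1")
    per_node_work = (1.0 - update_fraction) / nodes + update_fraction
    return node_capacity / per_node_work


# Three nodes, each able to process 1000 operations per second.
read_only = cluster_throughput(1000, 3, 0.0)
half_updates = cluster_throughput(1000, 3, 0.5)
all_updates = cluster_throughput(1000, 3, 1.0)
```

Under these assumptions a three-node cluster reaches about three times the single-node rate on a read-only workload, about 1.5 times at 50% updates, and no gain at all when every operation is an update. Real numbers will be lower, because the model ignores the cost of distributed commit.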

The diagrams below show performance results of multimaster installed on a three-node cluster. We ran our dtmbench benchmark, varying the percentage of updates, and compared the results of multimaster with the performance of standalone PostgreSQL. Providing ACID properties for distributed transactions adds significant overhead: multimaster is about 2 times slower on updates than a single node. With asynchronous replication it is possible to get much better results, but without global consistency. On mostly read-only workloads multimaster provides much better performance, but there is still no linear scalability.


Rw ratio.png

Reads.png

Updates.png

Vertical axis: TPS, thousands

Horizontal axis: number of client connections
