2PC on FDW

From PostgreSQL wiki

Jump to: navigation, search

2PC (two-phase-commit) on FDW provides functionality that guarantees distributed transaction did either commit or rollback on all foreign servers, which is as known as Cluster-wide atomic commit. If the transaction modifies data on multiple foreign servers and does COMMIT then the transaction is committed or rollback-ed on foreign servers using two-phase-commit protocol.

Discussion is here

Contents

Usage

Users Overview

  • New parameters
    • max_prepared_foreign_transactions is a new GUC parameter, which controls the upper bound of the number of transaction on foreign servers the local transaction prepares. Note that it does not control the number of transactions on local server that involves foreign server. Since one transaction can prepare transaction on multiple foreign servers, max_foreign_prepared_transactions should be set at least,
(max_connections) * (# of foreign server wih two_phase_commit = on)

Changing this parameter requires restart.

    • two_phase_commit is a new foreign server parameter, which means that specified foreign server is capable of two phase commit protocol. The modification transaction could be committed using two-phase-commit protocol on foreign server with two_phase_commit = on. We can set this parameter by CREATE/ALTER SERVER command.
  • Using two-phase-commit protocol
    • Transaction is committed or rollback-ed using two phase commit protocol in following cases.
      • The transaction changes local data.
    • * The transaction changes data on more than one foreign server whose two_phase_commit is on.
  • Handling in-doubt transaction
    • Any crash or connection failure in phase 2 of two-phase-commit leaves the prepared transaction in unresolved state (called in-doubt transaction). Need to resolve the in-doubt transaction after foreign server recovered.
    • There are two options; calling pg_fdw_xact_resolve() function manually and using pg_fdw_xact_resolver module.
  • pg_fdw_xact_resolver
    • pg_fdw_xact_resolver is a background worker process, which periodically checks if there is in-doubt transaction and tries to resolve such transactions.
  • New FDW APIs
    • GetPreparedID() is called to get transaction identifier on pre-commit phase.
    • EndForeignTransaction() is called on commit phase and executes either COMMIT or ROLLBACK on foreign servers.
    • PrepareForeignTransaction() is called on pre-commit phase and executes PREPARE TRANSACTION on foreign servers.
    • ResolvePrepareForeignTransaction() is called on commit phase and execute either COMMIT PREPARED or ROLLBACK PREPARED with given transaction identifier on foreign servers.

If the foreign data wrapper is not capable of two-phase-commit protocol, last two APIs are not required.

Restrictions

  • Two-phase-commit protocol is used even when the transaction involves with multiple servers but does not modify data.

How to Use

In this section , we describe about how to use this feature with one coordinator and two shard node with two-phase-commit on. The step with [C] means operation on coordinator node, and with [S] means operation on shard node.

  • 1. [C] Set GUC parameter max_foreign_prepared_transactions on coordinator node.

A transaction can involves multiple foreign servers and prepares on these, so max_foreign_prepared_transaction should be at least more than (max_connections) * (# of foreign server with two_phase_commit = on). To test, we set max_prepared_transactions more than 1.

$ $EDITOR postgresql.conf
max_prepared_foreign_transactions = 200 # max_connections = 100 and two shard servers
max_prepared_transactions = 10
  • 2. [S] Set GUC parameter max_prepared_transactions on shard nodes.

Also it would be easy to confirm behavior of this feature, we set log_statement = all on all foreign servers.

$ $EDITOR postgresql.conf
max_prepared_transactions = 100 # same as max_connections
log_statement = all
log_line_prefix = '[S1]' # on shard2 server, we can set '[S2]'
  • 3. [C] Create postgres_fdw extension.
  • 4. [C] Create foreign servers with two_phase_commit parameter = on.
$ psql
=# CREATE SERVER shard_node1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'shard1', dbname 'postgres', port '5432', two_phase_commit 'on');
CREATE SERVER
=# CREATE SERVER shard_node2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'shard2', dbname 'postgres', port '5342', two_phase_commit 'on')
CREATE SERVER
=# SELECT * FROM pg_foreign_server;
   srvname   | srvowner | srvfdw | srvtype | srvversion | srvacl |                   srvoptions                    
-------------+----------+--------+---------+------------+--------+-------------------------------------------------
 shard_node1 |       10 |  16387 |         |            |        | {host=shard1,dbname=postgres,port=5432,two_phase_commit=on}
 shard_node2 |       10 |  16387 |         |            |        | {host=shard2,dbname=postgres,port=5432,two_phase_commit=on}
(2 rows)
  • 5. [C] Create user mapping.

Example

After set up according to #How to Use section, we create two tables; ft1 on shard1 server and ft2 on shard2 server, also create corresponding foreign tables on coordinator node.

  • Example 1. Example of transaction NOT using two-phase-commit.

First, we show an example; the transaction doesn't use two-phase-commit. If the transaction involving with single foreign server then we don't need to use two-phase-commit protocol.

=# BEGIN;
=# INSERT INTO ft1 VALUES(1);
=# COMMIT;

On shard1 server, we will see the following server logs.

[S1] LOG:  statement: SET search_path = pg_catalog
[S1] LOG:  statement: SET timezone = 'UTC'
[S1] LOG:  statement: SET datestyle = ISO
[S1] LOG:  statement: SET intervalstyle = postgres
[S1] LOG:  statement: SET extra_float_digits = 3
[S1] LOG:  statement: START TRANSACTION ISOLATION LEVEL REPEATABLE READ
[S1] LOG:  execute pgsql_fdw_prep_1: INSERT INTO public.ft1(c) VALUES ($1)
[S1] DETAIL:  parameters: $1 = '1'
[S1] LOG:  statement: DEALLOCATE pgsql_fdw_prep_1
[S1] LOG:  statement: COMMIT TRANSACTION
  • Example 2. Example of transaction using two-phase-commit.

Next, we show an example; the transaction involving with multiple foreign servers uses two-phase-commit protocol.

$ psql
=# BEGIN;
=# INSERT INTO ft1 VALUES(2);
=# INSERT INTO ft2 VALUES(2);
=# COMMIT;

On shard servers, we can see the following logs.

[S1] LOG:  statement: SET search_path = pg_catalog
[S1] LOG:  statement: SET timezone = 'UTC'
[S1] LOG:  statement: SET datestyle = ISO
[S1] LOG:  statement: SET intervalstyle = postgres
[S1] LOG:  statement: SET extra_float_digits = 3
[S1] LOG:  statement: START TRANSACTION ISOLATION LEVEL REPEATABLE READ
[S1] LOG:  execute pgsql_fdw_prep_1: INSERT INTO public.ft1(c) VALUES ($1)
[S1] DETAIL:  parameters: $1 = '2'
[S1] LOG:  statement: DEALLOCATE pgsql_fdw_prep_1
[S2] LOG:  statement: SET search_path = pg_catalog
[S2] LOG:  statement: SET timezone = 'UTC'
[S2] LOG:  statement: SET datestyle = ISO
[S2] LOG:  statement: SET intervalstyle = postgres
[S2] LOG:  statement: SET extra_float_digits = 3
[S2] LOG:  statement: START TRANSACTION ISOLATION LEVEL REPEATABLE READ
[S2] LOG:  execute pgsql_fdw_prep_2: INSERT INTO public.ft2(c) VALUES ($1)
[S2] DETAIL:  parameters: $1 = '2'
[S2] LOG:  statement: DEALLOCATE pgsql_fdw_prep_2
[S1] LOG:  statement: PREPARE TRANSACTION 'px_1389361800_16388_10'
[S2] LOG:  statement: PREPARE TRANSACTION 'px_53866648_16389_10'
[S1] LOG:  statement: COMMIT PREPARED 'px_1389361800_16388_10'
[S2] LOG:  statement: COMMIT PREPARED 'px_53866648_16389_10'
  • Example 3. Example of preparing transaction on coordinator node.

When the transaction involving multiple foreign servers issues PREPARE TRANSACTION, the transaction does only phase 1 of two-phase-commit on foreign server.

=# BEGIN;
=# INSERT INTO ft1 VALUES (3);
=# INSERT INTO ft2 VALUES (3);
=# PREPARE TRANSACTION 'gxid';

We can see fdw_xact entries in preparing state on pg_fdw_xacts.

=# SELECT * FROM pg_fdw_xacts;
 dbid  | transaction | serverid | userid |  status  |      identifier       
-------+-------------+----------+--------+----------+-----------------------
 13182 |         564 |    16389 |     10 | prepared | px_450388264_16389_10
 13182 |         564 |    16388 |     10 | prepared | px_569713952_16388_10
(2 rows)

After issued COMMIT PREPARED 'gxid', these entries are committed on foreign servers.

  • Example 4. Example of rollbacking distributed transaction.
    • 1. Insert data to two foreign servers.
=# BEGIN;
=# INSERT INTO lt VALUES(4);
=# INSERT INTO ft1 VALUES(4);
=# INSERT INTO ft2 VALUES(4);
    • 2. Before commit, stop shard1 server.
    • 3. Commit transaction.
=# COMMIT; -- error
    • 4. Check data on local and shard2 server.

We can see that the transaction on shard2 server rollbacked.

=# SELECT * FROM lt WHERE c = '4'; -- data on local server
 c 
---
(0 rows)
=# SELECT * FROM ft2 WHERE c = '4'; -- data on shard2 server
 c 
---
(0 rows)
    • 5. Check data on shard1 server.

After restart shard1 server, we can see that the transaction on shard1 server rollbacked as well.

=# SELECT * FROM ft1 WHERE c = '4'; -- data on shard2 server
 c 
---
(0 rows)

Internal

Transaction processing

In order to manage foreign transaction, the patch changes PostgreSQL core so that it keeps track of foreign transaction. These entries exist on shared buffer but it's written to fdw_xact file in $PGDATA/fdw_xact directory by checkpoint. We can check all foreign transaction entries via pg_fdw_xacts system view.

The commit of distributed transaction using two phase commit protocol is executed as follows;

In 1st phase, every foreign server with two_phase_commit = on needs to register the connection to MyFDWConnection while starting new transaction on a foreign connection using RegisterXactForeignServer(). During pre-commit phase following steps are executed.

  • 1. Get transaction identifier used for PREPARE TRANSACTION on foreign servers.
  • 2. Execute COMMIT on foreign server with two_phase_commit = off.
  • 3. Register fdw_xact entry into shared memory and write XLOG_FDW_XACT_INSERT WAL.
  • 4. Execute PREPARE TRANSACTION on foreign server with two_phase_commit = on.

After that, local changes is committed (calls RecordTransactionCommit()). Meantime of phase 1 and local commit, the transaction could be failed due to serialization failure and pre-commit of notify. In such case, all foreign transactions are rollback-ed.

In 2nd phase, foreign transaction on foreign server with two_phase_commit = off are already finished in 1st phase, so we focus on only the foreign server with two_phase_commit = on. During commit phase following steps are executed.

  • 1. Resolve foreign prepared transaction.
  • 2. Remove foreign transaction entry and write XLOG_FDW_XACT_REMOVE WAL.

In case server crashes after step 1 and before step 2, a resolved foreign transaction will be considered unresolved when the local server recovers or standby takes over the master. It will try to resolve the prepared transaction again and should get an error from foreign server.

Crash recovery

During crash recovery, the fdw_xact entry are inserted to KnownFDWXactList or removed from KnownFDWXact list when corresponding WAL records are replayed. After the redo is done fdw_xact file is re-created and then pg_fdw_xact directory is scanned for unresolved foreign prepared transactions.

The files in this directory are named as triplet (xid, foreign server oid, user oid) to create a unique name for each file. This scan also emits the oldest transaction id with an unresolved prepared foreign transactions. This affects oldest active transaction id, since the status of this transaction id is required to decide the fate of unresolved prepared foreign transaction. On standby during WAL replay files are just inserted or removed. If the standby is required to finish recovery and take over the master, pg_fdw_xact is scanned to read unresolved foreign prepared transactions into the shared memory.

Many of fdw_xact.c code is inspired by two_phase.c code. So recovery mechanism and process are almost same as two_phase. The patch incorporated recent optimization of two_phase.c.

Personal tools