Conflict Detection and Resolution

From PostgreSQL wiki
Jump to navigationJump to search

Introduction

In case of logical replication, if the node is subscribed to multiple providers, or when local writes happen on a subscriber, conflicts can arise for the incoming changes. CDR is the mechanism to automatically detect and resolve these conflicts depending on the application and configurations. CDR is an important and necessary feature for active-active replication.

CDR is not applicable for the initial table sync. If locally, there exists conflicting data on table, table sync worker will fail.

This page describes the CDR implementation in apply worker for INSERT, UPDATE and DELETE operations.

INSERT


To resolve INSERT conflict on subscriber, it is important to find out the conflicting row (if any) before we attempt an insertion. The indexes or search preference for the same will be:

  • First check for replica identity (RI) index.
    • if not found, check for the primary key index.
      • if not found, then check for unique indexes (individual ones or added by unique constraints)
        • if unique index also not found, skip CDR

Note: if no RI index, pk, or unique index is found but REPLICA_IDENTITY_FULL is defined, CDR will still be skipped. The reason being that even though a row can be identified with REPLICAT_IDENTITY_FULL, such tables are allowed to have duplicate rows. Hence, we should not go for conflict detection in such a case. In case of replica identity ‘nothing’ and in absence of any suitable index (as defined above), CDR will be skipped for INSERT.

Insert Conflict Types and Resolutions

Conflict Type: insert_exists: A conflict is detected when the table has the same value for a key column as the new value in the incoming row. By default, the remote change will be applied as an update.

Conflict Resolutions are:

  • latest_timestamp_wins: The change with later commit timestamp wins. If the remote has a later timestamp, then it is converted to UPDATE and applied. If local has a later timestamp, remote change is ignored.
  • earliest_timestamp_wins: The change with earlier commit timestamp wins. If the remote has an earlier timestamp, then it is converted to UPDATE and applied. If local has an earlier timestamp, remote change is ignored.
  • apply: Always apply the remote change. Convert to update and apply in this case.
  • skip:    Remote change is skipped if local row with the same key is found.
  • error:   Error out on conflict. Replication is stopped, manual action is needed.

It is important to have commit timestamp info available on subscriber when latest_timestamp_wins or earliest_timestamp_wins method is chosen as resolution method. Thus, ‘track_commit_timestamp’ must be enabled on subscriber, in absence of which, configuring the said timestamp-based resolution methods will result in error.

Default for insert_exists is latest_timestamp_wins

Example Test Case

This example illustrates INSERT conflict detection and resolution on Subscriber (Sub) which has subscribed to a table from the Publisher(Pub).

Setup:

 -- Table on Both Nodes:
 CREATE TABLE t1 (id INTEGER PRIMARY KEY, val1 INTEGER, val2 VARCHAR);
 -- Insert initial data
 Publisher (Pub): INSERT INTO t1 VALUES (1, 1, 'pub');
 Subscriber (Sub): INSERT INTO t1 VALUES (2, 11, 'sub');
 -- Data on both nodes before conflict:
 Pub: 1 row  - (1,1,pub)   
 Sub: 2 rows - (1,1,pub); (2,11,sub)

Conflict Scenario:

 Pub: Inserts conflicting data
   INSERT INTO t1 VALUES (2, 1, 'pub');

At Sub, it detects 'insert_exists' conflict on id=2. As per resolutions the data on Sub will look like -

  • In case of 'latest_timestamp_wins':
 (1,1,pub) 
 (2,1,pub) -- Pub's more recent tuple
  • In case of 'earliest_timestamp_wins':
 (1,1,pub) 
 (2,11,sub) -- Sub's earlier tuple retained
  • In case of 'apply':
 (1,1,pub) 
 (2,1,pub) -- Pub's tuple applied as converted from insert to update
  • In case of 'skip':
 (1,1,pub) 
 (2,11,sub) -- Pub's tuple skipped
  • In case of 'error': apply worker will error out.

UPDATE


On Pg-Head, the existing index presence to find the row to update is:

  • First check for replica identity index.
    • if not found, check for the primary key index.
      • if not found, then check for 'REPLICA_IDENTITY_FULL'.
        • If REPLICA_IDENTITY_FULL is set, try to use an index if possible (see FindLogicalRepLocalIndex-->FindUsableIndexForReplicaIdentityFull) else go for sequential scan.

To support UPDATE CDR, the presence of either replica identity Index or primary key is required on target node. Update CDR will not be supported in absence of replica identity index or primary key even though REPLICA IDENTITY FULL is set. Please refer to "UPDATE" sub-section in “Noteworthy Scenarios" section for the details on why Update CDR cannot be supported in presence of REPLICA IDENTITY FULL alone.

Update Conflict Detection Method

  • Origin conflict detection: The ‘origin’ info is used to detect conflict which can be obtained from commit-timestamp generated for incoming txn at the source node. To compare remote’s origin with the local’s origin, we must have origin information for local txns as well which can be obtained from commit-timestamp after enabling ‘track_commit_timestamp’ locally.

Note: On publisher, it is not needed to enable ‘track_commit_timestamp’. This is because the commit-timestamp information is written by default to XLOG and is received on the subscriber.

The one drawback here is the ‘origin’ information cannot be obtained once the row is frozen and the commit-timestamp info is removed by vacuum. For a frozen row, conflicts cannot be raised, and thus the incoming changes are applied in all the cases.

Update Conflict Types and Resolutions

Conflict Types

Below are the conflict types for the cases where columns other than primary key are updated:

  • update_differ: The origin of an incoming update's key row differs from the local row i.e.; the row has already been updated locally or by different node.By default, the change with latest timestamp will be applied/retained.
  • update_missing: The row with the same value as that incoming update's key does not exist. Remote is trying to update a row which does not exist locally. By default, the remote change will be applied as an INSERT, provided the whole row can be built from the information received from the publisher, otherwise the change will be skipped
  • update_deleted: The row with the same value as that incoming update's key does not exist. The row is already deleted. This conflict type is generated only if the deleted row is still detectable i.e., it is not removed by VACUUM yet. By default, the update is skipped in this case. If the row is removed by VACUUM already, it cannot detect this conflict. It will detect it as update_missing and will follow default or configured resolver of update_missing itself.

To identify a deleted tuple, the target table can be scanned using 'SnapshotAny'. Currently the apply worker uses dirty snapshot to scan the table when applying UPDATE, so the deleted tuple will not be visible. But the dead tuple (deleted tuple) could be seen under 'SnapshotAny'. Thus, the basic logic could be, first check if the tuple to be updated is missing (using dirty snapshot), then if the tuple is missing, use 'SnapshotAny' to scan the table to see if there are any dead tuples that match the tuple to be updated. Here we are only interested in recently deleted tuples (HEAPTUPLE_RECENTLY_DEAD), thus we can rely on return value of HeapTupleSatisfiesVacuumHorizon() to identify those.

Resolutions
  • latest_timestamp_wins: The change with later commit timestamp wins. If the remote has a later timestamp, then it is applied or else it is ignored. Can be used for ‘update_differ’.
  • earliest_timestamp_wins: The change with earlier commit timestamp wins. If the remote has an earlier timestamp, then it is applied or else it is ignored. Can be used for ‘update_differ’.
  • apply: The remote change is always applied. Can be used for ‘update_differ’.
  • apply_or_skip: Remote change is converted to INSERT and is applied. If the complete row cannot be constructed from the info provided by the publisher, then the change is skipped. Can be used for ‘update_missing’ or ‘update_deleted’.
  • apply_or_error: Remote change is converted to INSERT and is applied. If the complete row cannot be constructed from the info provided by publisher, then error is raised. Can be used for ‘update_missing’ or ‘update_deleted’.
  • skip: Remote change is skipped and local one is retained. Can be used for any conflict type.
  • error: Error out on conflict. Replication is stopped, manual action is needed. Can be used for any conflict type.

Note: If user has chosen latest or earliest_timestamp_wins, and the remote and local timestamps are same, then it will go by system identifier. The change with higher system identifier will win. This will ensure that the same change is picked on all the nodes. Choosing any other resolution method may result in data divergence.

Defaults
  • update_differ: latest_timestamp_wins.
  • update_missing: apply_or_skip
  • update_deleted: skip

Note: For the default of ‘latest_timestamp_wins’, it is required that ‘track_commit_timestamp’ is enabled. Thus, it is mandatory to enable ‘track_commit_timestamp’ when user enables conflict detection but leave resolvers to default or explicitly configure them to latest/earliest_timestamp_wins.

Example Test Case

This example illustrates UPDATE conflict detection and resolution on Subscriber (Sub) which has subscribed to a table from the Publisher(Pub).

Setup:

-- Table on Both Nodes:
  CREATE TABLE t1 (id INTEGER PRIMARY KEY, val1 INTEGER, val2 VARCHAR);
-- Insert initial data
 Publisher (Pub): 
   INSERT INTO t1 VALUES (1, 1, 'pub');
   INSERT INTO t1 VALUES (2, 1, 'pub');
 Subscriber (Sub): update a tuple -
   UPDATE t1 SET val2='sub' where id=2;
-- Data on both nodes before conflict:
 On Pub: 2 rows - (1,1,pub); (2,1,pub) 
 On Sub: 2 rows - (1,1,pub); (2,1,sub)

Conflict Scenario 1:

Pub: Updates tuple with id=2
  UPDATE t1 SET val2='PUB' where id=2;

Now, at Sub, the origin differs for the id=2 tuple and it detects a 'update_differ' conflict. After resolution, the data on Sub will look like -

  • In case of 'latest_timestamp_wins':
 (1,1,pub) 
 (2,1,PUB) -- Pub's more recent tuple
  • In case of 'earliest_timestamp_wins':
 (1,1,pub) 
 (2,1,sub) -- Sub's earlier tuple retained
  • In case of 'apply':
 (1,1,pub) 
 (2,1,PUB) -- Pub's tuple applied as converted from insert to update
  • In case of 'skip':
 (1,1,pub) 
 (2,1,sub) -- Pub's tuple skipped
  • In case of 'error': apply worker will error out.

Conflict Scenario 2:

 Sub: deletes the tuple with id=2
   DELETE t1 where id=2
 Pub: Updates tuple with id=2
   UPDATE t1 SET val2='PUB' where id=2;

Now, at Sub, depending on if the tuple is vacuumed or not, it will raise either 'update_missing' or 'update_deleted'. For both conflict types, the data on Sub after possible resolutions will look like -

  • In case of 'apply_or_skip':
 (1,1,pub) 
 (2,1,PUB) -- Able to build full row from Pub's data, hence applied. Skip otherwise.
  • In case of 'apply_or_error':
 (1,1,pub) 
 (2,1,PUB) -- Able to build full row from Pub's data, hence applied. Skip otherwise.
  • In case of 'skip':
 (1,1,pub) -- Pub's update is skipped
  • In case of 'error': apply worker will error out.

Primary Key Update Conflict Types and Resolutions

Concurrent Primary Key Updates may primarily result in below conflicts:

  • 1. pkey_exists: An incoming update is trying to update the primary key to value which already exists on the node.
    • Conflict Resolutions
      • latest_timestamp_wins
      • earliest_timestamp_wins
      • apply
      • skip
      • error
    • Default: latest_timestamp_wins

Example:

 Table: CREATE TABLE t1 (pk integer primary key, val1 integer, val2 varchar); 
 Pub data (2 rows) : (1,1,1);  (2,2,2) 
 Sub data (2 rows):  (1,1,1);  (2,2,2) 
 On Sub: Update pk 2->3; data now: (1,1,1); (3,2,2) 
 On Pub: Update pk 1->3; data now: (3,1,1); (2,2,2)

In above scenario, while applying pk 1-->3 update on sub, if configured resolution choice results in update, then it will result in delete of an existing row (3,2,2) on sub. Thus, after resolution, the result will be:

 On Pub: data: (3,1,1); (2,2,2) 
 On Sub: data: (3,1,1) 
  • 2. update_missing: Row with required primary key is missing on subscriber.
    • Conflict Resolutions - Going by regular update_missing conflicts, these are the resolvers:
      • latest_timestamp_wins
      • earliest_timestamp_wins
      • apply
      • skip
      • error
    • Default: apply_or_skip

Example:

 Table: CREATE TABLE t1 (pk integer primary key, val1 integer, val2 varchar) 
 Pub data (1 row) - (1,1,1)             
 Sub data (1 row) - (1,1,1)                
 On Sub: Update pk 1->2; data now - (2,1,1)              
 On Pub: Update pk 1->3; data now - (3,1,1)  

Now when we try to apply this change to subscriber, it will not be able to find row with ‘pk’=1 and thus result in update_missing conflict.

Using default apply_or_skip, UPDATE will be converted to INSERT and thus the data will be:

 Pub: 1 row - (3,1,1)           
 Sub: 2 rows - (2,1,1); (3,1,1)                 

If the row could not be fully constructed using the data received from the publisher, then the change will be skipped.

Another case to consider is, if there was a row already existing on subscriber with 'pk’ as 3, then this insertion would result in ‘duplicate key’ error or in other words ‘pkey_exists’ conflict. This will be the second level of conflict, if detected. Currently the scope of design doc is restricted to single -level conflict detection. Functionality can be extended to deal with multi-level conflict detection in future.

apply_or_skip will not skip the ‘duplicate key’ error in this case. The ‘skip’ part deals with skip applying change if all the columns are not available to construct complete row for INSERTION. It will not skip the error ‘duplicate-key-violation’ in above scenario.

  • 3. update_deleted: The row with the required primary key is deleted on subscriber and is not cleaned by vacuum yet.
    • Conflict Resolutions: Resolutions are same as ‘update_missing’ conflict. Default is ‘skip’.

Foreign Key Constraint Update conflicts

Concurrent operation on referencing table (one with the foreign key) and referenced table (one with the primary key) on different nodes may result in conflicts resulting in dangling foreign key w/o any primary-key present. This can happen if a DELETE or UPDATE is performed on the referenced table at the same time as inserts into the referencing table on different node.

Please see the example below:

Schema:

 CREATE TABLE products (product_no integer PRIMARY KEY, name text, price numeric);     
 CREATE TABLE orders (order_id integer PRIMARY KEY,  product_no integer REFERENCES products (product_no), quantity integer);   

Data on both pub and sub:

 Two items in products table - (1, 'item1', 10); (2, 'item2', 10) 
 Two orders for product_no=1 in orders table, while no orders yet for product_no=2 
 (101, 1, 100), (102, 1, 100) 

Now say, on subscriber, user inserts order for product_no=2, then orders data on subscriber:

 101, 1, 100 
 102, 1, 100 
 103, 2, 100 (additional order than pub) 

Now on pub, let us say user performs update of product_no=2

 update products set product_no=22 where product_no=2; 

Since product_no=2 has no corresponding orders on publisher, this update is valid and allowed here. But when this change is applied to subscriber, it leaves a dangling Foreign Key there.

After applying update on subscriber, products data:

 (1, 'item1', 10); (22, 'item2', 10) 

While orders data still refer to product_no=2

 (101, 1, 100); (102, 1, 100); (103, 2, 100) ----------> Dangling FK with product_no as 2 

The same situation can happen if instead of update, a delete was performed on pub for product_no=2 (delete from products where product_no=2). This too would have left dangling FK.

In the first version of CDR, Foreign-Key Conflict handling is not targeted. We can have suggestions added to the doc to avoid reaching this conflict situation by creating required triggers on referencing and referenced tables.

Delta Conflict Detection and Resolution

With delta conflict detection, the conflict can arise due to an update on incrementing or sum field. The quite common use case scenario for this conflict is ‘balance’ in ‘account’ getting credited or debited at two different nodes. This is like any other conflict, but the resolution is different. Here if we follow resolution of ‘update_differ’ where we retain the change with latest timestamp, then it may override account's balance inappropriately as the credit or debit on one of the nodes may be discarded resulting in wrong balance amount.

Thus, to cater to such financial needs, delta conflict resolution is needed rather than latest/earliest_timestamp_wins etc. Here, the conflict is resolved by adding the difference between the new and old values in the incoming row to the value in the target table. For example, if a bank balance is updated at two sites concurrently, then the converged value accounts for all debits and credits. Let us say:

  • At Node A, the value of ‘Balance’ in ‘Account’ table is changed from 100 to 110, indicating a credit of 10.
  • And at Node B, the value of ‘Balance’ in ‘Account’ table is changed from 100 to 120, indicating a credit of 20.
  • Thus, to resolve the conflict at node A, the value of the difference between the new and old values in the incoming row from Node B is added to the value in the table. The difference between the new and old values is 20 (120–100=20). Therefore, the current value in the table (110) is increased by 20 so that the value after conflict resolution is 130.
  • Similarly, to resolve the conflict at node B, the difference in values of incoming row from Node A is added to the column, resulting in 130 as the final value.

Unlike all other CDR configurations for UPDATE, which are applicable to all the tables for a subscription, DELTA CDR is column based instead of table or subscription based. Here the columns of the table need to be configured as ‘delta’ columns on all the nodes to go for Delta CDR. Once the required columns are configured as delta-column, logical replication will ship their old values as well along with new values.

If the column is marked as delta column on node A while not marked as delta column on Node B, then applying the change which has nonzero ‘delta column’ value in incoming row from Node A, will result in error on Node B.

If user has triggered update on multiple columns including delta columns, then this will be the flow:

  • It will first find the row to be updated using RI or PK.
  • It will check the origin based conflict.
  • If conflict is detected, then it will look at resolution method.
    • If resolution is in favor of applying the remote change, then
      • It will take the remote new row as base and will override the delta column value with (local + (remote_new - remote_old)) and will apply that.
    • If resolution is not in favor of applying the remote change, then
      • It will take the local row as base and will override the delta column value with (local + (remote_new - remote_old)) and will apply that. Here the delta column update is taken while update of rest of the columns are skipped.

DELETE


On Pg-Head, the existing index presence to find the row to delete is:

  • first check for replica identity index.
    • if not found, check for the primary key index.
      • if not found, then check for 'REPLICA_IDENTITY_FULL'.
        • If REPLICA_IDENTITY_FULL is set, try to use an index if possible (see FindLogicalRepLocalIndex-->FindUsableIndexForReplicaIdentityFull) else go for sequential scan.

Like Update, to support DELETE CDR, the presence of either replica identity Index or primary key is required. DELETE CDR will not be supported in absence of replica identity index or primary key even though REPLICA IDENTITY FULL is set. The non-deterministic results mentioned in "UPDATE" sub-section of “Noteworthy Scenarios" section are valid even when DELETE operation is done instead of UPDATE.

  • Conflict Types
    • delete_missing: An incoming delete is trying to delete a row on target node which does not exist.
  • Conflict Resolutions
    • error
    • skip
  • Default: skip

Example Test Case

This example illustrates DELETE conflict detection and resolution on Subscriber (Sub) which has subscribed to a table from the Publisher(Pub).

Setup:

-- Table on Both Nodes:
  CREATE TABLE t1 (id INTEGER PRIMARY KEY, val1 INTEGER, val2 VARCHAR);
-- Insert initial data
 Publisher (Pub): 
  INSERT INTO t1 VALUES (1, 1, 'pub');
  INSERT INTO t1 VALUES (2, 1, 'pub');
-- Data on both nodes before conflict:
  Pub:(1,1,pub); (2,1,pub) 
  Sub:(1,1,pub); (2,1,pub)

Conflict Scenario:

Sub: First deletes the tuple with id=2
  DELETE t1 where id=2;
Pub: deletes tuple with id=2
  DELETE t1 where id=2;

Now, at Sub, the tuple to be deleted (id=2) is not found and it will raise 'delete_missing' conflict. Sub behavior with possible resolutions will be -

  • In case of 'skip':
(1,1,pub) -- the Pub's DELETE will be skipped.
  • In case of 'error': apply worker will error out.

TRUNCATE


Truncate physically removes all the data from the table rather than performing row wise DELETE, thus row level conflict detection and resolution is not applicable to Truncate operation.

Unequal Table Structures


There are cases when source and target tables are different in structure in terms of number of columns, such conflicts can be detected and resolved:

  • Conflict Types
    • source_column_extra: The source table has extra column(s) than the target table and thus the target table is missing one or more columns present in the incoming row on the applying node. On pg HEAD, currently it errors out: ERROR: logical replication target relation "public.t" is missing replicated column: "b
    • target_column_extra: The target table has extra column(s) than the source table and thus the incoming row is missing one or more columns existing in target table. On pg HEAD, during INSERT operation, currently it inserts DEFAULT to the column and if there are constraints present which are violated by that default, it errors out during execution. In case of UPDATE operation, it does not touch values of extra columns.
  • Conflict Resolutions
    • error: Throw error and stop replication. Can be used for both.
    • skip: Skip applying such a change. Can be used for both.
    • ignore: Ignore extra incoming columns and proceed with the rest of the change. Can be used for source_column_extra.
    • use_default: if decision is in favor of applying the change due to resolution configured for insert or update, then use default values for missing columns. If any constraint is violated by the default value, it will error out during execution and will stop replication. Can be used for target_column_extra.

Multiple UNIQUE constraints


If the incoming row violates more than one UNIQUE constraint/index (of which one might be the Primary Key or Replica Identity index) and that results in a conflict against more than one other row, then applying the replication change produces a multiple unique constraint conflict. Both INSERT and UPDATE can cause this conflict. Example:

Schema -

 CREATE TABLE tab1 (col1 integer PRIMARY KEY, col2 integer UNIQUE, col3 integer UNIQUE); 
 col1 is Replica Identity. 

Data -

 on pub: 1, 11, 111 
 on sub: 3 additional local Inserts: (2, 22, 222); (3, 33, 333); (4, 44, 444) 
 Concurrently on pub, new insert: (2, 33, 444) 

While applying this change on sub, it will raise ‘insert_exists’ conflict based on primary key on col1.

Let us say if resolver is set to ‘apply’, then it will try to update the existing row (2, 22, 222) with (2, 33, 444) on sub. But during execution, it will hit the error ‘duplicate key value violates unique constraint on col2. Key (col2) = (33) already exists’.

Now, let us delete col2=33 on sub

 delete from tab1 where col1=3; 

And then when apply worker retries to apply the same change, it will hit unique constraint violation on col3 this time. Key (col3) = (444) already exists.

Such conflict both on col2 and col3 (independent of each other), after resolving conflict on col1, is an example of ‘Multiple Unique Constraints’ violation.

  • Conflict type
    • multiple_unique_conflicts: If the incoming row conflicts with multiple unique constraints/indexes in the target table (as shown above), it will raise this conflict.
  • Conflict resolution
    • apply: always apply the remote change. Delete all the rows that conflict with the remote row and insert or update the remote row.
    • skip: skip applying the change.
    • error: error out on conflict.
  • Default: error.

Note: To detect the conflict, the index scan keys can be built from the new tuple values based on all the unique keys. The target table needs to be scanned N times (N is the number of unique indexes and constraints). To resolve the conflict, all the found tuples need to be deleted while preserving the row with correct primary key so that incoming change (update/insert) can be applied on it.

CDR for the partitioned table


When applying change on partitioned table, we first find the target leaf partition where insert, update or delete needs to be performed and then apply the change on the leaf partition. So, there is no significant difference for CDR on it, we only need to perform CDR based on the target leaf partition

INSERT

No special handling is needed when applying INSERT change. The row will be inserted to the target leaf partition, so CDR can be performed based on that leaf partition.

UPDATE

For all three cases ‘update_differ’, ‘update_missing’, ‘update_deleted’, no special handling is needed for partitioned tables. CDR flow will be same as that for non-partitioned table.

But for partitioned tables, we can get an extra error which we do not get for non-partitioned tables which is “no partition found for the old tuple” (e.g.: if the old tuple of UPDATE does not belong to any leaf partition). We can exclude this case from being considered as ‘update_missing’ conflict and leave the ERROR as is (as on pg HEAD), because it is not possible to find the deleted row in this case and thus perform the CDR. Also, missing the expected leaf partition on subscriber seems quite rare scenario.

Another noteworthy case is cross-partition update (e.g., the old tuple belongs to partition A while the new tuple belongs to a different partition B). In this case, the apply worker will DELETE the old tuple from one partition and INSERT the new tuple into another partition. Although it applies the UPDATE as DELETE and INSERT changes, it is still doable to resolve the conflict in this case based on the resolution defined for update_differ/update_missing (needs some attention during implementation as it touches two partitions).

DELETE

No special handling is needed for ‘delete_missing’.

Configuring Conflict Resolution


There are two parts when it comes to configuring CDR:

  • Enabling/Disabling conflict detection.
  • Configuring conflict resolvers for different conflict types.

Users can sometime create multiple subscriptions on same node, subscribing to different tables to improve replication performance by starting multiple apply workers. If the tables in one subscription are less likely to cause conflict, then it is possible that user may want conflict detection disabled for that subscription while enabling it for other subscriptions. This is because conflict detection comes with some latency and thus it will be good to enable it only if needed.

This generates a requirement to make ‘conflict detection’ configurable per subscription. While the conflict resolver configuration can remain global. All the subscriptions which opt for ‘conflict detection’ will follow global conflict resolver configuration.

  • Change needed in subscription commands:

The new parameter ‘conflict_resolution’ will be added, default will be ‘off’.

 CREATE SUBSCRIPTION sub WITH (conflict_resolution=on/off);  
 ALTER SUBSCRIPTION name SET (conflict_resolution=on/off); 
  • To enable global level conflict resolvers, new DDL will be added:
 CONFLICT RESOLVER ON <conflict_type> IS <conflict_resolver> 
 Example: CONFLICT RESOLVER ON ‘insert_exists’ IS ‘apply’ 
 (This is like existing command 'SECURITY LABEL ON TABLE <object> IS <label>' )

A new catalog table will be added

 pg_conflict(conflict_type text, conflict_resolver text) 

During database creation, this catalog table will be populated with default resolver for each conflict type. The 'CONFLICT RESOLVER' command by the user will override the resolver in it. If conflict detection is enabled for a subscription, but the user has not configured resolvers by executing ‘CONFLICT RESOLVER’ command, then default resolvers will be followed.

The table will be cached and apply worker(s) will refer to the cache while applying logical changes. Execution of 'CONFLICT RESOLVER' command by user will invalidate the cache and apply worker will get notified (CacheRegisterSyscacheCallback) and will reload the values before applying the next change.

Noteworthy Scenarios


There are certain specific scenarios that require special attention during the implementation of this feature and thus are worth mentioning. Please find the list:

INSERT

  • There could be a scenario where there is no existing row when conflict-detection is verifying it, but by the time the apply worker reaches execution stage, a row is inserted by user. In such a case, apply of insert will fail and apply worker will exit. But next time the apply worker starts and tries to apply the same change from publisher to subscriber, it will be able to find conflict and will resolve it in accordance with the resolution configured by user.
  • We do not allow user to configure ‘latest_timestamp_wins’ and ‘earliest_timestamp_wins’ resolution method if ‘track_commit_timestamp’ is not enabled. But there could be the case that ‘track_commit_timestamp’ is enabled but the commit-timestamp info no longer exists (i.e., removed by vacuum after freezing the txn); in such scenario, we can consider local timestamp as 0 (as also returned by 'TransactionIdGetCommitTsData() currently) and take the decision based on that. For ‘latest_timestamp_wins’, that would mean, applying the remote change while for ‘earliest_timestamp_wins’, remote change will be discarded.

UPDATE

‘REPLICA Identity FULL’ may result in non-deterministic resolution in certain cases.

On pg Head, publishing of ‘update’ operation is allowed when REPLICA Identity FULL is set either with or without pk present. But on enabling CDR for the same, it may result in non-deterministic results even in the same data-set and same CDR configuration scenarios. Please see the example below where Setup1 and Setup2 are exactly same in terms of data and resolution configured while results in different resultant data set:

  • Setup 1: Resolution is set to ‘Skip’. Duplicate Rows are inserted on Sub after the original rows are applied from Pub.

Schema -

 CREATE TABLE t (val1 int, val2 int); 
 Pub: T - REPLICA_IDENTITY_FULL 
 Sub: T - REPLICA_IDENTITY_FULL  

Data -

 Pub -> (3,1; 3,2;)  
 Sub -> (3,1; 3,2; 3,1; 3,2;) - [first "3,1; 3,2;" are replicated ones (Pub origin), last "3,1; 3,2;" are inserted locally on Sub (Sub origin)]  
 Pub -> update t set val1=333 where val1=3; 
 At Pub, both rows updated. 
 At Sub, during replication - 
    a) it finds the first matching row and attempt for CDR 
    b) as the row origin is "Pub" only, no conflict raised 
    c) rows are updated

Now data on both nodes look like -

 Pub -> (333,1; 333,2;) 
 Sub -> (333,1; 333,2; 3,1; 3,2;) 
  • Setup 2: Resolution is set to ‘Skip’. Duplicate rows inserted on Sub but before the original rows are applied from Pub.

Data -

 Pub -> (3,1; 3,2;) 
 Sub -> (3,1; 3,2; 3,1; 3,2;) [first "3,1; 3,2;" inserted by Sub (Sub origin), last "3,1; 3,2;” are replicated ones (Pub origin)]. This order is different from Setup 1. 
 Pub -> update t set val1=333 where val1=3; 
 At Pub, both rows updated.  
 At Sub, during replication for both incoming rows - 
    a) it finds the first matching row and attempt for CDR 
    b) as the row origin is "Sub", conflict is detected. 
    c) since resolution method is configured as SKIP, the update will be ignored. 

So data on both nodes look like -

 Pub -> (333,1; 333,2;) 
 Sub -> (3,1; 3,2; 3,1; 3,2;) 

With Replica Identity FULL, we cannot distinguish between 2 duplicate rows on the receiving node and thus dependent upon the result of the search (as in which origin’s row is picked out of multiple duplicate rows), the change can be applied or rejected and thus resulting in different results in almost same scenarios. This can be very confusing to the user. Thus, CDR will not be supported when pub has REPLICA IDENTITY FULL, and sub has either no key set or only REPLICA IDENTITY FULL set. A msg will be logged for this case.

Toast Columns Update

The toast values are logged into WAL only when RI FULL is specified or the toast column is updated itself. In any other UPDATE case, such columns are skipped to be logged to WAL. This may result in divergence in case of UPDATE conflicts, when the 'update' which did not change toast value logs a row without it. Please consider the example below:

Node 1:

 Update set toast_col=val_1    

Node 2: Node1's change is received and applied and then it performs an update locally:

 Update set other_column=val_2    

Node 3: Node 1's change is not received yet, but Node 2's change is received first. The 'other_column' update is applied on Node 3. Later when Node 1's change is received, conflict is detected and if say resolution is set to 'skip', the change is skipped. In such a case 'toast' column update will not be applied to Node 3 as the update received from Node 2 did not log toast-column at all which was updated previously on Node 2.

On pg HEAD, the toast-column update gets applied on Node 3 from Node 1 even after receiving Node 2's 'other_column' update first on Node3. With CDR's 'skip' resolution set, it will not behave same.

This special scenario will be documented, and the user needs to take care of it. Resolution of such cases will not be provided in CDR first version.

Unequal Table Structures

Other than the unequal table structures which can result in 'source_column_extra' and 'target_column_extra' conflicts, the next type of unequal structures are tables differing in terms of Primary Keys or Replica Identity columns. Few such cases are:

  • Case 1: Sub alone having PK
 Pub: T1 (col1 int) 
 Sub: T1 (col1 int primary key) 

In such a case, the decision on sub will be taken based on the Primary Key or Replica Identity present locally and the resolution configured. Here, pub can have duplicates, but sub cannot.

For INSERT, since we have ‘col1’ as primary key on sub, conflicts get detected for ‘col1’ and resolved as per resolution configured.

UPDATE on the other hand will not be allowed for such queries in pub (due to there being no Primary Key or Replica Identity present) and thus CDR is not applicable for update.

  • Case 2: Pub having PK and Sub having RIF
 Pub: T1 – (col1 int PK) 
 Sub: T1 - (col1 int) - REPLICA_IDENTITY_FULL 

Here duplicates are allowed on sub and thus CDR for INSERT is not applicable.

For UPDATE, the apply-worker will fail due to lack of Replica Identity Index or Primary Key Index on sub. Thus, CDR is not applicable even for UPDATE.

  • Case 3: Pub having RIF and Sub having PK
 Pub: T1 – (col1 int) - REPLICA_IDENTITY_FULL 
 Sub: T1 - (col1 int PK) 

Here, pub can have duplicates, but sub cannot.

For INSERT, conflicts for ‘col1’ get detected and resolved as per resolution configured.

For UPDATE, pub can have multiple rows updated for a value of ‘col1’ because of duplicates. OTOH, sub will have unique row due to primary key. So, in case of multiple rows update for same value of ‘col1’ on pub, only first incoming update will be applied on sub, others will be ignored because no-matching row will be found. And for the update being applied, if there is any conflict, it will be resolved as per resolution configured.

  • Case 4: Primary Key on Different Columns
 Pub: T (col1 int PK, col2 int, col3 varchar (5)) 
 Sub: T (col1 int, col2 int PK, col3 varchar (5)) 

For INSERT, since sub has primary key on ‘col2’, CDR will kick in and incoming rows having conflicting 'col2' will go through resolution.

For UPDATE, since replica-identity is not compatible, table will be marked as not-updatable i.e., application of update will fail. CDR not applicable. For table to be updatable on sub, Replica Identity’s columns on subscriber should be equal or subset of that of publisher’s RI.

  • Case 5: Primary Key on Different Columns but RI full set
 Pub: T (col1int PK, col2 int, col3 varchar (5)), RI full set. 
 Sub: T (col1 int, col2 int PK, col3 varchar (5)) 

For INSERT, since sub has primary key on ‘col2’, CDR will kick in and incoming rows having conflicting 'col2' will go through resolution.

For UPDATE, since ‘col2’ (RI of subscriber) is subset of ‘RI full’ of publisher, table will be updatable, and CDR will be applicable. The row to be updated will be found based on ‘col2’ on sub and conflicts will be resolved as per resolutions configured.