Logical replication of DDLs

This page describes some of the design aspects for Logical Replication of DDLs. This feature is majorly divided into two parts. One is Initial Schema Sync and another one is Incremental Sync aka DDL Replication. The few researched approaches around both are mentioned below.

For reference, Commitfest Entry: https://commitfest.postgresql.org/43/3595/

PGCon 2023 presentation slides: https://wiki.postgresql.org/images/d/dd/20230602-DDL_Replication.pdf

Initial Schema Sync

The first thing to discuss is how to achieve initial schema sync as that will also help us to split the functionality into various steps. We could think of the following approaches:

pg_dump client binary vs. API

pg_dump
- The pg_dump also doesn't dump all dependent objects like types, and operators but dumps objects like indexes, triggers, rules, etc. It appears that we dump the objects that are directly related to the table whereas objects like types, and operators which can be shared and used for other tables as well don't get dumped along with the table (aka by using -t option).
- Going by above, one possibility is that we split functionality like tables only by using -t option and say a new option --exclude-table-dependents. I think this would be similar to what we display in psql by using \d <table_name> and \d+ <table_name). Then as a second step, we include all directly table-dependent objects like indexes, triggers, rules, etc. Then as a third step, we support all other objects in the database like operators, types, etc.
- Cons
  - The pg_dump client library needs to be present along with the server. I think it could be a problem since pg_dump (and other client binaries) and server binary are included in different packages. Also,there could be a concern that there is no such precedent that the backend uses client binaries.
  - We might need to add a few new options (like --exclude-table-dependents) to pg_dump to get the required information. But those shouldn't be many if we follow the above kind of design idea.

API
- In this design, we generate DDLs of all types of database objects on the publisher side so that the subscriber can fetch and execute them, we would need pg_get_XXXdef() functions for each database object type. We have such functions for indexes, triggers, views, etc for now but we don't have them for other types of object types such as types, tables, etc. We would ideally need to support all missing pg_get_XXXdef() functions.
- With the individual APIs, we have a problem that they use recent catalog snapshots (they use syscache APIs which use the recent catalog snapshot) and for initial sync we want data to be dumped with transaction-level snapshots. This is required to maintain consistency with the replication of schema following the initial sync. I think this may be solvable as we have tried it in email https://www.postgresql.org/message-id/CAJpy0uBvAyGOEHGiZMCC%2BoCugi-4m4EJep-e_YR3uViChOBCfw%40mail.gmail.com.
- With the API approach, we need to have separate functions that don't have the snapshot problem for each type of object, and that can easily lead to a lot of code duplication. We also need to either extend existing pg_get_*_def functions to use transaction-snapshot (may not be feasible) or write a new one that uses transaction-snapshot.
- Now, pg_dump works without a snapshot problem for tables where it uses a query to fetch the required info. But it uses the pg_get_* function where available (e.g., views, triggers, indexes), so there, it also has the snapshot problem. So, one approach would be that change pg_dump to follow the model similar to the table instead of using separate functions. Even if we decide to have a switch like --exclude-table-dependents to initially support only tables, I think it would be a good idea to have some solution for this problem in pg_dump if we want to follow pg_dump approach.
- In general, the problems related to having separate APIs are discussed in emails https://www.postgresql.org/message-id/CAFEN2wxsDSSuOvrU03CE33ZphVLqtyh9viPp6huODCDx2UQkYA%40mail.gmail.com. and https://www.postgresql.org/message-id/11221.1179761185%40sss.pgh.pa.us.

pg_dump_library

The idea is to have a common library which can be used both at client and server side by pg_dump and DDL-Replication respectively. As a starting point, we may expose the following functions so that the backend can use them to generate DDL command.

ConnectDatabase() - connect to the target server.
getTables() - query the target server and get necessary table information. We may also expose other similar function for other objects as well, like getFuncs() ,getConstraints() etc
sortDumpableObjects() - sort the object based on the dependency
dumpTableSchema() - use the fetched information to form the DDL command. We may provide other similar functions like dumpFunc(), dumpConstraint() etc

Pros:
- In this approach, we don't need to make sure the pg_dump client library is present along with the server.
- we also don't need to add more options in pg_dump itself as we can control which DDL to generate in a more flexible way by using these APIs.

Cons:
- Refactoring: It would require some hard refactoring work to achieve this. There are serval client-side functions and data structure which we have similar server-side version (like: ConnectDatabase/typedef struct _connParams). When pushing pg_dump to server backend, we need to replace or rename these objects one by one.
- Error handling: pg_dump is a client command which doesn't use the server side standard error reporting. If we use pg_dump library in backend, we might need to rewrite the error handling of pg_dump(make it similar to what libpq does).
- Logging: The pg_dump use some client-side logging functions (pg_log_info()), and we might need to modify these functions to make it also work in backend.
- Memory management: The current pg_dump doesn't rely on memory context, it directly uses pg_malloc/pg_free which are not used in backend. We might need to replace all the allocation functions to palloc(or malloc). Also, we would need to check if there is some potential memory leak problems which is not noticeable in a client command but may cause problems in a backend.

To avoid these client/server difference, we would need to refactor the pg_dump and extract all the common code into functions and expose them which only include common logic that can be shared in both client and server.

DDL Replication (Incremental Replication)

For incremental replication, the design aspect to be discussed is how the DDL commands should be wal-logged. One potential design is by deparsing it to jsonb format with schema qualified objects. So basically we do: parse-tree-->jsonb-->jsonb string-->wal-logged(pub)-->expand back to ddl-command(sub). This approach has several benefits over 'plain ddl text + search path' approach and thus seems a good option to be adopted.

JSONB Benefits

Flexibility:

The point of the JSON-blob route is that the publisher sends a command representation that can be parsed/processed/transformed arbitrarily by the subscriber using generic rules; it should be trivial to use a JSON tool to change schema A to schema B in any arbitrary DDL command, and produce another working DDL command without having to know how to write that command specifically. So if I have a rule that "schema A there is schema B here", all DDL commands can be replayed with no further coding (without having to rely on getting the run-time search_path correct). See discussion around email https://www.postgresql.org/message-id/202204081134.6tcmf5cxl3sz%40alvherre.pgsql.

Security:

A schema-qualified json string is safer to be applied on the subscriber side. Because if the schema is different between pub/sub, replicating a plain DDL with a search path could silently cause unexpected behavior, like creating or dropping a table in a different schema. See discussion around email https://www.postgresql.org/message-id/OS0PR01MB5716768A1AE79CC77836EA3594B59%40OS0PR01MB5716.jpnprd01.prod.outlook.com.

Filtering:

If we want to support DDLs for a particular table say: Create Publication mypub For Table mytbl;. For such cases before sending the DDL command, we may need to check if DDL is for the published table, having jsonb format makes such a thing easier.

Splitting:

We may need to split commands like "ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random();" so that random() have consistent values on publisher and subscriber. It would be convenient to break commands via deparsing approach rather than via plain string.

Outfuncs

We have explored the pros/cons of using outfuncs.c in an email: https://www.postgresql.org/message-id/OS0PR01MB571684CBF660D05B63B4412C94AB9%40OS0PR01MB5716.jpnprd01.prod.outlook.com.

Ownership of objects

The ownership of the objects says the table after replication should be the same on the subscriber as it was on the publisher. As per discussion, it is sufficient to have the same name role and if the role has the privileges to create the new object on the subscriber. See https://www.postgresql.org/message-id/CAD21AoA1Rs6U8hLc6bpJkCwrYb9FF7ymaWQqdHidONbbGQmw5w%40mail.gmail.com. The responsibility of having the same roles on pub-sub is of the user. Once we have replication of global objects like roles, user intervention may not be required.

Replication of functions

The replication and further execution of replicated functions would be tricky because we override the search path to an empty string to avoid any risks. However, function execution requires a search path to access objects referenced inside the function. I think now that we have run_as_owner functionality, if run_as_owner is false, we could set the valid search path or maybe even get the search path input from the user. See discussion around email https://www.postgresql.org/message-id/CAA4eK1JrdZJPmvfMFOHTetcVgv8jqWQAcALPq8dW2P7xJbur_Q%40mail.gmail.com.

Logical replication of DDLs

Contents