PgCon 2020 Developer Unconference

From PostgreSQL wiki
Jump to navigationJump to search

This is the PGCon 2020 Unconf Page

Unconference schedule by Time and Room

Time Stream 1 Stream 2 Stream 3
Tue 11:15-12:15 On-disk Encryption Edge cases of synchronous replication in HA solutions Notes The title is "evolving data loading for data warehousing".
Tue 12:15-13:15 Lunch
Tue 13:15-14:15 Table AM Notes SQL standard stuff Which new illustrations do we need in documentation?
Tue 14:15-14:30 Coffee Break
Tue 14:30-15:30 Zedstore Competitive Benchmarking & Building a new open source benchmarking kit - Etherpad Parallel grouping sets and refactoring nodeAgg.c
Tue 15:30-15:45 Break
Tue 15:45-16:45 Adding async / direct IO support to PG Schema Management/Migration install and initial setup experience


Data Loading for Data Warehousing

- There was a brief presentation indicating areas where there have been issues - issues emerge of reliable loading with ever increasing data volumes

Features NOT implemented yet

  • parallel load, insert via multiple processes
  • DATE cache
    • cache to reduce cost of date conversion
  • UNLOGGED - without WAL
    • format input data into Postgres blocks, write them directly
    • no lock contention
    • no search for available blocks
    • detect invalid input, constraint violations
    • write data to a file
  • ROWS
    • execute a commit every /n/ rows during COPY to avoid all-or-nothing
    • if error occurs, continue from last commit
    • various mechanisms/approaches
      • avoid (e.g. - validate before trying to load)
      • drop errors into a file (if you can validate this)
      • COMMIT once in a while so that parts of the data get in and we don't have lots of dead tuples


  • on error stop
  • on error resume next
  • workers :: number of threads
  • concurrency :: how many tasks to handle data
  • retry by cutting input into 25K row groups to localize error tuples


  • report error lines and continue
    • concern of subtransaction overhead
  • batch update of indexes
    • ACCESS EXCLUSIVE lock for a new relfilenode
  • Parallel COPY (ongoing)
    • COPY parser cannot be parallelized, should be replaced
    • ring buffer to indicate split boundary of start position
      • allows workers a place to start reading data
      • there's a performance bottleneck here
    • WAL writes may be the next bottleneck
    • For DW, avoiding WAL may be worthwhile
    • No WAL
    • No copy of existing data
    • under streaming replication, recreate standby based on backup /after/ data loading
    • to recover data, empty and recreate the relation

Idea that Slony team had

  • create a "smarter COPY" protocol
  • this would allow the client to stream data, with the various merits of COPY
    • A single stream of data, but in this case, where that stream contains data for many tables
    • The stream effects flow control and avoids the need for clients to expressly allocate memory for the data being streamed
  • This stream would split data in 2 fashions
    • By table :: want to load multiple tables
      • initial protocol allows specifying tables to load
      • Table, action, columns involved
        | ID | FQTN | Action | WHERE     | Col 1 | Col 2 | Col 3 | ... |
        |  1 | tbl1 | Insert |           | col1  | col2  | col3  |     |
        |  2 | tbl1 | Delete | col1      |       |       |       |     |
        |  3 | tbl1 | Update | col1      | col2  | col3  |       |     |
        |  4 | tbl1 | Update | col1      | col4  |       |       |     |
        |  5 | tbl2 | Insert |           | col1  | col2  | col3  |     |
        |  6 | tbl2 | Delete | col2,col3 |       |       |       |     |
    • Then there would be requests to inject data
      | ID | Where | Data    |
      |  1 |       | 1,d1,d2 |
      |  1 |       | 2,d2,d4 |
      |  2 | 1     |         |
      |  5 |       | 1,2,3   |
      |  6 | 2,3   |         |
  • The "New COPY" stream would consist of a series of requests shaped like this latter part; the stream of 5 requests are:
    • Adding a new tuple for tbl1
    • Another new tuple for tbl1
    • Deleting the first tuple
    • Adding a tuple to tbl2
    • Deleting the tuple from tbl2
  • Over time, it might be necessary to add/remove additional entries to the "table control" table; several possibilities emerge:
    • Maybe the "table control" is handled as a regular table that is used as a parameter for NEW_COPY
      • create table copy_control (id serial, fqtn name, action text, where_cols string[], data_cols[]);
      • INSERT into copy_control() values (stuff);
      • NEW_COPY using copy_control from '/tmp/file-that-is-the-stream-of entries'
      • This approach requires knowing all of the tables and the shapes of requests on those tables in advance
      • If a new table comes along, it is probably necessary to terminate the stream, add new tuples to copy_control, and start another NEW_COPY stream
      • Using a table to carry the configuration is nicely "relational"
    • Alternately, maybe the NEW_COPY protocol would include "metadata" requests
      • An "ID 0" request would be used to submit the metadata to indicate the shape of a new data injection request
      • That makes it natural to add new metadata while the NEW_COPY stream is running

There is a lot of handwaving here; it seemed like an interesting idea, perhaps more generalized than what was actually needed.

In view that this mechanism would have required an enormous amount of integration effort to get it into a new version of PostgreSQL, and would have been no help at all for earlier versions, the Slony team did not proceed with implementation efforts. It is likely that the very same issues would adversely affect others that might like this sort of functionality.

Present Pain Points

  • Parser for COPY does perhaps 100x as much work to find end of record as it needs to do, it is getting to be a visible bottleneck
  • Lots of places need to reproduce foreign key logic, index management
  • Very nice if we could have AMs to do new things
    • Just do partitioning
    • Just do bulk updates to indexes

Which new illustrations do we need in documentation?

State of Art and Problems:

  • We do have a way how to add pics into documentation: [1]
  • But since we have it, we only have 2 very basic things: page structure and GIN diagramm

Ideas and TODO:

  • Ilya will publish a list of Would-be-nice-to-have pics fro discussion

Schema Management

Only a small handful of attendees for this; it mostly reverted into discussions of "war stories"

Some approaches

  • Old Timey: Throw SQL files into a directory
  • Terrible Old Timey: Include some of the schema as queries embedded in a Microsoft WORD document, but probably not all of the schema...
  • Newer Timey: Throw a shell script in as well

More modern tools

  • Sqitch
  • Liquibase
  • Flyway
  • DBSteward


  • Sqitch
    • trap is that either they must do things non-transactionally, and are error prone
    • or via transactions, so very lock-happy
  • Migrations will happily run terabyte index work
  • Slony with EXECUTE SCRIPT where that has to be hyper customized
    • Adding in a new table on the master versus replicas may have lags where app needs to be compatible with new and/or old versions
    • documenting what tables are going to have potential for pain due to locks

DB2 ESQL: A Curious Answer

  • There some means on DB2 where if the program discovered that the schema in the target environment differed from what was apparently captured in the compiled code, then "KABOOM!!!!"
  • It appears some aspects of the schema were carried inside the program, presumably compiled into object code by the ESQL compiler
  • It's a very good question to ask, "how much schema?" was captured; answer is unknown

The interesting thing to do as consequence, in modern systems; it would be neat to have programs assert things about the schema that they need. The popular tools (Sqitch, Liquibase, Flyway) tend to use a version number as a surrogate for "do we have the right schema in place"

Other approaches

Other approaches include capturing some aspects of shapes of tables

  • Which tables?
  • Does it need to capture more about the tables
    • Columns
    • Data types
  • is there a compact representation that we could run a runtime test on???
  • given pl/pgsql now can manage COMMITs?
    • does this change anything?
  • someone added a column and want an index on that column
    • what then happens?
  • if the table is very big, and this needs to rewrite everything... ouch...
  • UPDATE ... LIMIT ???
  • nice to have throttling? Could SQL have a keyword added to support automatically doing a COMMIT every 50000 tuples?