PgCon 2021 Developer Unconference
This is the PGCon 2021 Unconf Page
Unconference schedule by Time and Room
|Time||Stream 1||Stream 2||Stream 3|
|Fri 11:15-12:15 EDT||Fun with WAL||Data block repair In PostgreSQL||Five Approaches to Systems Configuration|
|Fri 12:15-13:15 EDT||Lunch|
|Fri 13:15-14:15 EDT||Monitoring||Testing framework||PGXN|
|Fri 14:15-14:30 EDT||Coffee/Tea Break|
|Fri 14:30-15:30 EDT||remaining gaps in autovacuum||Temporal database progress||Referential Integrity and should it be enforced with Triggers|
|Fri 15:30-15:45 EDT||Break|
|Fri 15:45-16:45 EDT||containerized databases||ProxyCommand support||function profiling|
Five Approaches to Systems Configuration
rpm, deb, apk, pkg ... site_.tgz
chef, puppet, salt, bcfg2
ansible, salt-ssh, pyinfra
4. Distributed KV
5. Remote Execution
rset, drist, judo
2. There is an experience of pgbackrest project, which started from a similar approach, but end up with unit tests.
3. Stop events don't solve all the cases, which could be solved by proper usage of unit tests. But unit tests seem to be much harder to implement than stop/failure events. Nowadays, at least a few parties implement analogs of stop/failure events independently. Having some basic implementation stop/failure events into PostgreSQL core seems better than nothing.
Gaps in autovacuum
(in no particular order, nor order of appearance)
1. Index cleanup is expensive
Improvements have been made, but we still need to scan all indexes before a heap tuple may be fully deleted
2. Autovacuum tuning only has limited knobs: n-% or n-updated
Due to #1, n-updated becomes N^2 in cost at scale, whereas n-% introduces a great amount of bloat and a single value doesn't work for all scales, making it difficult to use for the 'growing' use case. A potential solution was proposed where a user might supply their own autovacuum trigger weight function based on the table statistics.
3. Cleanup of small amounts of dead tuples is very expensive due to the low ntup/npage fraction
Cleanup of index tuples by finding each of them in the index would be preferable in such situations.
Limiting factors are that dead tuples' TIDs are usually set to LP_DEAD and data is truncated when the page needs more space, but overall it might be feasable.
BTREE, GIST and SP-GIST should generally be doable, they only have one authorative indextuple for each heap tuple.
GIN indexes might implement a 'hot tuple deleted' optimization where tuples that are deleted before they are flushed from the GIN fastupdate space are also removed from the fastupdate space, removing the need to flush those tuples. (NOTE: this might be implemented already in some form). However, because GIN doesn't have a 1:1 relation between heap and index tuples, implementing retail index tuple deletion might not be as efficient for non-miniscule amounts of tuples if they are not in the fastupdate cache.
BRIN generally doesn't care about deleted tuples, so this optimization does not apply to BRIN.
4. Empty pages still take up space
E.g. a table of a few gigabytes may have only a few tuples at the end of the table. We might want to implement sparse files for such tables, as it would save a lot of space. Additionally, indicating empty pages as empty would also help in improving performance of all table scans, as there would be no need to load that page in e.g table scans.
5. VACUUM VERBOSE shows some stats which are indicative of bloat, but index statistics do not have such metrics
You cannot derive index bloat from the shown index statistics. pgstattuple.pgstatindex helps, but that's another full index scan.
6. VACUUM-generated statistics are not (all) persisted
Some of the metrics of pg_stat_progress_vacuum and VACUUM VERBOSE are not available anywhere in the system after the vacuum is complete. It would be very useful to have them in some persistent table, so that we don't have to manually run vacuum or need to be just in time to retrieve the statistics from the progress table when autovacuum runs.
7. Automatic reindex in autovacuum
For some indexes it might make more sense to routinely rebuild that index instead of scanning it for deleted tuples. As such, we might want to implement some mechanism that concurrently reindexes these indexes instead during the autovacuum.
ProxyCommand for psql / libpq
Add an option for libpq to execute a command and have the stdin/stdout of program be the stdin/stdout of the socket client's socket connection. Similar option exists for pgjdbc driver (custom SocketFactory) SSH: https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Proxies_and_Jump_Hosts
ProxyCommand allows you to automatically hop across multiple bastion hosts without having to manually forward the intermediate ports. For example to have psql connect to a server that's behind a firewall that can only be accessed by first connecting over SSH through a bastion. Adding this to libpq (v.s. just psql) would allow pg_dump and pg_restore to work as well.
Simplest proxy command would be "netcat" connecting to the host:port of the target server. More complicated examples could tunnel over SSH, forward a remote unix socket, or something totally different like spiped. Advantages over manually forwarding ports is that it happens on demand, they close when the parent libpq process exits, and they're only usable by that process.
The command could be specified in the connection string or sourced from an environment variable, e.g. PGPROXYCOMMAND. The command could read environment variables with the target server information (host, port, database, user, ...) or use string substitution such as "my-custom-command %h:%p" to build out full command that will be executed by $SHELL.
Some issues to keep in mind include process bookkeeping, security issues if we allow the proxy command to be part of the database URL, and how to decide whether to allow something like this from the server itself (e.g. from postgres_fdw). Might have a sepearate environment variable to control whether to allow processing of the proxy command in a URL.
Right now libpq only performs TLS handshake if the underlying connection is a TCP socket, i.e. no TLS handshake if it thinks it's a unix socket. Would need to change that so it works regardless of connection type.