From PostgreSQL wiki



Patch Reviews

Patches that I have reviewed, or am reviewing: Patch Reviews.



In Development

  • Mark a transaction (or savepoint) uncommittable
Many times we start a transaction that we never intend to commit; e.g. testing, EXPLAIN ANALYZE, etc.
Allow the user to mark a transaction that can never be committed.
Allow such marker to be placed when the transaction is being started, or while it's in progress.
Once marked uncommittable, do not allow the mark to be removed.
Any command that ends a transaction (END, COMMIT, ROLLBACK) must result in a rollback.
All of these apply to savepoints, too.
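The interface might look like the following sketch. This is hypothetical syntax; the UNCOMMITTABLE keyword and the SET TRANSACTION form are assumptions, not implemented grammar.

```sql
-- Hypothetical syntax; the UNCOMMITTABLE keyword is an assumption.
BEGIN UNCOMMITTABLE;                 -- mark when the transaction is started
EXPLAIN ANALYZE DELETE FROM big_t;   -- experiment safely
COMMIT;                              -- END/COMMIT/ROLLBACK all result in a rollback

BEGIN;
SET TRANSACTION UNCOMMITTABLE;       -- or mark while in progress; cannot be removed
ROLLBACK;
```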
  • Named Operators
The operators in Postgres can get very cryptic; guess what <@ means?
The proposal is to allow any identifier to be used as a name for an operator.
For example, the named operator :contained_in: can be used to represent the tsquery <@ tsquery operator.
See this comment on HN for at least one person's desire to identify an operator by name.
Patch submitted. See discussion here
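A sketch of how the proposal could surface in CREATE OPERATOR; today's grammar does not accept identifiers as operator names, and tsq_mcontained as the function behind tsquery <@ tsquery is my recollection, so treat both as assumptions.

```sql
-- Hypothetical: named operators are not accepted by CREATE OPERATOR today.
CREATE OPERATOR :contained_in: (
    LEFTARG  = tsquery,
    RIGHTARG = tsquery,
    FUNCTION = tsq_mcontained
);

SELECT 'a'::tsquery :contained_in: 'a & b'::tsquery;
-- equivalent to: SELECT 'a'::tsquery <@ 'a & b'::tsquery;
```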
  • PG Plan Guarantee
  • Websockets
    See also HTTP API
  • Add support for RISC-V memory-barriers
    Actively in the process of acquiring a SiFive HiFive Unmatched RISC-V Development Board
    Active as of 2022-08-10: See
    Won auction, waiting for delivery (2022-08-16)
    Received, unpacked. (2022-08-23)
    Used this video for guidance on which components to buy.
    Ordered a PSU (2022-08-30); other items, such as SSD and WiFi are optional, so will buy those after a boot-up test.
    Received the PSU (2022-09-01). Bootup test completed. Logged-in via serial console, and can verify Linux working. See pictures
    Ordered 1TB SSD. (2022-09-02)
    Ordered quieter CPU fans, and Mini-ITX enclosure. (2022-09-03)
    Received SSD, and CPU fans. (2022-09-05)
    Received Mini-ITX enclosure.
    Assembled all the components into Mini-ITX enclosure; took about 3 hours.
    Booted up again from the flash card, and observed boot messages over the serial-to-usb connection.
    Created partition on the SSD.
    Ordered Intel WiFi card (2022/09/07). See
    Intel WiFi card arrived; installed, connected to Home WiFi (2022/09/09)
    Latest Linux distro to install: Ubuntu. See
    TODO: Install Ubuntu 22.04.1 on SSD. See
    TODO: Install Nix package manager (to support various compilers simultaneously)
    TODO: Set up Postgres buildfarm
    TODO: Register as a new animal in the Buildfarm
    TODO: Add RISC-V to supported platforms

Future / Potential / Interested

  • Allow cloning / create-database-from-template while source database has active connections.
Initial version may allow only read-only connections
Eventually allow read-write operations while the template/source database is being copied
Changes to the contents of the source database (INSERT, UPDATE, VACUUM, etc.) are the main problem.
Possibly use redo logs of the source database to patch-up the destination
  • Allow users to attach and detach database
drop database X export to '/dest/path';
maybe just `export database X to '/dest/path'`, without dropping the source DB
may need to block all physical write operations
drop the database, but don't delete the database directory
freeze all relations in the database, to mark all committed rows always visible
alternatively, let the import command do this work
hence less work for a presumably busy exporting database
importing instance will need the transaction snapshot information to determine which transaction IDs had committed
move the directory to given path in the filesystem
create database X import from '/source/path';
Move the folder into $PGDATA
Give it a new OID, just like a normal `create database` does.
Verify that all relations are frozen.
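The export/import steps above, expressed in the hypothetical syntax from these notes; none of this exists today.

```sql
-- Hypothetical syntax sketch.
-- On the exporting instance:
EXPORT DATABASE x TO '/dest/path';
--   blocks physical writes, (optionally) freezes all relations,
--   then moves the database directory to the given path.

-- On the importing instance:
CREATE DATABASE x IMPORT FROM '/source/path';
--   moves the folder into $PGDATA, assigns a new OID,
--   and verifies (or performs) the freeze.
```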
  • Allow setting limits on how much of shared_buffers each database can populate
Allow setting this limit as memory size (MB, GB, etc.) or a percent of shared_buffers
In multi-tenant environments, this will help limit a single database from hogging the whole shared_buffers.
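A sketch of what the interface might look like; the setting name max_shared_buffers and the ALTER DATABASE form are assumptions, no such per-database limit exists today.

```sql
-- Hypothetical interface sketch.
ALTER DATABASE tenant_a SET max_shared_buffers = '2GB';   -- absolute size
ALTER DATABASE tenant_b SET max_shared_buffers = '10%';   -- percent of shared_buffers
```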
  • Make VACUUM FULL purge remnants of dropped columns
The 'limits' section, linked and quoted below, says dropped columns continue to occupy space
Make 'vacuum full', perhaps with an additional flag, remove all remnants of the column
Remove the column from catalog tables, and compress the bitmap in every row to reflect this removal.
Limits section:
> Columns that have been dropped from the table also contribute to the maximum column limit. Moreover,
> although the dropped column values for newly created tuples are internally marked as null in the
> tuple's null bitmap, the null bitmap also occupies space.
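The remnants are easy to see in the catalog with SQL that runs today; the proposal is for VACUUM FULL (perhaps behind a flag) to erase them.

```sql
-- Runnable today: a dropped column lingers in pg_attribute.
CREATE TABLE t (a int, b int);
ALTER TABLE t DROP COLUMN b;

SELECT attname, attisdropped
  FROM pg_attribute
 WHERE attrelid = 't'::regclass AND attnum > 0;
-- the dropped column shows up as "........pg.dropped.2........" with attisdropped = t

VACUUM FULL t;  -- the proposal: (optionally) remove the dropped-column remnants here
```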
  • Implement traffic mirroring in pgbouncer (and other connection poolers)
For many tests you need an accurate replication of production traffic
Implement a mirroring feature in the pooler that sends identical traffic to the mirror
By default, throw away _all_ the responses from the mirror.
To hear about a real-world use case for such a need, see the following presentation
Why Machine Learning for Automatically Optimizing Databases Doesn't Work by Andy Pavlo
@ time 21:57
  • Change psql's \! command to perform variable interpolation
\! currently passes all its input to a subshell, without any variable interpolation
Variable interpolation would let users feed query results into interesting subshell operations
Introduce a new boolean psql variable, off by default, that enables this new behaviour
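A sketch of the intended usage; the interpolation inside \! is the proposed behaviour, and the gating variable's name is a guess.

```sql
-- Hypothetical: :'logfile' interpolation inside \! is the proposed behaviour;
-- the variable name SHELL_INTERPOLATION is an assumption.
SELECT pg_current_logfile() AS logfile \gset
\set SHELL_INTERPOLATION on
\! tail -n 20 :'logfile'
```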
  • Add psql command-line option to start without attempting connection
Allows the user to use psql's features without first connecting to a database
Name the option --no-connect
  • Add shell-like suspend command to psql
Keyboard sequence ctrl-Z suspends psql, and returns control to invoking shell
OS shells, like Bash, have a command `suspend` to do the same thing
Exposing \suspend as a scriptable command allows users to create complex interactions with the invoking shell
e.g. The shell script can expect psql to suspend, and do some stuff before executing `fg` to resume psql's operations.
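A sketch of such a scripted interaction; \suspend does not exist today (ctrl-Z only works interactively).

```sql
-- Hypothetical: \suspend is the proposed scriptable command.
SELECT pg_backend_pid() AS pid \gset
\suspend
-- ...the invoking shell script notices the stop, does its work, then runs `fg`...
\echo resumed; backend pid is still :pid
```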
  • Create a built-in enum guc_context
It would help in queries that want to compare enums relative to each other
E.g. see last query in this blog post: Understanding Postgres Parameter Context
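Today such comparisons must hard-code the ordering of the context values; with a built-in enum, a plain `context >= 'sighup'` would work. A runnable emulation, assuming the documented ordering of pg_settings.context values:

```sql
-- Runs today: settings whose context is at or after 'sighup',
-- i.e. changeable without a server restart.
SELECT name, context
  FROM pg_settings
 WHERE array_position(ARRAY['internal','postmaster','sighup',
                            'superuser-backend','backend',
                            'superuser','user'], context)
       >= array_position(ARRAY['internal','postmaster','sighup',
                               'superuser-backend','backend',
                               'superuser','user'], 'sighup');
```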
  • Raw Export and Import of Tables
Support unplugging and plugging-in of tables across DB Instances
All current interfaces to export and import a table are very slow
These are slow since they traverse the whole stack, and sometimes require a network round trip to the client
Fastest method would be to simply copy the table segments
But that can't be done since the row-headers include MVCC info
So perform the following trickery, to raw-export a table
Copy all file segments of a table to make a duplicate table
Run VACUUM FREEZE on the duplicate table
Move the segments of the duplicate table to the export location (outside $PGDATA)
Place an appropriate CREATE TABLE command in a sibling file.
And perform the following to raw-import the table into the target DB Instance
Create the table; Use the CREATE TABLE command placed by the exporting DB Instance
Move the table segments from the staging area to $PGDATA/base/nnn/, with appropriate names
Support only individual tables, not complete hierarchies of partitions, etc.
The exported CREATE TABLE command should contain only the physical attributes of the stored data
Specifically, should not create any indexes, foreign keys, etc.
Export the associated TOAST table, as well.
Perform these operations in transactions, so that concurrent transactions don't see any effects
The exporting transaction must rollback
The importing transaction must commit
A different approach was tried and rejected
A different feature, but restricting filesystem access was one of the eventual goals
This is useful for autovacuuming customers' databases @work.
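The raw-export trickery above, sketched as pseudo-SQL; the segment copy would be internal (not SQL), and VACUUM cannot run inside a transaction today, so that step is one of the open obstacles.

```sql
-- Hypothetical sketch of raw-export.
BEGIN;
CREATE TABLE t_dup (LIKE t INCLUDING STORAGE);  -- then copy t's file segments into t_dup
VACUUM FREEZE t_dup;         -- obstacle: VACUUM cannot run inside a transaction today
-- move t_dup's segments to the export location (outside $PGDATA),
-- write the physical-attributes-only CREATE TABLE into a sibling file,
-- export the associated TOAST table as well
ROLLBACK;                    -- the exporting transaction must roll back
```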
  • Eventually Consistent (EC) Tables
EC Tables bring some of the advantages of NoSQL databases to the Postgres users.
  • User keeps their main table simple, with just a PKey
Hence they get fast responses to SQL commands
  • User creates an EC Table based off a normal table
  • System creates a table identical in structure to the original table
  • System creates a trigger on the EC Table to reject any direct DML commands.
  • ?? Does the system recreate the partitions, too, if the original table is partitioned?
No, for now. The intended goal is to keep the main table fast; partitioned main tables are not fast.
  • A Background (BG) process hooks into the WAL stream
  • The BG Process replays the changes of main table on the EC Table.
  • It considers WAL records only from successful transactions
Hence the 'eventual' in the feature's name.
  • It converts WAL records into DML commands, and applies them to the EC Table
  • User can create additional indexes on the EC Table to speed up commands on that table.
  • Potential problems
  • What to do when a DDL command changes the main table?
  • Advantages
  • The commands and queries that rely on PKey access are the fastest they can be.
  • The commands that can use slightly stale data can use the EC Table, and its indexes
  • Disadvantages
  • At least double the space consumption
No different than how NoSQL databases deal with it
  • Long running transactions can fill up WAL storage
  • Possible improvements that can help
  • Index-only Tables
  • EC Tables only on a replica.
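A sketch of the interface described above. The CREATE EVENTUALLY CONSISTENT TABLE syntax is invented; only the DML-rejection trigger resembles SQL that exists today.

```sql
-- Hypothetical interface sketch.
CREATE TABLE main (id bigint PRIMARY KEY, payload jsonb);
CREATE EVENTUALLY CONSISTENT TABLE main_ec FOR main;   -- assumed syntax

-- System-generated equivalent of the DML-rejection trigger:
CREATE FUNCTION reject_dml() RETURNS trigger AS $$
BEGIN
    RAISE 'direct DML on an EC table is not allowed';
END $$ LANGUAGE plpgsql;

CREATE TRIGGER main_ec_no_dml
    BEFORE INSERT OR UPDATE OR DELETE ON main_ec
    FOR EACH ROW EXECUTE FUNCTION reject_dml();

-- User-added secondary index, served by the EC Table:
CREATE INDEX ON main_ec ((payload->>'status'));
```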
  • tz_name data-type
    Before: functions accept a text param and then try to convert it to timezone name/abbreviation
    After: functions can simply declare they accept a tz_name, and the type-I/O functions will do the validation.
    Example: make_timestamptz_at_timezone function accepts 'text' and then converts it into TZ-name
  • pl/wasm
    Postgres extension to run WASM code
  • plpgsql-to-WASM
    Convert plpgsql to WASM, for speed, while retaining safety.
  • pg_wasm
    Postgres compiled to WebAssembly; to potentially run in Web Browsers
  • emPostgres
    Postgres as a library, embedded in other applications
  • csvPostgres
    emPostgres embedded in an application, with file_fdw enabled, to process CSV files using SQL.
    It could be an alternative to Q.
  • file.postgres
    Data directory in a single-file (think: SQLite replacement, when combined with emPostgres)
  • PGLite
    emPostgres and file.postgres, combined, will provide a SQLite alternative.
  • Muheam - Multi-headed Monster
  • Export/Import
    Enable copying relations as files between Postgres Instances
  • SQL - Engine Support
    List of SQL features and their support level in various databases; à la C++ compiler support
  • Consistent read-after-write on replica
    Enable read-after-write consistency when using read-only replicas for read-scalability
    Notify-XID + libpq + backend-wait-for-XID
    - Application writes to primary
    - Application gets XID upon commit
    - Application sends that XID when starting a new transaction on the replica
    - Replica backend waits until that XID is applied
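The flow above, sketched in SQL; pg_current_xact_id() exists today, but pg_wait_for_xid() is an assumed, not-yet-existing function.

```sql
-- On the primary:
BEGIN;
INSERT INTO orders VALUES (1, 'widget');
SELECT pg_current_xact_id() AS xid \gset   -- capture the XID to hand to the replica
COMMIT;

-- On the replica, before reading:
SELECT pg_wait_for_xid(:'xid');   -- hypothetical: backend waits until this XID is applied
SELECT * FROM orders;
```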
  • Copy-on-Write system/object snapshots
    Similar to how Filesystem (e.g. ZFS) snapshots work
  • Contribute to PostgresML
  • Optimized Mode: Chained, One-shot Transactions
    In this mode of server operation, only one transaction is allowed to run at a time, theoretically.
    This allows the transaction to run without taking any locks, hence improving latency.
    The transaction must be fully available to the server; interactivity not allowed; savepoints not allowed.
    In practice, for better throughput, multiple transactions run at the same time.
    Each transaction knows exactly which transaction is ahead of it.
    A transaction cannot be committed before its predecessor transaction.
    Optimistically, each transaction assumes the transaction ahead of it is going to COMMIT; and _not_ ROLLBACK.
    Based on that optimism
    * the transaction ignores row-versions deleted by its predecessor, and
    * considers the new row-versions created by the predecessor as committed
    - Problem: the predecessor may create many row-versions of a row;
    - How to identify which is the last row-version, so we can ignore the non-last row-versions.
    - This is currently a blocker.
    If a transaction fails (on ERROR, etc.), then all following transactions that are in-progress also fail.
    - This is the cost of optimistic transactions.
    Any transaction that starts after such failure, starts a fresh list/chain of transactions.
    When done with a row, a transaction passes the row on to its follower transaction.
    - This may be our antidote to the problem of identifying the non-last row-version.
    - A transaction _must_ only pass a row's last row-version to the follower transaction.
    - ... it _must_not_ pass any non-last row-versions of a row to the follower transaction.
    If not interested in a relation, the transaction must pass the whole relation to the next transaction.
    - This would help increase throughput, since the transactions working on unrelated sets of relations may proceed in parallel.
    As an optimization, a predecessor transaction would pass the negative-list of relations
    - Any relations not in the list are freely available to the follower to operate on.
    - The follower may add its own relations to the negative-list, and pass on the list to its follower.