Todo
This list contains some known PostgreSQL bugs, some feature requests, and some things we are not even sure we want. Many of these items are hard, and some are perhaps impossible. If you would like to work on an item, please read the Developer FAQ first. There is also a development information page.
- "-" marks incomplete items
- "[D]" marks changes that are done and will appear in the PostgreSQL 17 release.
Over time, it may become clear that a TODO item has become outdated or otherwise determined to be either too controversial or not worth the development effort. Such items should be retired to the Not Worth Doing page.
For help on editing this list, please see Talk:Todo. Please do not add or remove items here without discussion on the mailing list.
Development Process
WARNING for Developers: Unfortunately this list does not contain all the information necessary for someone to start coding a feature. Some of these items might have become unnecessary since they were added; others might be desirable but have no clear implementation. When selecting items listed below, be prepared to first discuss the value of the feature. Do not assume that you can select one, code it, and then expect it to be committed. Always discuss the design on the Hackers list before starting to code. The flow should be:
Desirability -> Design -> Implement -> Test -> Review -> Commit
Administration
- Check for unreferenced table files created by transactions that were in-progress when the server terminated abruptly
- Allow log_min_messages to be specified on a per-module basis
- This would allow administrators to see more detailed information from specific sections of the backend, e.g. checkpoints, autovacuum, etc. Another idea is to allow separate configuration files for each module, or allow arbitrary SET commands to be passed to them. See also Logging Brainstorm.
- Maintain an approximate xid->"time of assignment" mapping. This would allow controlling the maximum effect of hot_standby_feedback on the primary, and would be useful for other features.
Tablespaces
- Allow WAL replay of CREATE TABLESPACE to work when the directory structure on the recovery computer is different from the original
Statistics Collector
- Testing pgstat via pg_regress is tricky and inefficient. Consider making a dedicated pgstat test-suite.
SSL
Standby server mode
Data Types
Arrays
Text Search
XML
- XML Canonical: Convert XML documents to canonical form to compare them. libxml2 has support for this.
- Add pretty-printed XML output option
- Parse a document and serialize it back in some indented form. libxml2 might support this.
- Allow XML shredding
- In some cases shredding could be the better option, e.g. if there is no need to keep the XML documents intact and we already have tools that understand only relational data. This would be a separate module that implements the annotated schema decomposition technique, similar to DB2 and SQL Server functionality.
Functions
- Prevent malicious functions from being executed with the permissions of unsuspecting users
- Indexed functions are safe, so VACUUM and ANALYZE are safe too. Triggers, CHECK and DEFAULT expressions, and rules are still vulnerable.
- Add array_sort function
- Probably beginner-friendly, given that we already have array_shuffle and intarray's sort function to reference. Also, nearly every type is orderable via its btree operator class.
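The desired behavior can already be approximated in SQL by unnesting and re-aggregating, which is roughly what a built-in would formalize. A minimal sketch; the function name and signature are hypothetical, not an agreed design:

```sql
-- Hypothetical helper; the name and signature are illustrative only.
-- Note: multidimensional arrays are flattened, and a NULL or empty input
-- yields NULL, which a real built-in would need to define more carefully.
CREATE FUNCTION array_sort(anyarray) RETURNS anyarray
LANGUAGE sql IMMUTABLE AS $$
    SELECT array_agg(x ORDER BY x) FROM unnest($1) AS t(x)
$$;

SELECT array_sort(ARRAY[3, 1, 2]);   -- {1,2,3}
```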
Multi-Language Support
- Change memory allocation for multi-byte functions so memory is allocated inside conversion functions
- Currently we preallocate memory based on worst-case usage.
- Add ability to use case-insensitive regular expressions on multi-byte characters
- Currently it works for UTF-8, but not other multi-byte encodings
- Improve encoding of connection startup messages sent to the client
- Currently some authentication error messages are sent in the server encoding
Views and Rules
SQL Commands
- Improve type determination of unknown (NULL or quoted literal) result columns for UNION/INTERSECT/EXCEPT
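A small example of the current behavior (exact error wording may vary by release): set operations are resolved left to right, so an all-unknown branch gets typed before later branches are seen.

```sql
-- The first UNION sees only unknown (NULL) columns and settles on text,
-- so the later integer branch no longer matches.
SELECT NULL UNION SELECT NULL UNION SELECT 1;
-- ERROR:  UNION types text and integer cannot be matched
```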
- Allow prepared transactions with temporary tables created and dropped in the same transaction, and when an ON COMMIT DELETE ROWS temporary table is accessed
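A minimal demonstration of the current restriction; the exact error wording varies across releases:

```sql
-- Requires max_prepared_transactions > 0.
BEGIN;
CREATE TEMP TABLE scratch (id int) ON COMMIT DROP;
INSERT INTO scratch VALUES (1);
PREPARE TRANSACTION 'tx1';
-- ERROR:  cannot PREPARE a transaction that has operated on temporary objects
```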
- Add DEFAULT .. AS OWNER so permission checks are done as the table owner
- This would be useful for SERIAL nextval() calls and CHECK constraints.
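A hypothetical syntax sketch of what this might look like; the AS OWNER clause below does not exist today and the grammar is not settled:

```sql
CREATE SEQUENCE orders_id_seq;
CREATE TABLE orders (
    -- "AS OWNER" is hypothetical: evaluate the DEFAULT with the table
    -- owner's permissions, so inserting roles need no USAGE on the sequence.
    id   integer DEFAULT nextval('orders_id_seq') AS OWNER,
    note text
);
```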
- Add comments on system tables/columns using the information in catalogs.sgml
- Ideally the information would be pulled from the SGML file automatically.
CREATE
- Move NOT NULL constraint information to pg_constraint
- Currently NOT NULL constraints are stored in pg_attribute without any designation of their origins, e.g. primary keys. One manifest problem is that dropping a PRIMARY KEY constraint does not remove the NOT NULL constraint designation; see the example after the links below. Another issue is that we should probably force NOT NULL to be propagated from parent tables to children, just as CHECK constraints are. (But then does dropping PRIMARY KEY affect children?)
- http://archives.postgresql.org/message-id/19768.1238680878@sss.pgh.pa.us
- http://archives.postgresql.org/message-id/200909181005.n8IA5Ris061239@wwwmaster.postgresql.org
- http://archives.postgresql.org/pgsql-hackers/2011-07/msg01223.php
- http://archives.postgresql.org/pgsql-hackers/2011-07/msg00358.php
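A demonstration of the first problem, on releases where this item is still open:

```sql
CREATE TABLE t (id int PRIMARY KEY);
ALTER TABLE t DROP CONSTRAINT t_pkey;

-- The NOT NULL marking created for the primary key is left behind:
SELECT attnotnull FROM pg_attribute
WHERE attrelid = 't'::regclass AND attname = 'id';
-- returns t (true)
```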
- Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
UPDATE
- Improve performance of EvalPlanQual mechanism that rechecks already-updated rows
- A related question is whether the EvalPlanQual mechanism even has the right semantics
- Add a mechanism that disables HOT and block-local updates when there is a significant amount of free space earlier in the table, and that automates updating the tuples at the tail of the table.
- This moves tuples more to the front of the table, increasing the chance a subsequent VACUUM operation can truncate parts of the table.
ALTER
- Allow moving system tables to other tablespaces, where possible
- Currently non-global system tables must be in the default database tablespace. Global system tables can never be moved.
CLUSTER
- Automatically maintain clustering on a table
- This might require some background daemon to maintain clustering during periods of low usage. It might also require tables to be only partially filled for easier reorganization. Another idea would be to create a merged heap/index data file so an index lookup would automatically access the heap data too. A third idea would be to store heap rows in hashed groups, perhaps using a user-supplied hash function.
COPY
GRANT/REVOKE
DECLARE CURSOR
SHOW/SET
ANALYZE
EXPLAIN
- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage
Window Functions
See TODO items for window functions.
- Support creation of user-defined window functions
- We have the ability to create new window functions written in C. Is it worth the effort to create an API that would let them be written in PL/pgSQL, etc.?
- Investigate tuplestore performance issues
- The tuplestore_in_memory() thing is just a band-aid; we ought to try to solve it properly. tuplestore_advance seems like a weak spot as well.
Integrity Constraints
Keys
Referential Integrity
- Fix problem when cascading referential triggers make changes on cascaded tables, seeing the tables in an intermediate state
Check Constraints
Server-Side Languages
- Rethink query plan caching and timing of parse analysis within SQL-language functions
- They should work more like plpgsql functions do ...
PL/pgSQL
- Allow listing of record column names, and access to record columns via variables, e.g. columns := r.(*), tval2 := r.(colname)
- Allow row and record variables to be set to NULL constants, and allow NULL tests on such variables
- Because a row is not scalar, do not allow assignment from NULL-valued scalars.
PL/Python
- Create a new restricted execution class that will allow passing function arguments in as locals. Passing them as globals means functions cannot be called recursively.
Clients
psql
- Move psql backslash database information into the backend, use mnemonic commands?
- This would allow non-psql clients to pull the same information out of the database as psql.
- Add a \set variable to control whether \s displays line numbers
- Another option is to add \# which lists line numbers, and allows command execution.
- Add option to wrap column values at whitespace boundaries, rather than chopping them at a fixed width.
- Currently, "wrapped" format chops values into fixed widths. Perhaps the word wrapping could use the same algorithm documented in the W3C specification.
pg_dump / pg_restore
- Dump security labels and comments on databases in a way that allows a dump to be loaded into a differently named database
- Add the full object name to the tag field, e.g. for operators we need '=(integer, integer)' instead of just '='.
- Avoid using platform-dependent names for locales in pg_dumpall output
- Using native locale names puts roadblocks in the way of porting a dump to another platform. One possible solution is to get CREATE DATABASE to accept some agreed-on set of locale names and fix them up to meet the platform's requirements.
pg_upgrade
- Handle large object comments
- This is difficult to do because the large object doesn't exist when --schema-only is loaded.
- Desired changes that would prevent upgrades with pg_upgrade
- 32-bit page checksums
- Add metapage to GiST indexes
- Clean up hstore's internal representation
- Remove tuple infomask bit HEAP_MOVED_OFF and HEAP_MOVED_IN
- Fix char() index trailing space handling
- Use non-collation-aware comparisons for GIN opclasses
ecpg
- Docs
- Document differences between ecpg and the SQL standard and information about the Informix-compatibility module.
libpq
- When receiving a FATAL error, remember it so that libpq does not profess ignorance about why the session was closed
Triggers
- Improve storage of deferred trigger queue
- Right now all deferred trigger information is stored in backend memory. This could exhaust memory for very large trigger queues. Possible approaches include dumping large queues into files, processing all the triggers with some kind of join or other bulk operation, or using a bitmap.
- With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY
- If the dump is known to be valid, allow foreign keys to be added without revalidating the data.
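A related existing mechanism is NOT VALID, which skips the scan of existing rows when the constraint is added (the constraint then stays marked as not validated); something along these lines is what a trusted dump could emit:

```sql
CREATE TABLE customers (id int PRIMARY KEY);
CREATE TABLE orders (customer_id int);

-- Existing rows are not rechecked; only new inserts and updates are verified.
ALTER TABLE orders
    ADD CONSTRAINT orders_customer_fk
    FOREIGN KEY (customer_id) REFERENCES customers (id) NOT VALID;
```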
- When statement-level triggers are defined on a parent table, have them fire only on the parent table, and fire child table triggers only where appropriate
Inheritance
- Allow unique indexes across inherited tables (requires multi-table indexes)
- Postgres 11 allows unique indexes across partitions if the partition key is part of the index.
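For declarative partitioning the current rule looks like this; a true multi-table index would be needed to lift the restriction in the last statement:

```sql
CREATE TABLE measurements (
    city_id int,
    logdate date,
    value   numeric
) PARTITION BY RANGE (logdate);

-- Allowed since Postgres 11: the partition key (logdate) is part of the index.
CREATE UNIQUE INDEX ON measurements (city_id, logdate);

-- Still not allowed: uniqueness on a key that omits the partition column
-- would require a true multi-table index.
-- CREATE UNIQUE INDEX ON measurements (city_id);   -- fails today
```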
- Research whether ALTER TABLE / SET SCHEMA should work on inheritance hierarchies (and thus support ONLY)
- ALTER TABLE variants sometimes support recursion and sometimes not, but this is poorly documented or not documented at all, and the ONLY marker would then be silently ignored. Clarify the documentation, and reject ONLY if it is not supported.
Indexes
- Prevent index uniqueness checks when UPDATE does not modify the column
- Uniqueness (index) checks are done during an UPDATE even if the indexed column is not modified by the UPDATE. However, HOT already short-circuits this in common cases, so more work might not be helpful.
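A minimal illustration of the case in question:

```sql
CREATE TABLE accounts (
    id      int PRIMARY KEY,
    balance numeric
);

-- "id" is not modified, yet a non-HOT update (e.g. when the new tuple must
-- go to another page) still rechecks the unique index on "id".
UPDATE accounts SET balance = balance + 1 WHERE id = 42;
```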
- Allow multiple indexes to be created concurrently, ideally via a single heap scan
- pg_restore allows parallel index builds, but it is done via subprocesses, and there is no SQL interface for this. CLUSTER could definitely benefit from this as well.
- Consider using "effective_io_concurrency" for index scans
- Currently only bitmap scans use this, which might be fine because most multi-row index scans use bitmap scans.
- Allow "loose" scans on btree indexes in which the first column has low cardinality
- Re: Loose Index Scans by Planner
- On -hackers, but using the term "skip scan" instead of "loose index scan": Index Skip Scan, and Index Skip Scan (new UniqueKeys)
- Allow "skip" scans on multi-column btree indexes This means applying techniques similar to those detailed in the Multidimensional Access Method (MDAM) paper
GIST
Hash
- Allow multi-column hash indexes
- This requires all columns to be specified for a query to use the index.
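Hypothetical usage; multi-column hash indexes are not accepted today, and because only a combined hash of all key columns would be stored, queries would need equality conditions on every indexed column:

```sql
CREATE TABLE shipments (customer_id int, status text);

-- Hypothetical: hash does not currently support multicolumn indexes.
CREATE INDEX ON shipments USING hash (customer_id, status);

-- The index could only be used with equality conditions on every column:
SELECT * FROM shipments WHERE customer_id = 7 AND status = 'open';
```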
- Write Ahead Logging for Hash Indexes
Sorting
- Allow sorts of skinny tuples to use even more available memory.
- Now that it is not limited by MaxAllocSize, don't limit by INT_MAX either.
Cache Usage
- Consider automatic caching of statements at various levels:
- Parsed query tree
- Query execute plan
- Query results
- Cached Query Plans (was: global prepared statements)
- PoC plpgsql - possibility to force custom or generic plan
- Cached/global query plans, autopreparation
- Consider allowing higher priority queries to have referenced shared buffer pages stay in memory longer
- Better regexp cache management:
- This would be fairly isolated, but would involve memory management, error handling, data structures, and GUCs.
- Each backend has a cache of compiled regular expressions. If you run `SELECT 'abc' ~ 'a.c'`, then `SELECT ident FROM pg_backend_memory_contexts WHERE name = 'RegexpMemoryContext'` will show the cached expression.
- Currently the size of the cache is fixed at MAX_CACHED_RES (32), and the array of cached expressions is searched linearly by RE_compile_and_cache(), with a performance mitigation that recently used expressions are moved to the front of the array. An application making heavy use of regular expressions might want to increase the limit, possibly expressing it using something like `SET regexp_cache_size = '1MB'`, and to index the cache with a hash table (possibly using simplehash.h) while still maintaining an LRU list (possibly using dlist).
Vacuum
- Bias FSM towards returning free space near the beginning of the heap file, in hopes that empty pages at the end can be truncated by VACUUM
Auto-vacuum
- Prevent long-lived temporary tables from causing frozen-xid advancement starvation
- The problem is that autovacuum cannot vacuum them to set frozen xids; only the session that created them can.
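An illustration of the problematic pattern, assuming a long-lived session such as one behind a connection pool:

```sql
-- A long-lived session keeps this around for days:
CREATE TEMP TABLE session_cache (k text PRIMARY KEY, v text);

-- Autovacuum cannot process it; only this session can advance its
-- relfrozenxid, for example by periodically running:
VACUUM session_cache;
-- If the session never does, datfrozenxid for the whole database is held
-- back, eventually forcing aggressive anti-wraparound vacuuming.
```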
Locking
- Fix problem when multiple subtransactions of the same outer transaction hold different types of locks, and one subtransaction aborts
Startup Time Improvements
Write-Ahead Log
- Eliminate need to write full pages to WAL before page modification
- Currently, to protect against partial disk page writes, we write full page images to WAL before they are modified so we can correct any partial page writes during recovery. These pages can also be eliminated from point-in-time archive files.
- Re: Index Scans become Seq Scans after VACUUM ANALYSE
- http://archives.postgresql.org/pgsql-hackers/2011-05/msg01191.php
- WIP double writes
- double writes
- Double-write with Fast Checksums
- double writes using "double-write buffer" approach
- http://archives.postgresql.org/pgsql-hackers/2012-10/msg01463.php
- When full page writes are off, write CRC to WAL and check file system blocks on recovery
- If CRC check fails during recovery, remember the page in case a later CRC for that page properly matches. The difficulty is that hint bits are not WAL logged, meaning a valid page might not match the earlier CRC.
- Write full pages during file system write and not when the page is modified in the buffer cache
- This allows most full page writes to happen in the background writer. It might cause problems for applying WAL on recovery into a partially-written page, but later the full page will be replaced from WAL.
- Speed WAL recovery by allowing more than one page to be prefetched
- This should be done utilizing the same infrastructure used for prefetching in general to avoid introducing complex error-prone code in WAL replay.
- Use fewer bytes to store the information in WAL records.
- The WAL record format is quite verbose, and often contains more bytes than necessary for redo and decoding. By updating the WAL format we can reduce the average WAL record size of e.g. pgbench by several percent.
- There is some active development on updating the WAL infrastructure; progress is tracked in:
- Re: RFC: WAL infrastructure issues, updates and improvements
Optimizer / Executor
- Consider adding a useful and convenient way of seeing paths rejected by the planner. OPTIMIZER_DEBUG does this to some extent, but it seemingly gets very little use and must be enabled at compile time.
- Log statements where the optimizer row estimates were dramatically different from the number of rows actually found?
- Add planner support for cardinality-reducing functions
- estimate_num_groups currently assumes (under item 2 in its header comment) that any function on an attribute won't meaningfully decrease the cardinality (number of groups) generated from that attribute, i.e. num_groups(a) ~= num_groups(f(a)). However, for truncating functions such as int4mod, date_trunc, or date_bin, this assumption is wrong and will result in plans that can significantly overestimate the number of groups.
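An example of the kind of query affected; the group-count estimate tracks the number of distinct timestamps, even though date_trunc() collapses them to one group per day:

```sql
CREATE TABLE clicks (created timestamptz, url text);

-- The estimated number of groups follows ndistinct of "created", which can
-- be orders of magnitude higher than the number of distinct days.
EXPLAIN
SELECT date_trunc('day', created) AS day, count(*)
FROM clicks
GROUP BY 1;
```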
Hashing
Background Writer
- Consider having the background writer update the transaction status hint bits before writing out the page
- Implementing this requires the background writer to have access to system catalogs and the transaction status log.
- Test to see if calling PreallocXlogFiles() from the background writer will help with WAL segment creation latency
Concurrent Use of Resources
- Do async I/O for faster random read-ahead of data
- Async I/O allows multiple I/O requests to be sent to the disk with results coming back asynchronously. The above patch is already applied as of 8.4, but it still remains to figure out how to handle plain indexscans effectively.
TOAST
Monitoring
Miscellaneous Performance
- Allow configuration of backend priorities via the operating system
- Though backend priorities make priority inversion during lock waits possible, research shows that this is not a huge problem.
- Consider Cartesian joins when both relations are needed to form an indexscan qualification for a third relation
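A sketch of the plan shape in question; if a and b are both tiny, their Cartesian product would let each probe into c use both columns of its index, but the planner does not normally consider that join order because no clause links a and b directly:

```sql
CREATE TABLE a (id int);
CREATE TABLE b (id int);
CREATE TABLE c (a_id int, b_id int, payload text);
CREATE INDEX ON c (a_id, b_id);

-- Joining a and b first (a Cartesian product) would supply both index
-- columns for each lookup into c.
SELECT *
FROM a, b, c
WHERE c.a_id = a.id
  AND c.b_id = b.id;
```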
- Consider decreasing the I/O caused by updating tuple hint bits
- Hint Bits and Write I/O
- Re: [HACKERS] Hint Bits and Write I/O
- http://archives.postgresql.org/pgsql-hackers/2010-10/msg00695.php
- http://archives.postgresql.org/pgsql-hackers/2010-11/msg00792.php
- http://archives.postgresql.org/pgsql-hackers/2011-01/msg01063.php
- http://archives.postgresql.org/pgsql-hackers/2011-03/msg01408.php
- http://archives.postgresql.org/pgsql-hackers/2011-03/msg01453.php
- Restructure truncation logic to be more resistant to failure
- This also involves not writing dirty buffers for a truncated or dropped relation
- Enhance foreign data wrappers, parallelism, partitioning, and perhaps add a global snapshot/transaction manager to allow creation of a proof-of-concept built-in sharding solution
- Ideally these enhancements and new facilities will be available to external sharding solutions as well.
Miscellaneous Other
- Allow pg_export_snapshot() to run on hot standby servers
- This will allow parallel pg_dump on such servers.
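What the existing primary-only facility looks like; the snapshot ID shown is just an example value:

```sql
-- Works on a primary today; the item is to allow the same on a hot standby.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT pg_export_snapshot();
--  pg_export_snapshot
-- ---------------------
--  00000004-0000006A-1        (example value)

-- Other sessions, e.g. parallel pg_dump workers, can then run:
-- SET TRANSACTION SNAPSHOT '00000004-0000006A-1';
```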
- Provide a way to enumerate and unregister background workers
- Right now the only way to unregister a bgworker is from within the worker with proc_exit(0), or by registering it with BGW_NEVER_RESTART
Source Code
- Consider making NAMEDATALEN more configurable
- There is demand for making 128 the default, but there are also concerns about storage and memory usage and performance. So a rearchitecting to make the storage variable-length might be preferred.
- Discussions when it was changed from 32 to 64: [4] [5] [6] [7]
- Revisiting NAMEDATALEN
- NAMEDATALEN increase because of non-latin languages (contains ideas about variable-length storage)
Windows
Wire Protocol Changes / v4 Protocol
- Let the client indicate character encoding of database names, user names, passwords, and of pre-auth error messages returned by the server
- Use compression
- Specify and implement wire protocol compression. If SSL transparent compression is used, hopefully avoid the overhead of key negotiation and encryption when SSL is configured only for compression. Note that compression has been removed from TLS 1.3, so we really need to do it ourselves.
- Update clients to use data types, typmod, schema.table.column names of result sets using new statement protocol
- Allow re-authentication
- Let the client request re-authentication as a different user mid session, for connection pools that pass through the handshake.
- Allow negotiation of encryption, STARTTLS style, rather than forcing client to decide on SSL or !SSL before connecting
- Send client the xid when it is allocated
- Lets the client later ask the server "did this commit or not?" after an indeterminate result due to a crash or connection loss
- Report xlog position in commit message
- Help enable client-side failover by providing a token clients can use to see if a commit has replayed to replicas yet
- Clarify semantics of statement_timeout in extended query protocol
- Batched and pipelined queries have unexpected behaviour with statement_timeout. The client needs to be able to specify statement boundaries with a protocol message.
Documentation
Exotic Features
- Add pre-parsing phase that converts non-ISO syntax to supported syntax
- This could allow SQL written for other databases to run without modification.
- Add features of Oracle-style packages
- A package would be a schema with session-local variables, public/private functions, and initialization functions. It is also possible to implement these capabilities in any schema and not use a separate "packages" syntax at all.