Todo

From PostgreSQL wiki

Revision as of 07:44, 14 June 2009 by Mastermind (Talk | contribs)

Jump to: navigation, search

Contents

This list contains all known PostgreSQL bugs and feature requests. If you would like to work on an item, please read the Developer FAQ first. There is also a development information page.

  • Incomplete item - marks ordinary, incomplete items
  • [E] Incomplete item - marks items that are easier to implement
  • [D] Completed item - marks changes that are done, and will appear in the next major release


For help on editing this list, please see Talk:Todo.


Administration

[D] Completed itemAllow administrators to safely terminate individual sessions either via an SQL function or SIGTERM

Incomplete itemAllow administrators to cancel multi-statement idle transactions

This allows locks to be released, but it is complex to report the cancellation back to the client.

Incomplete itemCheck for unreferenced table files created by transactions that were in-progress when the server terminated abruptly

Incomplete itemSet proper permissions on non-system schemas during db creation

Currently all schemas are owned by the super-user because they are copied from the template1 database. However, since all objects are inherited from the template database, it is not clear that setting schemas to the db owner is correct.

[D] Completed itemAdd function to report the time of the most recent server reload

Incomplete itemAllow log_min_messages to be specified on a per-module basis

This would allow administrators to see more detailed information from specific sections of the backend, e.g. checkpoints, autovacuum, etc. Another idea is to allow separate configuration files for each module, or allow arbitrary SET commands to be passed to them.

Incomplete itemSimplify ability to create partitioned tables

This would allow creation of partitioned tables without requiring creation of triggers or rules for INSERT/UPDATE/DELETE, and constraints for rapid partition selection. Options could include range and hash partition selection.

Incomplete itemAllow auto-selection of partitioned tables for min/max() operations

Incomplete itemAllow more complex user/database default GUC settings

Currently ALTER USER and ALTER DATABASE support per-user and per-database defaults. Consider adding per-user-and-database defaults so things like search_path can be defaulted for a specific user connecting to a specific database.

Incomplete itemAllow custom variables to appear in pg_settings()

Incomplete itemAllow custom variable classes that can restrict who can set the values

Incomplete itemImplement the SQL standard mechanism whereby REVOKE ROLE revokes only the privilege granted by the invoking role, and not those granted by other roles

Incomplete itemAllow SSL authentication/encryption over unix domain sockets

Incomplete itemAllow SSL key file permission checks to be optionally disabled when sharing SSL keys with other applications

[D] Completed itemAllow SSL client certificate names to be checked against the client hostname

This is already implemented in libpq/fe-secure.c::verify_peer_name_matches_certificate() but the code is commented out.. Fixed in M. Hagander commit 2008-11-13

Incomplete itemAllow SSL CRL files to be re-read during configuration file reload, rather than requiring a server restart

Unlike SSL CRT files, CRL (Certificate Revocation List) files are updated frequently

Alternatively or additionally supporting OCSP (online certificate security protocol) would provide real-time revocation discovery without reloading

Incomplete item Allow automatic selection of SSL client certificates from a certificate store

Incomplete itemAdd 'hostgss' pg_hba.conf option to allow GSS link-level encryption

Incomplete itemImprove server security options

Incomplete itemPrevent query cancel packets from being replayed by an attacker, especially when using SSL

Incomplete itemProvide a way to query the log collector subprocess to determine what the currently active log file is

Incomplete itemAllow the client to authenticate the server in a Unix-domain socket connection, e.g., using SO_PEERCRED

Configuration files

Incomplete itemAllow pg_hba.conf to specify host names along with IP addresses

Host name lookup could occur when the postmaster reads the pg_hba.conf file, or when the backend starts. Another solution would be to reverse lookup the connection IP and check that hostname against the host names in pg_hba.conf. We could also then check that the host name maps to the IP address.

Incomplete itemAllow postgresql.conf file values to be changed via an SQL API, perhaps using SET GLOBAL

Incomplete itemAllow the server to be stopped/restarted via an SQL API

[D] Completed itemIssue a warning if a change-on-restart-only postgresql.conf value is modified and the server config files are reloaded

Incomplete itemConsider normalizing fractions in postgresql.conf, perhaps using '%'

Incomplete itemAllow Kerberos to disable stripping of realms so we can check the username@realm against multiple realms

Incomplete itemAdd functions to check correctness of configuration files before they are loaded "live"

Incomplete itemImprove LDAP authentication configuration options

Incomplete itemAdd external tool to auto-tune some postgresql.conf parameters

Tablespaces

Incomplete itemAllow a database in tablespace t1 with tables created in tablespace t2 to be used as a template for a new database created with default tablespace t2

Currently all objects in the default database tablespace must have default tablespace specifications. This is because new databases are created by copying directories. If you mix default tablespace tables and tablespace-specified tables in the same directory, creating a new database from such a mixed directory would create a new database with tables that had incorrect explicit tablespaces. To fix this would require modifying pg_class in the newly copied database, which we don't currently do.

Incomplete itemAllow reporting of which objects are in which tablespaces

This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.

Incomplete itemAllow WAL replay of CREATE TABLESPACE to work when the directory structure on the recovery computer is different from the original

Incomplete itemAllow per-tablespace quotas

Statistics Collector

[D] Completed itemReduce the frequency that the statistics file is written

[D] Completed itemAllow statistics file location to be user-configured

[D] Completed itemAllow statistics collector information to be pulled from the collector process directly, rather than requiring the collector to write a filesystem file twice a second?

Incomplete itemAllow statistics last vacuum/analyze execution times to be displayed without requiring track_counts to be enabled

Incomplete itemClear table counters on TRUNCATE

Incomplete item Allow the clearing of cluster-level statistics

Point-In-Time Recovery (PITR)

Incomplete itemAllow a warm standby system to also allow read-only statements

[E] Incomplete itemCreate dump tool for write-ahead logs for use in determining transaction id for point-in-time recovery

This is useful for checking PITR recovery.

Incomplete itemAllow recovery.conf to support the same syntax as postgresql.conf, including quoting

[D] Completed itemFix server restart problem when the server was shutdown during a PITR backup

[D] Completed itemRecreate pg_xlog/archive_status/ if it doesn't exist after restoring from a PITR backup

Incomplete itemAllow archive_mode to be changed without server restart?

Data Types

Incomplete itemChange NUMERIC to enforce the maximum precision

Incomplete itemReduce storage space for small NUMERICs

Incomplete itemFix data types where equality comparison isn't intuitive, e.g. box

Incomplete itemAdd support for public SYNONYMs

Incomplete itemFix CREATE CAST on DOMAINs

Incomplete itemAllow domains to be cast

Incomplete itemMake domains work better with polymorphic functions

Incomplete itemAdd support for SQL-standard GENERATED/IDENTITY columns

Incomplete itemImprove XML support

Incomplete itemConsider placing all sequences in a single table, or create a system view

[D] Completed itemAllow the UUID type to accept non-standard formats

Incomplete itemConsider a special data type for regular expressions

Incomplete itemReduce BIT data type overhead using short varlena headers

Incomplete itemAllow xml arrays to be cast to other data types

[D] Completed itemSimplify integer cross-data-type operators

Incomplete itemAllow adding/renaming/removing enumerated values to an existing enumerated data type

Dates and Times

[D] Completed itemAllow infinite dates just like infinite timestamps

Incomplete itemAllow infinite intervals just like infinite timestamps

Incomplete itemAllow TIMESTAMP WITH TIME ZONE to store the original timezone information, either zone name or offset from UTC

If the TIMESTAMP value is stored with a time zone name, interval computations should adjust based on the time zone rules.

Incomplete itemFix SELECT '0.01 years'::interval, '0.01 months'::interval

[D] Completed itemAdd a GUC variable to allow output of interval values in ISO8601 format

Incomplete itemHave timestamp subtraction not call justify_hours()?

Incomplete itemImprove timestamptz subtraction to be DST-aware

Currently subtracting one date from another that crosses a daylight savings time adjustment can return '1 day 1 hour', but adding that back to the first date returns a time one hour in the future. This is caused by the adjustment of '25 hours' to '1 day 1 hour', and '1 day' is the same time the next day, even if daylight savings adjustments are involved.

Incomplete itemFix interval display to support values exceeding 2^31 hours

Incomplete itemAdd overflow checking to timestamp and interval arithmetic

[D] Completed itemExtend timezone code to allow 64-bit values so we can represent years beyond 2038

[D] Completed itemUse LC_TIME for localized weekday/month names, rather than LC_MESSAGES

[D] Completed itemAdd ISO INTERVAL handling

[D] Completed itemSupport ISO INTERVAL syntax if units cannot be determined from the string, and are supplied after the string

The SQL standard states that the units after the string specify the units of the string, e.g. INTERVAL '2' MINUTE should return '00:02:00'. The current behavior has the units restrict the interval value to the specified unit or unit range, INTERVAL '70' SECOND returns '00:00:10'. For syntax that isn't uniquely ISO or PG syntax, like '1' or '1:30', treat as ISO if there is a range specification clause, and as PG if there no clause is present, e.g. interpret '1:30' MINUTE TO SECOND as '1 minute 30 seconds', and interpret '1:30' as '1 hour, 30 minutes'. This makes common cases like SELECT INTERVAL '1' MONTH SQL-standard results. The SQL standard supports a limited number of unit combinations and doesn't support unit names in the string. The PostgreSQL syntax is more flexible in the range of units supported, e.g. PostgreSQL supports '1 year 1 hour', while the SQL standard does not.

[D] Completed itemAdd support for year-month syntax, INTERVAL '50-6' YEAR TO MONTH

Incomplete itemRound or truncate values to the requested precision, e.g. INTERVAL '11 months' AS YEAR should return one or zero

[E] Incomplete itemRevise the src/timezone/tznames abbreviation files:

Arrays

[D] Completed itemDelay resolution of array expression's data type so assignment coercion can be performed on empty array expressions

Incomplete itemAdd support for arrays of domains

Incomplete itemAllow single-byte header storage for array elements

Incomplete itemAdd function to detect if an array is empty

Incomplete itemImprove handling of empty arrays

Incomplete itemImprove handling of NULLs in arrays


Binary Data

Incomplete itemImprove vacuum of large objects, like contrib/vacuumlo?

Incomplete itemAdd security checks for large objects

Incomplete itemAuto-delete large objects when referencing row is deleted

contrib/lo offers this functionality.

Incomplete itemAllow read/write into TOAST values like large objects

This requires the TOAST column to be stored EXTERNAL.

Incomplete itemAdd API for 64-bit large object access

MONEY Data Type

Incomplete itemAdd locale-aware MONEY type, and support multiple currencies

Incomplete itemMONEY dumps in a locale-specific format making it difficult to restore to a system with a different locale

Incomplete itemAllow MONEY to be easily cast to/from other numeric data types

Text Search

Incomplete itemAllow dictionaries to change the token that is passed on to later dictionaries

Incomplete itemConsider a function-based API for '@@' searches

Incomplete itemImprove text search error messages

Incomplete itemConsider changing error to warning for strings larger than one megabyte

Incomplete itemtsearch and tsdicts regression tests fail in Turkish locale on glibc

Functions

Incomplete itemAllow INET subnet tests using non-constants to be indexed

Incomplete itemAllow to_date() and to_timestamp() to accept localized month names

[D] Completed itemFix to_date()-related functions to consistently issue errors

Incomplete itemAdd missing parameter handling in to_char()

Incomplete itemAllow substring/replace() to get/set bit values

Incomplete itemAllow to_char() on interval values to accumulate the highest unit requested

Some special format flag would be required to request such accumulation. Such functionality could also be added to EXTRACT. Prevent accumulation that crosses the month/day boundary because of the uneven number of days in a month.

  • to_char(INTERVAL '1 hour 5 minutes', 'MI') => 65
  • to_char(INTERVAL '43 hours 20 minutes', 'MI' ) => 2600
  • to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') => 0:1:19:20
  • to_char(INTERVAL '3 years 5 months','MM') => 41

[D] Completed itemImprove to_timestamp() handling of AM/PM, and error messages

[D] Completed itemImplement inlining of set-returning functions defined in SQL

[D] Completed itemAllow SQL-language functions to return results from RETURNING queries

Incomplete itemAllow SQL-language functions to reference parameters by parameter name

Currently SQL-language functions can only refer to dollar parameters, e.g. $1

Incomplete itemAdd SPI_gettypmod() to return the typemod for a TupleDesc

Incomplete itemEnforce typmod for function inputs, function results and parameters for spi_prepare'd statements called from PLs

Incomplete itemAllow holdable cursors in SPI

Incomplete itemTighten function permission checks

Incomplete itemFix IS OF so it matches the ISO specification, and add documentation

Incomplete itemAdd missing operators for geometric data types

Some geometric types do not have the full suite of geometric operators, e.g. box @> point

[D] Completed itemImplement Boyer-Moore searching in strpos()

Incomplete itemImplement Boyer-Moore searching in LIKE queries

Incomplete itemPrevent malicious functions from being executed with the permissions of unsuspecting users

Index functions are safe, so VACUUM and ANALYZE are safe too. Triggers, CHECK and DEFAULT expressions, and rules are still vulnerable.

Incomplete itemReduce memory usage of aggregates in set returning functions

[D] Completed itemAdd temporal versions of generate_series()

[D] Completed itemAdd array_agg() and UNNEST functions for arrays

Incomplete itemFix /contrib/ltree operator

Incomplete itemFix inconsistent precedence of =, >, and < compared to <>, >=, and <=

Incomplete itemFix regular expression bug when using complex back-references

Incomplete itemHave /contrib/dblink reuse unnamed connections

Incomplete itemAllow calling of a procedure outside a SELECT that can control the transaction state

[D] Completed itemFix all set-returning system functions so they support a wildcard target list

SELECT * FROM pg_get_keywords() works but SELECT * FROM pg_show_all_settings() does not.

Incomplete itemAdd has_sequence_privilege()

Incomplete itemImprove formatting of pg_get_viewdef() output

Multi-Language Support

Incomplete itemAdd NCHAR (as distinguished from ordinary varchar),

[D] Completed itemAllow locale to be set at database creation

Currently locale can only be set during initdb. No global tables have locale-aware columns. However, the database template used during database creation might have locale-aware indexes. The indexes would need to be reindexed to match the new locale.

Incomplete itemAllow more fine-grained collation selection; add CREATE COLLATION.

Right now the collation is fixed at database creation time.

Incomplete itemAdd a LOCALE option to CREATE DATABASE, as a shorthand

Incomplete itemSupport multiple simultaneous character sets, per SQL:2008

Incomplete itemImprove UTF8 combined character handling?

Incomplete itemAdd octet_length_server() and octet_length_client()

Incomplete itemMake octet_length_client() the same as octet_length()?

Incomplete itemFix problems with wrong runtime encoding conversion for NLS message files

Incomplete itemAdd URL to more complete multi-byte regression tests

Incomplete itemFix ILIKE and regular expressions to handle case insensitivity properly in multibyte encodings

Incomplete itemFix contrib/fuzzystrmatch to work with multibyte encodings

Incomplete itemSet client encoding based on the client operating system encoding

Currently client_encoding is set in postgresql.conf, which defaults to the server encoding.

Incomplete itemChange memory allocation for multi-byte functions so memory is allocated inside conversion functions

Currently we preallocate memory based on worst-case usage.

Incomplete itemAdd ability to use case-insensitive regular expressions on multi-byte characters

ILIKE already works with multi-byte characters

Incomplete itemImprove encoding of connection startup messages sent to the client

Currently some authentication error messages are sent in the server encoding

Incomplete itemHave pg_stat_activity display query strings in the correct client encoding

Incomplete itemMore sensible support for Unicode combining characters, normal forms

Views / Rules

Incomplete itemAutomatically create rules on views so they are updateable, per SQL:2008

We can only auto-create rules for simple views. For more complex cases users will still have to write rules manually.

Incomplete itemAdd the functionality for WITH CHECK OPTION clause of CREATE VIEW

Incomplete itemAllow VIEW/RULE recompilation when the underlying tables change

Another issue is whether underlying table changes should be reflected in the view, e.g. should SELECT * show additional columns if they are added after the view is created.

Incomplete itemMake it possible to use RETURNING together with conditional DO INSTEAD rules, such as for partitioning setups

Incomplete itemAdd the ability to automatically create materialized views

Right now materialized views require the user to create triggers on the main table to keep the summary table current. SQL syntax should be able to manager the triggers and summary table automatically. A more sophisticated implementation would automatically retrieve from the summary table when the main table is referenced, if possible.

Incomplete itemImprove ability to modify views via ALTER TABLE

Incomplete itemEnable creation of RETURNING rules on inherited tables that have dropped columns

SQL Commands

Incomplete itemAdd CORRESPONDING BY to UNION/INTERSECT/EXCEPT

Incomplete itemAdd ROLLUP, CUBE, GROUPING SETS options to GROUP BY

[E] Incomplete itemAllow SET CONSTRAINTS to be qualified by schema/table name

[D] Completed itemAdd a separate TRUNCATE permission

Currently only the owner can TRUNCATE a table because triggers are not called, and the table is locked in exclusive mode.

Incomplete itemChange TRUNCATE to operate on an inheritance hierarchy by default, and add ONLY support to affect a single table

[E] Incomplete itemFix TRUNCATE ... RESTART IDENTITY so its effect on sequences is rolled back on transaction abort

Incomplete itemAllow PREPARE of cursors

Incomplete itemAllow finer control over the caching of prepared query plans

Currently queries prepared via the libpq API are planned on first execute using the supplied parameters --- allow SQL PREPARE to do the same. Also, allow control over replanning prepared queries either manually or automatically when statistics for execute parameters differ dramatically from those used during planning.

Incomplete itemImprove logging of prepared transactions recovered during startup

[D] Completed itemImprove failure message when DROP DATABASE is used on a database that has prepared transactions

Incomplete itemAllow prepared transactions with temporary tables created and dropped in the same transaction, and when an ON COMMIT DELETE ROWS temporary table is accessed

Incomplete itemAdd a GUC variable to warn about non-standard SQL usage in queries

Incomplete itemAdd SQL-standard MERGE/REPLACE/UPSERT command

MERGE is typically used to merge two tables. REPLACE or UPSERT command does UPDATE, or on failure, INSERT. This is similar to UPDATE, then for unmatched rows, INSERT. Whether concurrent access allows modifications which could cause row loss is implementation independent. To implement this cleanly requires that the table have a unique index so duplicate checking can be easily performed. It is possible to do it without a unique index if we require the user to LOCK the table before the MERGE.

Incomplete itemAdd NOVICE output level for helpful messages like automatic sequence/index creation

Incomplete itemAdd GUC to issue notice about statements that use unjoined tables

Incomplete itemAllow EXPLAIN to identify tables that were skipped because of constraint_exclusion

Incomplete itemAllow EXPLAIN output to be more easily processed by scripts, perhaps XML

Incomplete itemEnable standard_conforming_strings by default in 8.6?

When this is done, backslash-quote should be prohibited in non-E'' strings because of possible confusion over how such strings treat backslashes. Basically, '' is always safe for a literal single quote, while \' might or might not be based on the backslash handling rules.

Incomplete itemSimplify dropping roles that have objects in several databases

Incomplete itemAllow the count returned by SELECT, etc to be represented as an int64 to allow a higher range of values

[D] Completed itemAdd SQL:2008 Common Table Expression (WITH [RECURSIVE]) clause to SELECT

Incomplete itemAdd support for WITH RECURSIVE ... CYCLE

Incomplete itemAdd DEFAULT .. AS OWNER so permission checks are done as the table owner

This would be useful for SERIAL nextval() calls and CHECK constraints.

Incomplete itemAllow DISTINCT to work in multiple-argument aggregate calls

Incomplete itemAdd column to pg_stat_activity that shows the progress of long-running commands like CREATE INDEX and VACUUM

[D] Completed itemImplement SQL:2008 window functions

Incomplete itemAllow INSERT/UPDATE ... RETURNING inside a SELECT 'FROM' clause or target list

[D] Completed itemIncrease locking when DROPing objects so dependent objects cannot get dropped while the DROP operation is happening

[D] Completed itemAllow AS in "SELECT col AS label" to be optional in certain cases

Incomplete itemAllow INSERT ... DELETE ... RETURNING, namely allow the DELETE ... RETURNING to supply values to the INSERT

Incomplete itemAdd comments on system tables/columns using the information in catalogs.sgml

Ideally the information would be pulled from the SGML file automatically.

[D] Completed itemImprove reporting of UNION type mismatches

Incomplete itemPrevent the specification of conflicting transaction read/write options

Incomplete itemSupport LATERAL subqueries

Lateral subqueries can reference columns of tables defined outside the subquery at the same level, i.e. laterally. For example, a LATERAL subquery in a FROM clause could reference tables defined in the same FROM clause. Currently only the columns of tables defined above subqueries are recognized.

CREATE

Incomplete itemAllow CREATE TABLE AS to determine column lengths for complex expressions like SELECT col1

Incomplete itemHave WITH CONSTRAINTS also create constraint indexes

Incomplete itemMove NOT NULL constraint information to pg_constraint

Currently NOT NULL constraints are stored in pg_attribute without any designation of their origins, e.g. primary keys. One manifest problem is that dropping a PRIMARY KEY constraint does not remove the NOT NULL constraint designation.

Incomplete itemPrevent concurrent CREATE TABLE from sometimes returning a cryptic error message

Incomplete itemAdd CREATE SCHEMA ... LIKE that copies a schema

Incomplete itemAdd CREATE TABLE LIKE ... INCLUDING COMMENTS

Incomplete itemHave CREATE TABLE LIKE copy column storage parameters

Incomplete itemCREATE OR REPLACE FUNCTION might leave dependent objects depending on the function in inconsistent state


Incomplete itemAllow GLOBAL temporary tables to exist as empty by default in all sessions

Incomplete itemAdd OR REPLACE to CREATE LANGUAGE

Incomplete itemAllow the creation of "distinct" types

UPDATE

ALTER

Incomplete itemHave ALTER TABLE RENAME rename SERIAL sequence names

[E] Incomplete itemAllow ALTER TABLE ... ALTER CONSTRAINT ... RENAME

Incomplete itemAdd ALTER TABLE RENAME CONSTRAINT, update index name also

Incomplete itemHave ALTER SEQUENCE RENAME rename the sequence name stored in the sequence table

Incomplete itemAdd ALTER DOMAIN to modify the underlying data type

[E] Incomplete itemAllow ALTER TABLE to change constraint deferrability and actions

Incomplete itemAdd missing object types for ALTER ... SET SCHEMA

Incomplete itemAllow ALTER TABLESPACE to move to different directories

[D] Completed itemAllow databases to be moved to different tablespaces

Incomplete itemAllow moving system tables to other tablespaces, where possible

Currently non-global system tables must be in the default database tablespace. Global system tables can never be moved.

[D] Completed itemPrevent parent tables from altering or dropping constraints like CHECK that are inherited by child tables unless CASCADE is used

[D] Completed itemPrevent child tables from altering or dropping constraints like CHECK that were inherited from the parent table

Incomplete itemHave ALTER INDEX update the name of a constraint using that index

Incomplete itemAllow column display reordering by recording a display, storage, and permanent id for every column?

Incomplete itemAllow an existing index to be marked as a table's primary key

Incomplete itemAllow ALTER TYPE on composite types to perform operations similar to ALTER TABLE

Incomplete itemDon't require table rewrite on ALTER TABLE ... ALTER COLUMN TYPE, when the old and new data types are binary compatible

CLUSTER

Incomplete itemAutomatically maintain clustering on a table

This might require some background daemon to maintain clustering during periods of low usage. It might also require tables to be only partially filled for easier reorganization. Another idea would be to create a merged heap/index data file so an index lookup would automatically access the heap data too. A third idea would be to store heap rows in hashed groups, perhaps using a user-supplied hash function.

[E] Incomplete itemAdd default clustering to system tables

To do this, determine the ideal cluster index for each system table and set the cluster setting during initdb.

[D] Completed itemAdd VERBOSE option to report tables as they are processed, like VACUUM VERBOSE

Incomplete itemImprove CLUSTER performance by sorting to reduce random I/O

COPY

Incomplete itemAllow COPY to report error lines and continue

This requires the use of a savepoint before each COPY line is processed, with ROLLBACK on COPY failure.

Incomplete itemAllow COPY on a newly-created table to skip WAL logging

On crash recovery, the table involved in the COPY would be removed or have its heap and index files truncated. One issue is that no other backend should be able to add to the table at the same time, which is something that is currently allowed. This currently is done if the table is created inside the same transaction block as the COPY because no other backends can see the table.

[D] Completed itemConsider using a ring buffer for COPY FROM

Incomplete itemAllow COPY FROM to create index entries in bulk

Incomplete itemAllow COPY in CSV mode to control whether a quoted zero-length string is treated as NULL

Currently this is always treated as a zero-length string, which generates an error when loading into an integer column

Incomplete itemImprove COPY performance

Incomplete itemAllow COPY to report errors sooner

Incomplete itemImprove bytea COPY format

GRANT/REVOKE

[D] Completed itemAllow column-level privileges

[E] Incomplete itemAllow GRANT/REVOKE permissions to be applied to all schema objects with one command

The proposed syntax is: GRANT SELECT ON ALL TABLES IN public TO phpuser; GRANT SELECT ON NEW TABLES IN public TO phpuser;

Incomplete itemAllow GRANT/REVOKE permissions to be inherited by objects based on schema permissions

Incomplete itemAllow SERIAL sequences to inherit permissions from the base table?

Incomplete itemAllow dropping of a role that has connection rights

DECLARE CURSOR

Incomplete itemPrevent DROP TABLE from dropping a table referenced by its own open cursor?

Incomplete itemProvide some guarantees about the behavior of cursors that invoke volatile functions

INSERT

Incomplete itemAllow INSERT/UPDATE of the system-generated oid value for a row

Incomplete itemIn rules, allow VALUES() to contain a mixture of 'old' and 'new' references

SHOW/SET

Incomplete itemAdd SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM ANALYZE, and CLUSTER

Incomplete itemRationalize the discrepancy between settings that use values in bytes and SHOW that returns the object count

LISTEN/NOTIFY

Incomplete itemAllow LISTEN/NOTIFY to store info in memory rather than tables?

Currently LISTEN/NOTIFY information is stored in pg_listener. Storing such information in memory would improve performance.

Incomplete itemAdd optional textual message to NOTIFY

This would allow an informational message to be added to the notify message, perhaps indicating the row modified or other custom information.

Incomplete itemAllow multiple identical NOTIFY events to always be communicated to the client, rather than sent as a single notification to the listener

Incomplete itemAllow NOTIFY in rules involving conditionals

Incomplete itemImprove LISTEN concurrency

Window Functions

See TODO items for window functions.

Incomplete itemSupport creation of user-defined window functions.

We have the ability to create new window functions written in C. Is it worth the effort to create an API that would let them be written in PL/pgsql, etc?

Incomplete itemImplement full support for window framing clauses.

The cases we support now are basically those where no row ever exits the frame as the current row advances. To do better requires some rethinking of the window aggregate support.

Incomplete itemLook at tuplestore performance issues.

The tuplestore_in_memory() thing is just a band-aid, we ought to try to solve it properly. tuplestore_advance seems like a weak spot as well.

Incomplete itemDo we really need so much duplicated code between Agg and WindowAgg?

Incomplete itemTeach planner to evaluate multiple windows in the optimal order.

Currently windows are always evaluated in the query-specified order.

Integrity Constraints

Keys

Incomplete itemAllow DEFERRABLE and end-of-statement UNIQUE constraints?

This would allow UPDATE tab SET col = col + 1 to work if col has a unique index. Currently, uniqueness checks are done while the command is being executed, rather than at the end of the statement or transaction.

Referential Integrity

Incomplete itemAdd MATCH PARTIAL referential integrity

Incomplete itemChange foreign key constraint for array -> element to mean element in array?

Incomplete itemFix problem when cascading referential triggers make changes on cascaded tables, seeing the tables in an intermediate state

Incomplete itemOptimize referential integrity checks

[E] Incomplete itemReport the schema along table name in a referential failure error message

Server-Side Languages

PL/pgSQL

Incomplete itemFix RENAME to work on variables other than OLD/NEW

Incomplete itemAllow function parameters to be passed by name, get_employee_salary(12345 AS emp_id, 2001 AS tax_year)

Incomplete itemAllow handling of %TYPE arrays, e.g. tab.col%TYPE[]

Incomplete itemAllow listing of record column names, and access to record columns via variables, e.g. columns := r.(*), tval2 := r.(colname)

Incomplete itemAdd support for SCROLL cursors

Incomplete itemAdd support for WITH HOLD cursors

Incomplete itemAllow row and record variables to be set to NULL constants, and allow NULL tests on such variables

Because a row is not scalar, do not allow assignment from NULL-valued scalars.

Incomplete itemReview handling of MOVE and FETCH

Incomplete itemImprove logic of determining if an identifier is a variable or a column name

Incomplete itemConsider keeping separate cached copies when search_path changes

Incomplete itemImprove PL/pgSQL's ability to cope with rowtypes containing dropped columns

[D] Completed itemAdd CASE capability to language (already in SQL)

[D] Completed itemAdd translations for PL/pgSQL error messages

Incomplete itemImprove handling of NULL row values vs. NULL rows

Incomplete itemRethink memory management for simple expression evaluation

[D] Completed itemConsider honoring standard_conforming_strings in function bodies

Other

Incomplete itemAdd table function support to pltcl, plpythonu

Incomplete itemAdd support for polymorphic arguments and return types to languages other than PL/PgSQL

Incomplete itemAdd capability to create and call PROCEDURES

Incomplete itemAdd support for OUT and INOUT parameters to languages other than PL/PgSQL

Incomplete itemAdd PL/PythonU tracebacks

Incomplete itemAdd support for Python 3.0 to PL/Python

Incomplete itemAllow data to be passed in native language formats, rather than only text

Clients

Incomplete itemHave pg_ctl look at PGHOST in case it is a socket directory?

Incomplete itemAllow pg_ctl to work properly with configuration files located outside the PGDATA directory

pg_ctl can not read the pid file because it isn't located in the config directory but in the PGDATA directory. The solution is to allow pg_ctl to read and understand postgresql.conf to find the data_directory value.

Incomplete itemAdd a function like pg_get_indexdef() that report more detailed index information

psql

[D] Completed itemHave psql show current values for a sequence

Incomplete itemHave psql \ds show all sequences and their settings

Incomplete itemHave \d on a sequence indicate if the sequences is owned by a table

Incomplete itemMove psql backslash database information into the backend, use mnemonic commands?

This would allow non-psql clients to pull the same information out of the database as psql.

Incomplete itemMake psql's \d commands more consistent in its handling of schemas

Incomplete itemConsistently display privilege information for all objects in psql

Incomplete itemAdd auto-expanded mode so expanded output is used if the row length is wider than the screen width.

Consider using auto-expanded mode for backslash commands like \df+.

Incomplete itemPrevent tab completion of SET TRANSACTION from querying the database and therefore preventing the transaction isolation level from being set.

Currently SET <tab> causes a database lookup to check all supported session variables. This query causes problems because setting the transaction isolation level must be the first statement of a transaction.

Incomplete itemAdd a \set variable to control whether \s displays line numbers

Another option is to add \# which lists line numbers, and allows command execution.

Incomplete itemPrevent escape string warnings when object names have backslashes

[D] Completed itemHave \d show foreign keys that reference a table's primary key

Incomplete itemHave \d show child tables that inherit from the specified parent

[D] Completed itemHave \l+ show database size, if permissions allow

Ideally it will not generate an error for invalid permissions

Incomplete itemInclude the symbolic SQLSTATE name in verbose error reports

[D] Completed itemImprove display of enums to show valid enum values

Incomplete itemAdd prompt escape to display the client and server versions

Incomplete itemAdd option to wrap column values at whitespace boundaries, rather than chopping them at a fixed width.

Currently, "wrapped" format chops values into fixed widths. Perhaps the word wrapping could use the same algorithm documented in the W3C specification.

Incomplete itemAdd "auto" expanded mode that outputs in expanded format if "wrapped" mode can't wrap the output to the screen width

Incomplete itemSupport the ReST table output format

Details about the ReST format: http://docutils.sourceforge.net/rst.html#reference-documentation

pg_dump / pg_restore

[E] Incomplete itemAdd dumping of comments on index columns and composite type columns

[E] Incomplete itemAdd full object name to the tag field. eg. for operators we need '=(integer, integer)', instead of just '='.

Incomplete itemAdd pg_dumpall custom format dumps?

Incomplete itemAvoid using platform-dependent locale names in pg_dumpall output

Using native locale names puts roadblocks in the way of porting a dump to another platform. One possible solution is to get CREATE DATABASE to accept some agreed-on set of locale names and fix them up to meet the platform's requirements.

Incomplete itemAllow selection of individual object(s) of all types, not just tables

Incomplete itemIn a selective dump, allow dumping of an object and all its dependencies

Incomplete itemAdd options like pg_restore -l and -L to pg_dump

Incomplete itemAdd support for multiple pg_restore -t options, like pg_dump

pg_restore's -t switch is less useful than pg_dump's in quite a few ways: no multiple switches, no pattern matching, no ability to pick up indexes and other dependent items for a selected table. It should be made to handle this switch just like pg_dump does.

Incomplete itemStop dumping CASCADE on DROP TYPE commands in clean mode

Incomplete itemAllow pg_dump --clean to drop roles that own objects or have privileges

tgl says: if this is about pg_dumpall, it's done as of 8.4. If it's really about pg_dump, what does it mean? pg_dump has no business dropping roles.

Incomplete itemChange pg_dump so that a comment on the dumped database is applied to the loaded database, even if the database has a different name. This will require new backend syntax, perhaps COMMENT ON CURRENT DATABASE.

Incomplete itemRemove unnecessary function pointer abstractions in pg_dump source code

Incomplete itemAllow pg_dump to utilize multiple CPUs and I/O channels by dumping multiple objects simultaneously

The difficulty with this is getting multiple dump processes to produce a single dump output file. It also would require several sessions to share the same snapshot.

[D] Completed itemAllow pg_restore to utilize multiple CPUs and I/O channels by restoring multiple objects simultaneously

This might require a pg_restore flag to indicate how many simultaneous operations should be performed. Only pg_dump's -Fc format has the necessary dependency information.

Incomplete itemTo better utilize resources, allow pg_restore to check foreign keys simultaneously, where possible

Incomplete itemAllow pg_restore to create all indexes of a table concurrently, via a single heap scan

This requires a pg_dump -Fc file because that format contains the required dependency information.

Incomplete itemAllow pg_restore to load different parts of the COPY data simultaneously

[D] Completed itemPrevent pg_dump/pg_restore from being affected by statement_timeout

Using psql to restore a pg_dump dump is also affected.

Incomplete itemRemove support for dumping from pre-7.3 servers

In 7.3 and later, we can get accurate dependency information from the server. pg_dump still contains a lot of crufty code to try to deal with the lack of dependency info in older servers, but the usefulness of maintaining that code grows small.

Incomplete itemAllow pre/data/post files when schema and data are dumped separately, for performance reasons

Incomplete itemHave pg_dump -C emit ALTER DATABASE ... SET commands after database creation

Incomplete itemAllow parallel restore of tar dumps


ecpg

Incomplete itemDocs

Document differences between ecpg and the SQL standard and information about the Informix-compatibility module.

Incomplete itemSolve cardinality > 1 for input descriptors / variables?

Incomplete itemAdd a semantic check level, e.g. check if a table really exists

Incomplete itemfix handling of DB attributes that are arrays

[D] Completed itemUse backend PREPARE/EXECUTE facility for ecpg where possible

Incomplete itemImplement SQLDA

Incomplete itemFix nested C comments

[E] Incomplete itemsqlwarn[6] should be 'W' if the PRECISION or SCALE value specified

Incomplete itemMake SET CONNECTION thread-aware, non-standard?

Incomplete itemAllow multidimensional arrays

[D] Completed itemAdd internationalized message strings

Incomplete itemImplement COPY FROM STDIN

libpq

Incomplete itemAdd PQescapeIdentifierConn()

Incomplete itemPrevent PQfnumber() from lowercasing unquoted column names

PQfnumber() should never have been doing lowercasing, but historically it has so we need a way to prevent it

Incomplete itemAllow statement results to be automatically batched to the client

Currently all statement results are transferred to the libpq client before libpq makes the results available to the application. This feature would allow the application to make use of the first result rows while the rest are transferred, or held on the server waiting for them to be requested by libpq. One complexity is that a statement like SELECT 1/col could error out mid-way through the result set.

Incomplete itemConsider disallowing multiple queries in PQexec() as an additional barrier to SQL injection attacks

Incomplete itemAdd PQexecf() that allows complex parameter substitution

Incomplete itemAdd SQLSTATE severity to PGconn return status

Incomplete itemFix libpq initialization to play nicely with applications that use libcrypto but not libssl

It's not entirely clear if we should just fix the immediate problem, or try to provide some more general solution for controlling libpq initialization.

Incomplete itemMake libpq thread-safe in programs that use fork()

This requires the use of pthread_atfork() to release global locks held in libpq

Triggers

Incomplete itemAdd deferred trigger queue file

Right now all deferred trigger information is stored in backend memory. This could exhaust memory for very large trigger queues. This item involves dumping large queues into files, or doing some kind of join to process all the triggers, or some bulk operation.

Incomplete itemAllow triggers to be disabled in only the current session.

This is currently possible by starting a multi-statement transaction, modifying the system tables, performing the desired SQL, restoring the system tables, and committing the transaction. ALTER TABLE ... TRIGGER requires a table lock so it is not ideal for this usage.

Incomplete itemWith disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY

If the dump is known to be valid, allow foreign keys to be added without revalidating the data.

Incomplete itemAllow statement-level triggers to access modified rows

Incomplete itemWhen statement-level triggers are defined on a parent table, have them fire only on the parent table, and fire child table triggers only where appropriate

Incomplete itemSupport triggers on columns

Incomplete itemAllow AFTER triggers on system tables

System tables are modified in many places in the backend without going through the executor and therefore not causing triggers to fire. To complete this item, the functions that modify system tables will have to fire triggers.

Incomplete itemTighten trigger permission checks

Incomplete itemAllow BEFORE INSERT triggers on views

[D] Completed itemAdd ability to trigger on TRUNCATE

Incomplete itemAdd database and transaction-level triggers

Incomplete itemReduce locking requirements for creating a trigger

Inheritance

Incomplete itemAllow inherited tables to inherit indexes, UNIQUE constraints, and primary/foreign keys

Incomplete itemHonor UNIQUE INDEX on base column in INSERTs/UPDATEs on inherited table, e.g. INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail

The main difficulty with this item is the problem of creating an index that can span multiple tables.

[D] Completed itemAllow SELECT ... FOR UPDATE on inherited tables

[D] Completed itemRequire all CHECK constraints to be inherited

[D] Completed itemAdd checks to prevent a CREATE RULE views on inherited tables

Incomplete itemDetermine whether ALTER TABLE / SET SCHEMA should work on inheritance hierarchies (and thus support ONLY). If yes, implement it.

Incomplete itemALTER TABLE variants sometimes support recursion and sometimes not, but this is poorly/not documented, and the ONLY marker would then be silently ignored. Clarify the documentation, and reject ONLY if it is not supported.

Indexes

Incomplete itemAdd UNIQUE capability to non-btree indexes

Incomplete itemPrevent index uniqueness checks when UPDATE does not modify the column

Uniqueness (index) checks are done when updating a column even if the column is not modified by the UPDATE. However, HOT already short-circuits this in common cases, so more work might not be helpful.

Incomplete itemAllow the creation of on-disk bitmap indexes which can be quickly combined with other bitmap indexes

Such indexes could be more compact if there are only a few distinct values. Such indexes can also be compressed. Keeping such indexes updated can be costly.

Incomplete itemAllow accurate statistics to be collected on indexes with more than one column or expression indexes, perhaps using per-index statistics

[D] Completed itemIncrease the default and maximum number of statistics targets

Incomplete itemConsider having a larger statistics target for indexed columns and expression indexes.

Incomplete itemConsider smaller indexes that record a range of values per heap page, rather than having one index entry for every heap row

This is useful if the heap is clustered by the indexed values.

Incomplete itemAdd REINDEX CONCURRENTLY, like CREATE INDEX CONCURRENTLY

This is difficult because you must upgrade to an exclusive table lock to replace the existing index file. CREATE INDEX CONCURRENTLY does not have this complication. This would allow index compaction without downtime.

Incomplete itemAllow multiple indexes to be created concurrently, ideally via a single heap scan, and have pg_restore use it

Incomplete itemConsider sorting entries before inserting into btree index

Incomplete itemAllow index scans to return matching index keys, not just the matching heap locations

Incomplete itemAllow creation of an index that can do comparisons to test if a value is between two column values

Incomplete itemConsider using "effective_io_concurrency" for index scans

  • Currently only bitmap scans use this, which might be fine because most multi-row index scans use bitmap scans.

GIST

Incomplete itemAdd more GIST index support for geometric data types

Incomplete itemAllow GIST indexes to create certain complex index types, like digital trees (see Aoki)

Incomplete itemFix performance issues in contrib/seg and contrib/cube GiST support

GIN

Incomplete itemSupport empty indexed values (such as zero-element arrays) properly

Incomplete itemBehave correctly for cases where some elements of an indexed value are NULL

Incomplete itemSupport queries that require a full scan

Hash

Incomplete itemPack hash index buckets onto disk pages more efficiently

Currently only one hash bucket can be stored on a page. Ideally several hash buckets could be stored on a single page and greater granularity used for the hash algorithm.

[D] Completed itemConsider sorting hash buckets so entries can be found using a binary search, rather than a linear scan

[D] Completed itemIn hash indexes, consider storing the hash value with or instead of the key itself

Incomplete itemAdd hash WAL logging for crash recovery

Incomplete itemAllow multi-column hash indexes

[D] Completed itemDuring index creation, pre-sort the tuples to improve build speed

Catalogs

Incomplete itemImprove performance of information_schema views

Incomplete itemImprove information_schema's entries for precision and scale

Sorting

[D] Completed itemConsider using hash buckets to do DISTINCT, rather than sorting

This would be beneficial when there are few distinct values. This is already used by GROUP BY.

Incomplete itemConsider whether duplicate keys should be sorted by block/offset

[D] Completed itemAvoid some tuple copying in sort routines

Incomplete itemConsider being smarter about memory and external files used during sorts

Incomplete itemConsider detoasting keys before sorting

Fsync

Incomplete itemDetermine optimal fdatasync/fsync, O_SYNC/O_DSYNC options

Ideally this requires a separate test program that can be run at initdb time or optionally later. Consider O_SYNC when O_DIRECT exists.

Incomplete itemAdd program to test if fsync has a delay compared to non-fsync

Incomplete itemConsider sorting writes during checkpoint

Cache Usage

Incomplete itemSpeed up COUNT(*)

We could use a fixed row count and a +/- count to follow MVCC visibility rules, or a single cached value could be used and invalidated if anyone modifies the table. Another idea is to get a count directly from a unique index, but for this to be faster than a sequential scan it must avoid access to the heap to obtain tuple visibility information.

Incomplete itemProvide a way to calculate an "estimated COUNT(*)"

Perhaps by using the optimizer's cardinality estimates or random sampling.

Incomplete itemAllow data to be pulled directly from indexes

Currently indexes do not have enough tuple visibility information to allow data to be pulled from the index without also accessing the heap. One way to allow this is to set a bit on index tuples to indicate if a tuple is currently visible to all transactions when the first valid heap lookup happens. This bit would have to be cleared when a heap tuple is expired. Another idea is to maintain a bitmap of heap pages where all rows are visible to all backends, and allow index lookups to reference that bitmap to avoid heap lookups, perhaps the same bitmap we might add someday to determine which heap pages need vacuuming. Frequently accessed bitmaps would have to be stored in shared memory. One 8k page of bitmaps could track 512MB of heap pages. A third idea would be for a heap scan to check if all rows are visible and if so set a per-table flag which can be checked by index scans. Any change to the table would have to clear the flag. To detect changes during the heap scan a counter could be set at the start and checked at the end --- if it is the same, the table has not been modified --- any table change would increment the counter.

Incomplete itemConsider automatic caching of statements at various levels:

  • Parsed query tree
  • Query execute plan
  • Query results

Incomplete itemConsider increasing internal areas (NUM_CLOG_BUFFERS) when shared buffers is increased

Incomplete itemConsider decreasing the amount of memory used by PrivateRefCount

Incomplete itemConsider allowing higher priority queries to have referenced buffer cache pages stay in memory longer

Vacuum

Incomplete itemImprove speed with indexes

For large table adjustments during VACUUM FULL, it is faster to cluster or reindex rather than update the indexes. Also, this situation tends to bloat the indexes.

Incomplete itemAuto-fill the free space map by scanning the buffer cache or by checking pages written by the background writer

[D] Completed itemCreate a bitmap of pages that need vacuuming

Instead of sequentially scanning the entire table, have the background writer or some other process record pages that have expired rows, then VACUUM can look at just those pages rather than the entire table. In the event of a system crash, the bitmap would probably be invalidated. One complexity is that index entries still have to be vacuumed, and doing this without an index scan (by using the heap values to find the index entry) might be slow and unreliable, especially for user-defined index functions.

Incomplete itemImprove tracking of total relation tuple counts now that vacuum doesn't always scan the whole heap

Incomplete itemBias FSM towards returning free space near the beginning of the heap file, in hopes that empty pages at the end can be truncated by VACUUM

Incomplete itemMake FSM return free space based on table clustering, to assist in maintaining clustering?

[D] Completed itemImprove dead row detection during multi-statement transactions usage

Incomplete itemConsider a more compact data representation for dead tuple locations within VACUUM

Incomplete itemProvide more information in order to improve user-side estimates of dead space bloat in relations

Auto-vacuum

[E] Incomplete itemIssue log message to suggest VACUUM FULL if a table is nearly empty?

[D] Completed itemImprove control of auto-vacuum

Incomplete itemPrevent long-lived temporary tables from causing frozen-xid advancement starvation

The problem is that autovacuum cannot vacuum them to set frozen xids; only the session that created them can do that.

[D] Completed itemStore per-table autovacuum settings in pg_class.reloptions.

Incomplete itemPrevent autovacuum from running if an old transaction is still running from the last vacuum

Locking

Incomplete itemFix priority ordering of read and write light-weight locks

Incomplete itemFix problem when multiple subtransactions of the same outer transaction hold different types of locks, and one subtransaction aborts

Incomplete itemAllow UPDATEs on only non-referential integrity columns not to conflict with referential integrity locks

Incomplete itemAdd idle_in_transaction_timeout GUC so locks are not held for long periods of time

Incomplete itemImprove deadlock detection when a page cleaning lock conflicts with a shared buffer that is pinned

Incomplete itemDetect deadlocks involving LockBufferForCleanup()

Incomplete itemConsider a lock timeout parameter

Incomplete itemConsider improving serialized transaction behavior to avoid anomalies

Startup Time Improvements

Incomplete itemExperiment with multi-threaded backend for backend creation

This would prevent the overhead associated with process creation. Most operating systems have trivial process creation time compared to database startup overhead, but a few operating systems (Win32, Solaris) might benefit from threading. Also explore the idea of a single session using multiple threads to execute a statement faster.

Write-Ahead Log

Incomplete itemEliminate need to write full pages to WAL before page modification

Currently, to protect against partial disk page writes, we write full page images to WAL before they are modified so we can correct any partial page writes during recovery. These pages can also be eliminated from point-in-time archive files.

Incomplete itemWhen full page writes are off, write CRC to WAL and check file system blocks on recovery

If CRC check fails during recovery, remember the page in case a later CRC for that page properly matches.

Incomplete itemWrite full pages during file system write and not when the page is modified in the buffer cache

This allows most full page writes to happen in the background writer. It might cause problems for applying WAL on recovery into a partially-written page, but later the full page will be replaced from WAL.

Incomplete itemAllow WAL traffic to be streamed to another server for stand-by replication

Incomplete itemReduce WAL traffic so only modified values are written rather than entire rows

Incomplete itemAllow WAL information to recover corrupted pg_controldata

Incomplete itemFind a way to reduce rotational delay when repeatedly writing last WAL page

Currently fsync of WAL requires the disk platter to perform a full rotation to fsync again. One idea is to write the WAL to different offsets that might reduce the rotational delay.

Incomplete itemAllow WAL logging to be turned off for a table, but the table might be dropped or truncated during crash recovery

Allow tables to bypass WAL writes and just fsync() dirty pages on commit. This should be implemented using ALTER TABLE, e.g. ALTER TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ]. Tables using non-default logging should not use referential integrity with default-logging tables. A table without dirty buffers during a crash could perhaps avoid the drop/truncate.

Incomplete itemAllow WAL logging to be turned off for a table, but the table would avoid being truncated/dropped

To do this, only a single writer can modify the table, and writes must happen only on new pages so the new pages can be removed during crash recovery. Readers can continue accessing the table. Such tables probably cannot have indexes. One complexity is the handling of indexes on TOAST tables.

Incomplete itemSpeed WAL recovery by allowing more than one page to be prefetched

This should be done utilizing the same infrastructure used for prefetching in general to avoid introducing complex error-prone code in WAL replay.

Incomplete itemImprove WAL concurrency by increasing lock granularity

Incomplete itemBe more aggressive about creating WAL files

Incomplete itemHave resource managers report the duration of their status changes

Incomplete itemMove pgfoundry's xlogdump to /contrib and have it rely more closely on the WAL backend code

Optimizer / Executor

Incomplete itemImprove selectivity functions for geometric operators

Incomplete itemPrecompile SQL functions to avoid overhead

Incomplete itemCreate utility to compute accurate random_page_cost value

Incomplete itemConsider increasing the default values of from_collapse_limit, join_collapse_limit, and/or geqo_threshold

Incomplete itemImprove ability to display optimizer analysis using OPTIMIZER_DEBUG

Incomplete itemHave EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage

Incomplete itemLog statements where the optimizer row estimates were dramatically different from the number of rows actually found?

Incomplete itemImprove how ANALYZE computes in-doubt tuples

Incomplete itemConsider compressed annealing to search for query plans

This might replace GEQO.

Incomplete itemConsider using a hash for joining to a large IN (VALUES ...) list

Incomplete itemallow single batch hash joins to preserve outer pathkeys

Incomplete item"lazy" hash tables - look up only the tuples that are actually requested

Incomplete itemavoid building the same hash table more than once during the same query

Incomplete itemavoid hashing for distinct and then re-hashing for hash join

Incomplete itemallow hashing to be used on arrays, if the element type is hashable

Background Writer

Incomplete itemConsider having the background writer update the transaction status hint bits before writing out the page

Implementing this requires the background writer to have access to system catalogs and the transaction status log.

Incomplete itemConsider adding buffers the background writer finds reusable to the free list

Incomplete itemAutomatically tune bgwriter_delay based on activity rather then using a fixed interval

Incomplete itemConsider whether increasing BM_MAX_USAGE_COUNT improves performance

Incomplete itemTest to see if calling PreallocXlogFiles() from the background writer will help with WAL segment creation latency

Concurrent Use of Resources

Incomplete itemDo async I/O for faster random read-ahead of data

Async I/O allows multiple I/O requests to be sent to the disk with results coming back asynchronously.

The above patch is already applied as of 8.4, but it still remains to figure out how to handle plain indexscans effectively.

Incomplete itemExperiment with multi-threaded backend for better I/O utilization

This would allow a single query to make use of multiple I/O channels simultaneously. One idea is to create a background reader that can pre-fetch sequential and index scan pages needed by other backends. This could be expanded to allow concurrent reads from multiple devices in a partitioned table.

Incomplete itemExperiment with multi-threaded backend for better CPU utilization

This would allow several CPUs to be used for a single query, such as for sorting or query execution.

Incomplete itemSMP scalability improvements

TOAST

Incomplete itemAllow user configuration of TOAST thresholds

Incomplete itemReduce unnecessary cases of deTOASTing

Incomplete itemReduce costs of repeat de-TOASTing of values

Miscellaneous Performance

Incomplete itemUse mmap() rather than SYSV shared memory or to write WAL files?

This would remove the requirement for SYSV SHM but would introduce portability issues. Anonymous mmap (or mmap to /dev/zero) is required to prevent I/O overhead.

Incomplete itemConsider mmap()'ing files into a backend?

Doing I/O to large tables would consume a lot of address space or require frequent mapping/unmapping. Extending the file also causes mapping problems that might require mapping only individual pages, leading to thousands of mappings. Another problem is that there is no way to _prevent_ I/O to disk from the dirty shared buffers so changes could hit disk before WAL is written.

Incomplete itemAdd a script to ask system configuration questions and tune postgresql.conf

Incomplete itemConsider ways of storing rows more compactly on disk:

  • Reduce the row header size?
  • Consider reducing on-disk varlena length from four bytes to two because a heap row cannot be more than 64k in length

Incomplete itemConsider transaction start/end performance improvements

Incomplete itemAllow configuration of backend priorities via the operating system

Though backend priorities make priority inversion during lock waits possible, research shows that this is not a huge problem.

Incomplete itemConsider increasing the minimum allowed number of shared buffers

Incomplete itemExpire published xmin for read-only and idle transactions

Incomplete itemConsider if CommandCounterIncrement() can avoid its AcceptInvalidationMessages() call

[D] Completed itemImprove performance of shared invalidation queue for multiple CPUs

Incomplete itemConsider Cartesian joins when both relations are needed to form an indexscan qualification for a third relation

Incomplete itemConsider not storing a NULL bitmap on disk if all the NULLs are trailing

Incomplete itemSort large UPDATE/DELETEs so it is done in heap order

Incomplete itemAllow one transaction to see tuples using the snapshot of another transaction

This would assist multiple backends in working together.

Incomplete itemConsider decreasing the I/O caused by updating tuple hint bits

Incomplete itemInformation schema performance improvements

Miscellaneous Other

Incomplete itemDeal with encoding issues for filenames in the server filesystem

Incomplete itemDeal with encoding issues in the output of localeconv()

Source Code

Incomplete itemAdd use of 'const' for variables in source tree

Incomplete itemMove some things from contrib into main tree

[E] Incomplete itemRemove warnings created by -Wcast-align

Incomplete itemMove platform-specific ps status display info from ps_status.c to ports

Incomplete itemAdd optional CRC checksum to heap and index pages

One difficulty is how to prevent hint bit changes from affecting the computed CRC checksum.

Incomplete itemImprove documentation to build only interfaces

Incomplete itemRemove or relicense modules that are not under the BSD license, if possible

Incomplete itemAllow cross-compiling by generating the zic database on the target system

Incomplete itemImprove NLS maintenance of libpgport messages linked onto applications

Incomplete itemImprove the module installation experience (/contrib, etc)

Incomplete itemUse UTF8 encoding for NLS messages so all server encodings can read them properly

Incomplete itemUpdate Bonjour to work with newer cross-platform SDK

Incomplete itemAllow creation of universal binaries for Darwin

Incomplete itemConsider GnuTLS if OpenSSL license becomes a problem

Incomplete itemConsider making NAMEDATALEN more configurable in future releases

[D] Completed itemUpdate our code to handle 64-bit timezone files to match the zic source code, which now uses them

[D] Completed itemHave configure choose integer datetimes by default

Incomplete itemSupport scoped IPv6 addresses

[D] Completed itemConsider allowing 64-bit integers and floats to be passed by value on 64-bit platforms

Also change 32-bit floats (float4) to be passed by value at the same time.

Incomplete itemResearch use of signals and sleep wake ups

Incomplete itemAdd automated check for invalid C++ source code constructs

Incomplete itemAllow C++ code to more easily access backend code

Incomplete itemConsider simplifying how memory context resets handle child contexts

[D] Completed itemRemove use of MAKE_PTR and MAKE_OFFSET macros

Incomplete itemCreate three versions of libpgport to simplify client code

[D] Completed itemRemove old-style routines for manipulating tuples

Incomplete itemImprove detection of shared memory segments being used by others by checking the SysV shared memory field 'nattch'

Incomplete itemImplement the non-threaded Avahi service discovery protocol

Incomplete itemFix system views like pg_stat_all_tables to use set-returning functions, rather than views of per-column functions

Incomplete itemAllow table and index WITH options to be specified via hooks, for use with plugins like GiST index methods

Incomplete itemReduce data row alignment requirements on some 64-bit systems

Incomplete itemAdd support for returning multiple result sets?

Incomplete itemRestructure TOAST internal storage format for greater flexibility

Documentation

Incomplete itemConvert single quotes to apostrophes in the PDF documentation

Incomplete itemProvide a manpage for postgresql.conf

Incomplete itemChange the manpage-generating toolchain to use the new XML-based docbook2x tools

Incomplete itemConsider changing documentation format from SGML to XML

Windows

Incomplete itemRemove configure.in check for link failure when cause is found

Incomplete itemRemove readdir() errno patch when runtime/mingwex/dirent.c rev 1.4 is released

[D] Completed itemRemove psql newline patch when we find out why mingw outputs an extra newline

Incomplete itemAllow psql to use readline once non-US code pages work with backslashes

Incomplete itemFix problem with shared memory on the Win32 Terminal Server

Incomplete itemDiagnose problem where shared memory can sometimes not be attached by postmaster children

Incomplete itemImprove signal handling

Incomplete itemConvert MSVC build system to remove most batch files

[D] Completed itemPrevent SSL from sending network packets to avoid interference with Win32 signal emulation

Incomplete itemSupport pgxs when using MSVC

Incomplete itemFix MSVC NLS support, like for to_char()

Incomplete itemFind a correct rint() substitute on Windows

Incomplete itemReduce compiler warnings on 64-bit Windows

Incomplete itemFix global namespace issues when using multiple terminal server sessions

Incomplete itemChange from the current autoconf/gmake build system to cmake

Incomplete itemImprove consistency of path separator usage

Wire Protocol Changes

Incomplete itemAllow dynamic character set handling

Incomplete itemAdd decoded type, length, precision

Incomplete itemUse compression?

Incomplete itemUpdate clients to use data types, typmod, schema.table.column names of result sets using new statement protocol

Exotic Features

Incomplete itemAdd pre-parsing phase that converts non-ISO syntax to supported syntax

This could allow SQL written for other databases to run without modification.

Incomplete itemAllow plug-in modules to emulate features from other databases

Incomplete itemAdd features of Oracle-style packages

A package would be a schema with session-local variables, public/private functions, and initialization functions. It is also possible to implement these capabilities in any schema and not use a separate "packages" syntax at all.

Incomplete itemConsider allowing control of upper/lower case folding of unquoted identifiers

Incomplete itemAdd autonomous transactions

Incomplete itemGive query progress indication

Incomplete itemRethinking our type system

Features We Do Not Want

Incomplete itemAll backends running as threads in a single process (not wanted)

This eliminates the process protection we get from the current setup. Thread creation is usually the same overhead as process creation on modern systems, so it seems unwise to use a pure threaded model.

Incomplete itemOptimizer hints (not wanted)

Optimizer hints are used to work around problems in the optimizer. We would rather have the problems reported and fixed.

Incomplete itemEmbedded server (not wanted)

While PostgreSQL clients runs fine in limited-resource environments, the server requires multiple processes and a stable pool of resources to run reliably and efficiently. Stripping down the PostgreSQL server to run in the same process address space as the client application would add too much complexity and failure cases.

Incomplete itemObfuscated function source code (not wanted)

Obfuscating function source code has minimal protective benefits because anyone with super-user access can find a way to view the code. To prevent non-super-users from viewing function source code, remove SELECT permission on pg_proc.

Personal tools