User:Simon
From PostgreSQL wiki
Simon's Work in Progress: Prioritised Todo
This list contains all known PostgreSQL bugs and feature requests. If you would like to work on an item, please read the Developer FAQ first. There is also a development information page.
- - marks ordinary, incomplete items
- [E] - marks items that are easier to implement
- [D] - marks changes that are done, and will appear in the next major release
For help on editing this list, please see Talk:Todo. Please do not add items here without discussion on the mailing list.
Administration
- Allow administrators to cancel multi-statement idle transactions
- This allows locks to be released, but it is complex to report the cancellation back to the client.
- Check for unreferenced table files created by transactions that were in-progress when the server terminated abruptly
- Set proper permissions on non-system schemas during db creation
- Currently all schemas are owned by the super-user because they are copied from the template1 database. However, since all objects are inherited from the template database, it is not clear that setting schemas to the db owner is correct.
- Allow log_min_messages to be specified on a per-module basis
- This would allow administrators to see more detailed information from specific sections of the backend, e.g. checkpoints, autovacuum, etc. Another idea is to allow separate configuration files for each module, or allow arbitrary SET commands to be passed to them. See also Logging Brainstorm.
- Simplify ability to create partitioned tables
- This would allow creation of partitioned tables without requiring creation of triggers or rules for INSERT/UPDATE/DELETE, and constraints for rapid partition selection. Options could include range and hash partition selection. See also Table partitioning
- Allow auto-selection of partitioned tables for min/max() operations
- There was a patch on -hackers from July 2009, but it has not been merged: http://archives.postgresql.org/pgsql-hackers/2009-07/msg01115.php
- [D] Allow more complex user/database default GUC settings
- Currently ALTER USER and ALTER DATABASE support per-user and per-database defaults. Consider adding per-user-and-database defaults so things like search_path can be defaulted for a specific user connecting to a specific database.
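Since this item is marked done, the implemented form can be sketched as follows (role, database, and schema names are illustrative):

```sql
-- Default search_path applied only when role 'simon' connects to
-- database 'sales'; other databases and roles are unaffected.
ALTER ROLE simon IN DATABASE sales SET search_path = myschema, public;
```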
- Allow custom variable classes that can restrict who can set the values
- The common cases (POSTMASTER, SIGHUP, and SUSET) are already handled to some extent as of 8.4. Should we mark this DONE?
- Implement the SQL standard mechanism whereby REVOKE ROLE revokes only the privilege granted by the invoking role, and not those granted by other roles
- Provide a way to query the log collector subprocess to determine what the currently active log file is
- Allow the client to authenticate the server in a Unix-domain socket connection, e.g., using SO_PEERCRED
- [D] Allow server-side enforcement of password policies
- Password checks might include password complexity or non-reuse of passwords. This facility will require the client to send password creation/changes to the server in plain-text, not MD5.
- Allow custom daemons to be automatically stopped/started along with the postmaster
- This allows easier administration of daemons like user job schedulers or replication-related daemons.
Configuration files
- Allow pg_hba.conf to specify host names along with IP addresses
- Host name lookup could occur when the postmaster reads the pg_hba.conf file, or when the backend starts. Another solution would be to reverse lookup the connection IP and check that hostname against the host names in pg_hba.conf. We could also then check that the host name maps to the IP address.
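A hypothetical pg_hba.conf entry under this proposal might replace the CIDR address with a host name (all names illustrative):

```
# TYPE  DATABASE  USER     ADDRESS              METHOD
host    mydb      myuser   client.example.com   md5
```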
- Allow Kerberos to disable stripping of realms so we can check the username@realm against multiple realms
Tablespaces
- Allow a database in tablespace t1 with tables created in tablespace t2 to be used as a template for a new database created with default tablespace t2
- Currently all objects in the default database tablespace must have default tablespace specifications. This is because new databases are created by copying directories. If you mix default tablespace tables and tablespace-specified tables in the same directory, creating a new database from such a mixed directory would create a new database with tables that had incorrect explicit tablespaces. To fix this would require modifying pg_class in the newly copied database, which we don't currently do.
- Allow reporting of which objects are in which tablespaces
- This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.
- Allow WAL replay of CREATE TABLESPACE to work when the directory structure on the recovery computer is different from the original
Statistics Collector
- Allow statistics last vacuum/analyze execution times to be displayed without requiring track_counts to be enabled
Point-In-Time Recovery (PITR)
- [E] Create dump tool for write-ahead logs for use in determining transaction id for point-in-time recovery
- This is useful for checking PITR recovery.
- [E] Expose pg_controldata via SQL interface
- Helpful for monitoring replicated databases; initial patch
SSL
- Allow SSL key file permission checks to be optionally disabled when sharing SSL keys with other applications
- Allow SSL CRL files to be re-read during configuration file reload, rather than requiring a server restart
- Unlike SSL CRT files, CRL (Certificate Revocation List) files are updated frequently. Alternatively or additionally, supporting OCSP (Online Certificate Status Protocol) would provide real-time revocation discovery without reloading
Data Types
- Add JSON (JavaScript Object Notation) data type
- This would behave similarly to the XML data type, which is stored as text but allows element lookup and conversion functions.
Domains
Dates and Times
- Allow TIMESTAMP WITH TIME ZONE to store the original timezone information, either zone name or offset from UTC
- If the TIMESTAMP value is stored with a time zone name, interval computations should adjust based on the time zone rules.
- Improve timestamptz subtraction to be DST-aware
- Currently subtracting one timestamp from another across a daylight saving time adjustment can return '1 day 1 hour', but adding that interval back to the first timestamp returns a time one hour in the future. This is caused by the adjustment of '25 hours' to '1 day 1 hour', because '1 day' means the same clock time on the next day, even when daylight saving adjustments are involved.
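A sketch of the behavior described above, using the 2009 US fall-back transition (exact interval display may vary):

```sql
SET timezone = 'America/New_York';
-- 25 elapsed hours are reported as '1 day 01:00:00':
SELECT timestamptz '2009-11-02 00:00' - timestamptz '2009-11-01 00:00';
-- Adding '1 day 01:00:00' back does not return the original endpoint:
-- '1 day' means the same wall-clock time on the next day, so the result
-- lands at 2009-11-02 01:00, one hour past 2009-11-02 00:00.
SELECT timestamptz '2009-11-01 00:00' + interval '1 day 1 hour';
```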
- [D] Revise the src/timezone/tznames abbreviation files:
- to add missing abbreviations
- to find abbreviations that can be safely promoted to the Default list
- BUG #4377: casting result of timeofday() to timestamp fails in some timezones
Arrays
Binary Data
- Allow read/write into TOAST values like large objects
- This requires the TOAST column to be stored EXTERNAL.
MONEY Data Type
Text Search
XML
- Inline ORDER BY for XMLAGG. Example: "... XMLAGG(XMLELEMENT(...) ORDER BY col1) ..." (should be made to work with all aggregate functions)
- XML Canonical: Convert XML documents to canonical form to compare them. libxml2 has support for this.
- Pretty-printing XML: Parse a document and serialize it back in some indented form. libxml2 might support this.
- In some cases shredding could be better option (if there is no need in keeping XML docs entirely; if we have already developed tools that understand only relational data; etc) -- it would be a separate module that implements annotated schema decomposition technique, similar to DB2 and SQL Server functionality.
- xpath_table needs to be implemented/implementable to get rid of contrib/xml2 [9]
Functions
- Throw an error from to_char() instead of printing a string of "#" when a number doesn't fit in the desired output format.
- discussed in "to_char, support for EEEE format"
- Allow to_char() on interval values to accumulate the highest unit requested
- Some special format flag would be required to request such accumulation. Such functionality could also be added to EXTRACT. Prevent accumulation that crosses the month/day boundary because of the uneven number of days in a month.
- to_char(INTERVAL '1 hour 5 minutes', 'MI') => 65
- to_char(INTERVAL '43 hours 20 minutes', 'MI' ) => 2600
- to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') => 0:1:19:20
- to_char(INTERVAL '3 years 5 months','MM') => 41
- Allow SQL-language functions to reference parameters by parameter name
- Currently SQL-language functions can only refer to dollar parameters, e.g. $1
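For illustration, the current positional style versus the proposed named style (function and parameter names are made up):

```sql
-- Works today: positional reference.
CREATE FUNCTION add_tax(numeric) RETURNS numeric
  AS 'SELECT $1 * 1.19' LANGUAGE sql;
-- The proposal would additionally allow something like:
--   CREATE FUNCTION add_tax(amount numeric) RETURNS numeric
--     AS 'SELECT amount * 1.19' LANGUAGE sql;
```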
- Enforce typmod for function inputs, function results and parameters for spi_prepare'd statements called from PLs
- Add missing operators for geometric data types
- Some geometric types do not have the full suite of geometric operators, e.g. box @> point
- Prevent malicious functions from being executed with the permissions of unsuspecting users
- Index functions are safe, so VACUUM and ANALYZE are safe too. Triggers, CHECK and DEFAULT expressions, and rules are still vulnerable.
Multi-Language Support
- Allow more fine-grained collation selection; add CREATE COLLATION.
- Right now the collation is fixed at database creation time.
- Re: Patch for collation using ICU
- FW: Win32 unicode vs ICU
- Re: FW: Win32 unicode vs ICU
- Proof of concept COLLATE support with patch
- For review: Initial support for COLLATE
- Proposed COLLATE implementation
- TODO item: locale per database patch (new iteration)
- Re: FW: Win32 unicode vs ICU
- Re: Fixed length data types issue
- http://archives.postgresql.org/pgsql-hackers/2008-07/msg00557.php
- Todo:Collate
- Todo:ICU
- http://archives.postgresql.org/pgsql-hackers/2008-08/msg01362.php
- http://archives.postgresql.org/pgsql-hackers/2008-09/msg00012.php
- http://archives.postgresql.org/pgsql-hackers/2008-10/msg00868.php
- Unicode collation algorithm
- Set client encoding based on the client operating system encoding
- Currently client_encoding is set in postgresql.conf, which defaults to the server encoding.
- Change memory allocation for multi-byte functions so memory is allocated inside conversion functions
- Currently we preallocate memory based on worst-case usage.
- Add ability to use case-insensitive regular expressions on multi-byte characters
- ILIKE already works with multi-byte characters
- Improve encoding of connection startup messages sent to the client
- Currently some authentication error messages are sent in the server encoding
Views / Rules
- Automatically create rules on views so they are updateable, per SQL:2008
- We can only auto-create rules for simple views. For more complex cases users will still have to write rules manually.
- Allow VIEW/RULE recompilation when the underlying tables change
- This is both difficult and controversial.
- Make it possible to use RETURNING together with conditional DO INSTEAD rules, such as for partitioning setups
- Add the ability to automatically create materialized views
- Right now materialized views require the user to create triggers on the main table to keep the summary table current. SQL syntax should be able to manage the triggers and summary table automatically. A more sophisticated implementation would automatically retrieve from the summary table when the main table is referenced, if possible. See Materialized Views for implementation details.
SQL Commands
- [E] Fix TRUNCATE ... RESTART IDENTITY so its effect on sequences is rolled back on transaction abort
- Allow finer control over the caching of prepared query plans
- Currently anonymous (un-named) queries prepared via the wire protocol are replanned every time bind parameters are supplied --- allow SQL PREPARE to do the same. Also, allow control over replanning prepared queries either manually or automatically when statistics for execute parameters differ dramatically from those used during planning.
- Allow prepared transactions with temporary tables created and dropped in the same transaction, and when an ON COMMIT DELETE ROWS temporary table is accessed
- Add SQL-standard MERGE/REPLACE/UPSERT command
- MERGE is typically used to merge two tables. REPLACE or UPSERT command does UPDATE, or on failure, INSERT. See SQL MERGE for notes on the implementation details.
- Enable standard_conforming_strings by default in 9.1?
- When this is done, backslash-quote should be prohibited in non-E'' strings because of possible confusion over how such strings treat backslashes. Basically, '' is always safe for a literal single quote, while \' might or might not be based on the backslash handling rules.
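The quoting rules at issue, assuming standard_conforming_strings is on:

```sql
SELECT 'it''s';    -- doubled quote: always a safe literal single quote
SELECT E'it\'s';   -- E'' strings keep backslash-escape handling
SELECT 'a\b';      -- backslash is a plain character here, yielding a\b;
                   -- under the old setting it would have been an escape
```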
- Allow the count returned by SELECT, etc to be represented as an int64 to allow a higher range of values
- Add DEFAULT .. AS OWNER so permission checks are done as the table owner
- This would be useful for SERIAL nextval() calls and CHECK constraints.
- Add column to pg_stat_activity that shows the progress of long-running commands like CREATE INDEX and VACUUM
- Allow INSERT/UPDATE/DELETE ... RETURNING inside a SELECT 'FROM' clause or target list
- Actually it would be saner to allow this in WITH
- Add comments on system tables/columns using the information in catalogs.sgml
- Ideally the information would be pulled from the SGML file automatically.
- Support LATERAL subqueries
- Lateral subqueries can reference columns of tables defined outside the subquery at the same level, i.e. laterally. For example, a LATERAL subquery in a FROM clause could reference tables defined in the same FROM clause. Currently only the columns of tables defined above subqueries are recognized.
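A sketch of the SQL-standard form this item targets (table and column names are invented); the subquery references f.id from the same FROM clause:

```sql
SELECT f.id, s.total
FROM   foo f,
       LATERAL (SELECT sum(b.amount) AS total
                FROM   bar b
                WHERE  b.foo_id = f.id) s;
```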
- [D] Forbid COMMENT on columns of an index
- Postgres currently allows comments to be placed on the columns of an index, but pg_dump doesn't handle them and the column names themselves are implementation-dependent.
- Add support for functional dependencies
- This would allow omitting GROUP BY columns when grouping by the primary key.
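For example, assuming t.pk is t's primary key (names invented), the SQL standard permits:

```sql
SELECT t.pk, t.name, count(*)
FROM   t JOIN u ON u.t_pk = t.pk
GROUP BY t.pk;   -- today this must be written GROUP BY t.pk, t.name
```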
CREATE
- Move NOT NULL constraint information to pg_constraint
- Currently NOT NULL constraints are stored in pg_attribute without any designation of their origins, e.g. primary keys. One manifest problem is that dropping a PRIMARY KEY constraint does not remove the NOT NULL constraint designation. Another issue is that we should probably force NOT NULL to be propagated from parent tables to children, just as CHECK constraints are. (But then does dropping PRIMARY KEY affect children?)
- CREATE OR REPLACE FUNCTION might leave dependent objects depending on the function in inconsistent state
UPDATE
ALTER
- Allow moving system tables to other tablespaces, where possible
- Currently non-global system tables must be in the default database tablespace. Global system tables can never be moved.
CLUSTER
- Automatically maintain clustering on a table
- This might require some background daemon to maintain clustering during periods of low usage. It might also require tables to be only partially filled for easier reorganization. Another idea would be to create a merged heap/index data file so an index lookup would automatically access the heap data too. A third idea would be to store heap rows in hashed groups, perhaps using a user-supplied hash function.
- [E] Add default clustering to system tables
- To do this, determine the ideal cluster index for each system table and set the cluster setting during initdb.
COPY
- Allow COPY to report error lines and continue
- This requires the use of a savepoint before each COPY line is processed, with ROLLBACK on COPY failure.
- Allow COPY on a newly-created table to skip WAL logging
- On crash recovery, the table involved in the COPY would be removed or have its heap and index files truncated. One issue is that no other backend should be able to add to the table at the same time, which is something that is currently allowed. This currently is done if the table is created inside the same transaction block as the COPY because no other backends can see the table.
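The same-transaction case described above looks like this (table and file names are illustrative); the table is invisible to other backends until COMMIT:

```sql
BEGIN;
CREATE TABLE new_data (id integer, payload text);
COPY new_data FROM '/tmp/new_data.dat';  -- WAL can be skipped here
COMMIT;
```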
- Allow COPY in CSV mode to control whether a quoted zero-length string is treated as NULL
- Currently this is always treated as a zero-length string, which generates an error when loading into an integer column
- Allow COPY to handle other number formats, e.g. the German notation. Best would be something like WITH DECIMAL ','.
GRANT/REVOKE
- [D] Allow GRANT/REVOKE permissions to be applied to all schema objects with one command
- The proposed syntax is:
  GRANT SELECT ON ALL TABLES IN public TO phpuser;
  GRANT SELECT ON NEW TABLES IN public TO phpuser;
DECLARE CURSOR
INSERT
SHOW/SET
LISTEN/NOTIFY
- [D] Allow LISTEN/NOTIFY to store info in memory rather than tables
- Currently LISTEN/NOTIFY information is stored in pg_listener. Storing such information in memory would improve performance.
Window Functions
See TODO items for window functions.
- Support creation of user-defined window functions.
- We have the ability to create new window functions written in C. Is it worth the effort to create an API that would let them be written in PL/pgsql, etc?
- Implement full support for window framing clauses.
- The cases we support now are basically those where no row ever exits the frame as the current row advances. To do better requires some rethinking of the window aggregate support.
- Look at tuplestore performance issues.
- The tuplestore_in_memory() thing is just a band-aid, we ought to try to solve it properly. tuplestore_advance seems like a weak spot as well.
Integrity Constraints
Keys
Referential Integrity
- Fix problem when cascading referential triggers make changes on cascaded tables, seeing the tables in an intermediate state
Server-Side Languages
PL/pgSQL
- [D] Allow function parameters to be passed by name, get_employee_salary(12345 AS emp_id, 2001 AS tax_year)
- Allow listing of record column names, and access to record columns via variables, e.g. columns := r.(*), tval2 := r.(colname)
- Allow row and record variables to be set to NULL constants, and allow NULL tests on such variables
- Because a row is not scalar, do not allow assignment from NULL-valued scalars.
PL/Perl
PL/Python
- Create a new restricted execution class that will allow passing function arguments in as locals. Passing them as globals means functions cannot be called recursively.
- Functions cache the input and output functions for their arguments, so the following will make PostgreSQL unhappy:
  create table users (first_name text, last_name text);
  create function user_name(user) returns text as 'mycode' language plpython;
  select user_name(user) from users;
  alter table users add column user_id integer;
  select user_name(user) from users;
  You have to drop and re-create the function(s) each time their arguments are modified (not nice), or not cache the input and output functions (slower?), or check whether the structure of the argument has been altered (is this possible, easy, quick?) and recreate the cache.
PL/Tcl
Clients
pg_ctl
- Allow pg_ctl to work properly with configuration files located outside the PGDATA directory
- pg_ctl can not read the pid file because it isn't located in the config directory but in the PGDATA directory. The solution is to allow pg_ctl to read and understand postgresql.conf to find the data_directory value.
- Have the postmaster write a random number to a file on startup that pg_ctl checks against the contents of a pg_ping response on its initial connection (without login)
- This will protect against connecting to an old instance of the postmaster in a different or deleted subdirectory.
- Modify pg_ctl behavior and exit codes to make it easier to write an LSB conforming init script
- It may be desirable to condition some of the changes on a command-line switch, to avoid breaking existing scripts. A Linux shell (sh) script is referenced which has been tested and seems to provide a high degree of conformance in multiple environments. Study of this script might suggest areas where pg_ctl could be modified to make writing an LSB conforming script easier; however, some aspects of that script would be unnecessary with other suggested changes to pg_ctl, and discussion on the lists did not reach consensus on support for all aspects of this script. Further discussion of particular changes is needed before beginning any work. These threads should be studied for other ideas on improvements:
psql
- Move psql backslash database information into the backend, use mnemonic commands?
- This would allow non-psql clients to pull the same information out of the database as psql.
- Add auto-expanded mode so expanded output is used if the row length is wider than the screen width.
- Consider using auto-expanded mode for backslash commands like \df+.
- Prevent tab completion of SET TRANSACTION from querying the database, since that query prevents the transaction isolation level from being set.
- Currently SET <tab> causes a database lookup to check all supported session variables. This query causes problems because setting the transaction isolation level must be the first statement of a transaction.
- Add a \set variable to control whether \s displays line numbers
- Another option is to add \# which lists line numbers, and allows command execution.
- Add option to wrap column values at whitespace boundaries, rather than chopping them at a fixed width.
- Currently, "wrapped" format chops values into fixed widths. Perhaps the word wrapping could use the same algorithm documented in the W3C specification.
- Add "auto" expanded mode that outputs in expanded format if "wrapped" mode can't wrap the output to the screen width
- Support the ReST table output format
- Details about the ReST format: http://docutils.sourceforge.net/rst.html#reference-documentation
pg_dump / pg_restore
- [E] Add full object name to the tag field. eg. for operators we need '=(integer, integer)', instead of just '='.
- Avoid using platform-dependent locale names in pg_dumpall output
- Using native locale names puts roadblocks in the way of porting a dump to another platform. One possible solution is to get CREATE DATABASE to accept some agreed-on set of locale names and fix them up to meet the platform's requirements.
- Add support for multiple pg_restore -t options, like pg_dump
- pg_restore's -t switch is less useful than pg_dump's in quite a few ways: no multiple switches, no pattern matching, no ability to pick up indexes and other dependent items for a selected table. It should be made to handle this switch just like pg_dump does.
- Allow pg_dump --clean to drop roles that own objects or have privileges
- tgl says: if this is about pg_dumpall, it's done as of 8.4. If it's really about pg_dump, what does it mean? pg_dump has no business dropping roles.
- Change pg_dump so that a comment on the dumped database is applied to the loaded database, even if the database has a different name. This will require new backend syntax, perhaps COMMENT ON CURRENT DATABASE.
- Allow pg_dump to utilize multiple CPUs and I/O channels by dumping multiple objects simultaneously
- The difficulty with this is getting multiple dump processes to produce a single dump output file. It also would require several sessions to share the same snapshot.
- Remove support for dumping from pre-7.3 servers
- In 7.3 and later, we can get accurate dependency information from the server. pg_dump still contains a lot of crufty code to try to deal with the lack of dependency info in older servers, but the value of maintaining that code is diminishing.
ecpg
- Docs
- Document differences between ecpg and the SQL standard and information about the Informix-compatibility module.
- [D] Implement SQLDA
- add sqlda support to ecpg in both native and compatibility mode
libpq
- Prevent PQfnumber() from lowercasing unquoted column names
- PQfnumber() should never have been doing lowercasing, but historically it has, so we need a way to prevent it
- Allow statement results to be automatically batched to the client
- Currently all statement results are transferred to the libpq client before libpq makes the results available to the application. This feature would allow the application to make use of the first result rows while the rest are transferred, or held on the server waiting for them to be requested by libpq. One complexity is that a statement like SELECT 1/col could error out mid-way through the result set.
Triggers
- Improve storage of deferred trigger queue
- Right now all deferred trigger information is stored in backend memory. This could exhaust memory for very large trigger queues. This item involves dumping large queues into files, or doing some kind of join to process all the triggers, some bulk operation, or a bitmap.
- Allow triggers to be disabled in only the current session.
- This is currently possible by starting a multi-statement transaction, modifying the system tables, performing the desired SQL, restoring the system tables, and committing the transaction. ALTER TABLE ... TRIGGER requires a table lock so it is not ideal for this usage.
- With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY
- If the dump is known to be valid, allow foreign keys to be added without revalidating the data.
- When statement-level triggers are defined on a parent table, have them fire only on the parent table, and fire child table triggers only where appropriate
- Allow AFTER triggers on system tables
- System tables are modified in many places in the backend without going through the executor and therefore not causing triggers to fire. To complete this item, the functions that modify system tables will have to fire triggers.
Inheritance
- Honor UNIQUE INDEX on base column in INSERTs/UPDATEs on inherited table, e.g. INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail
- The main difficulty with this item is the problem of creating an index that can span multiple tables.
- Determine whether ALTER TABLE / SET SCHEMA should work on inheritance hierarchies (and thus support ONLY). If yes, implement it.
- ALTER TABLE variants sometimes support recursion and sometimes not, but this is poorly documented or not documented at all, and the ONLY marker is then silently ignored. Clarify the documentation, and reject ONLY if it is not supported.
Indexes
- Prevent index uniqueness checks when UPDATE does not modify the column
- Uniqueness (index) checks are done when updating a column even if the column is not modified by the UPDATE. However, HOT already short-circuits this in common cases, so more work might not be helpful.
- Allow the creation of on-disk bitmap indexes which can be quickly combined with other bitmap indexes
- Such indexes could be more compact if there are only a few distinct values. Such indexes can also be compressed. Keeping such indexes updated can be costly.
- Allow accurate statistics to be collected on indexes with more than one column or expression indexes, perhaps using per-index statistics
- Consider smaller indexes that record a range of values per heap page, rather than having one index entry for every heap row
- This is useful if the heap is clustered by the indexed values.
- Add REINDEX CONCURRENTLY, like CREATE INDEX CONCURRENTLY
- This is difficult because you must upgrade to an exclusive table lock to replace the existing index file. CREATE INDEX CONCURRENTLY does not have this complication. This would allow index compaction without downtime.
- Allow multiple indexes to be created concurrently, ideally via a single heap scan
- pg_restore allows parallel index builds, but it is done via subprocesses, and there is no SQL interface for this.
- Consider using "effective_io_concurrency" for index scans
- Currently only bitmap scans use this, which might be fine because most multi-row index scans use bitmap scans.
GIST
GIN
Hash
- [D] Pack hash index buckets onto disk pages more efficiently
- Currently only one hash bucket can be stored on a page. Ideally several hash buckets could be stored on a single page and greater granularity used for the hash algorithm. However, the binary searching within a hash page probably renders this issue moot.
Catalogs
Sorting
Fsync
- Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
- Ideally this requires a separate test program that can be run at initdb time or optionally later. Consider O_SYNC when O_DIRECT exists.
Cache Usage
- Speed up COUNT(*)
- We could use a fixed row count and a +/- count to follow MVCC visibility rules, or a single cached value could be used and invalidated if anyone modifies the table. Another idea is to get a count directly from a unique index, but for this to be faster than a sequential scan it must avoid access to the heap to obtain tuple visibility information.
- Provide a way to calculate an "estimated COUNT(*)"
- Perhaps by using the optimizer's cardinality estimates or random sampling.
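One rough estimate is already available from the planner statistics kept in pg_class (table name is illustrative; the value is only as fresh as the last VACUUM/ANALYZE):

```sql
SELECT reltuples::bigint AS estimated_rows
FROM   pg_class
WHERE  relname = 'mytable';
```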
- Allow data to be pulled directly from indexes
- Currently indexes do not have enough tuple visibility information to allow data to be pulled from the index without also accessing the heap. Several approaches have been suggested:
- One way is to set a bit on index tuples to indicate whether a tuple is currently visible to all transactions when the first valid heap lookup happens. This bit would have to be cleared when a heap tuple is expired.
- Another idea is to maintain a bitmap of heap pages where all rows are visible to all backends, and allow index lookups to reference that bitmap to avoid heap lookups, perhaps the same bitmap we might add someday to determine which heap pages need vacuuming. Frequently accessed bitmaps would have to be stored in shared memory. One 8k page of bitmaps could track 512MB of heap pages.
- A third idea is for a heap scan to check whether all rows are visible and, if so, set a per-table flag which can be checked by index scans. Any change to the table would have to clear the flag. To detect changes during the heap scan, a counter could be set at the start and checked at the end --- if it is unchanged, the table has not been modified; any table change would increment the counter.
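The sizing claim above follows from simple arithmetic: an 8kB bitmap page holds 8192 * 8 = 65536 bits, one per heap page, and 65536 heap pages of 8kB each cover 512MB:

```sql
SELECT 8192 * 8 * 8192 AS heap_bytes_tracked;  -- 536870912 bytes = 512MB
```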
- Consider automatic caching of statements at various levels:
- Parsed query tree
- Query execute plan
- Query results
- Consider allowing higher priority queries to have referenced buffer cache pages stay in memory longer
Vacuum
- [D] Improve VACUUM FULL's speed when major data movement is needed
- For large table adjustments during VACUUM FULL, it would be faster to cluster or reindex rather than update the indexes piecemeal as it does now. Also, this behavior tends to bloat the indexes.
- Clean up VACUUM FULL's klugy transaction management
- VACUUM FULL marks its transaction committed before it's really done, which means a PANIC if it fails after that point. This needs to be split into two transactions.
- Auto-fill the free space map by scanning the buffer cache or by checking pages written by the background writer
- Consider having single-page pruning update the visibility map
- https://commitfest.postgresql.org/action/patch_view?id=75
- http://archives.postgresql.org/pgsql-hackers/2010-02/msg02344.php
- Bias FSM towards returning free space near the beginning of the heap file, in hopes that empty pages at the end can be truncated by VACUUM
Auto-vacuum
- Prevent long-lived temporary tables from causing frozen-xid advancement starvation
- The problem is that autovacuum cannot vacuum them to set frozen xids; only the session that created them can do that.
Locking
- Fix problem when multiple subtransactions of the same outer transaction hold different types of locks, and one subtransaction aborts
- Allow UPDATEs on only non-referential integrity columns not to conflict with referential integrity locks
Startup Time Improvements
- Experiment with multi-threaded backend for backend creation
- This would prevent the overhead associated with process creation. Most operating systems have trivial process creation time compared to database startup overhead, but a few operating systems (Win32, Solaris) might benefit from threading. Also explore the idea of a single session using multiple threads to execute a statement faster.
Write-Ahead Log
- Eliminate need to write full pages to WAL before page modification
- Currently, to protect against partial disk page writes, we write full page images to WAL before they are modified so we can correct any partial page writes during recovery. These pages can also be eliminated from point-in-time archive files.
- When full page writes are off, write CRC to WAL and check file system blocks on recovery
- If CRC check fails during recovery, remember the page in case a later CRC for that page properly matches.
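The "remember the page" logic can be sketched as a small bookkeeping pass over the recovery stream: a failed CRC marks the page suspect, and a later matching CRC for the same page clears it, because the later consistent image proves the page was rewritten further on in the WAL. This is hypothetical logic, not PostgreSQL's actual recovery code:

```python
import zlib


def replay_with_crc(records):
    """Sketch of the proposed recovery-time CRC check.

    records is a list of (page_no, page_data, expected_crc) tuples in
    WAL order. A page whose CRC fails is remembered as a possible
    partial write; a later record whose CRC matches for the same page
    clears the suspicion. Pages still suspect at the end are reported.
    """
    suspect = set()
    for page_no, data, expected_crc in records:
        if zlib.crc32(data) != expected_crc:
            suspect.add(page_no)      # possible torn page
        else:
            suspect.discard(page_no)  # later consistent image seen
    return suspect
```

Only pages that remain suspect after the full replay would need to be treated as genuinely damaged.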
- Write full pages during file system write and not when the page is modified in the buffer cache
- This allows most full page writes to happen in the background writer. It might cause problems for applying WAL on recovery into a partially-written page, but later the full page will be replaced from WAL.
- Find a way to reduce rotational delay when repeatedly writing last WAL page
- Currently, each fsync of the WAL requires the disk platter to complete a full rotation before the same page can be fsync'd again. One idea is to write the WAL at different offsets to reduce this rotational delay.
- Allow WAL logging to be turned off for a table, but the table might be dropped or truncated during crash recovery
- Allow tables to bypass WAL writes and just fsync() dirty pages on commit. This should be implemented using ALTER TABLE, e.g. ALTER TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ]. Tables using non-default logging should not use referential integrity with default-logging tables. A table without dirty buffers during a crash could perhaps avoid the drop/truncate.
- Allow WAL logging to be turned off for a table, but the table would avoid being truncated/dropped
- To do this, only a single writer can modify the table, and writes must happen only on new pages so the new pages can be removed during crash recovery. Readers can continue accessing the table. Such tables probably cannot have indexes. One complexity is the handling of indexes on TOAST tables.
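Because all writes to such a table would land on newly-appended pages, crash recovery reduces to discarding everything past the last known-safe length. A minimal sketch of that idea, where the recorded safe length and its source are assumptions of this example:

```python
def recover_append_only(safe_len, current_len, page_size=8192):
    """Recovery rule for a WAL-bypassing, strictly append-only table.

    safe_len is the file length recorded at the last safe point (e.g. a
    checkpoint - the bookkeeping is hypothetical here). Since writers
    only ever add new pages past that point, recovery can simply
    truncate the file back to safe_len, discarding possibly-torn pages.
    """
    assert safe_len % page_size == 0, "safe point must be page-aligned"
    return min(current_len, safe_len)


# Two pages were appended after the safe point; recovery discards them.
assert recover_append_only(8192 * 10, 8192 * 12) == 8192 * 10
```

Readers never see the discarded pages because they were only reachable through the in-progress write.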
- Speed WAL recovery by allowing more than one page to be prefetched
- This should be done utilizing the same infrastructure used for prefetching in general to avoid introducing complex error-prone code in WAL replay.
Optimizer / Executor
- Consider increasing the default values of from_collapse_limit, join_collapse_limit, and/or geqo_threshold
- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage
- Log statements where the optimizer row estimates were dramatically different from the number of rows actually found?
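Both items reduce to the same predicate: flag a plan node when the estimate deviates from reality by more than a configured percentage. A sketch of that check, where the function name, the threshold parameter, and the choice of measuring deviation relative to the actual count are all assumptions of this example:

```python
def estimate_off_by(estimated, actual, threshold_pct):
    """Return True when the planner's row estimate differs from the
    actual row count by more than threshold_pct percent - the condition
    under which EXPLAIN ANALYZE would issue a NOTICE, or the statement
    would be logged. (Sketch; the exact ratio definition is assumed.)
    """
    if estimated == actual:
        return False
    base = max(actual, 1)  # avoid division by zero on empty results
    deviation = abs(estimated - actual) / base * 100.0
    return deviation > threshold_pct


assert estimate_off_by(10, 1000, threshold_pct=50)      # 99% under-estimate
assert not estimate_off_by(95, 100, threshold_pct=50)   # only 5% off
```

An open design question is whether the threshold should be absolute, relative, or both, since a 100x error on 10 rows is usually harmless while a 2x error on 10 million rows is not.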
Background Writer
- Consider having the background writer update the transaction status hint bits before writing out the page
- Implementing this requires the background writer to have access to system catalogs and the transaction status log.
- Test to see if calling PreallocXlogFiles() from the background writer will help with WAL segment creation latency
Concurrent Use of Resources
- Do async I/O for faster random read-ahead of data
- Async I/O allows multiple I/O requests to be sent to the disk with results coming back asynchronously. posix_fadvise()-based prefetching was applied as of 8.4, but it still remains to figure out how to handle plain index scans effectively.
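The prefetching mechanism itself is a kernel hint, not true async I/O: the backend tells the kernel which blocks it will want soon and continues working while they are read ahead. A sketch of that pattern using POSIX_FADV_WILLNEED, with the function name and block layout being assumptions of this example:

```python
import os
import tempfile


def prefetch_blocks(fd, block_nos, block_size=8192):
    """Hint the kernel to read the listed blocks ahead of time, the
    mechanism behind bitmap-scan prefetching. Extending this style of
    read-ahead to plain index scans is the open item. (Sketch; real
    backends issue the hints from inside the scan machinery.)
    """
    if not hasattr(os, "posix_fadvise"):  # not available on all platforms
        return
    for blk in sorted(block_nos):
        os.posix_fadvise(fd, blk * block_size, block_size,
                         os.POSIX_FADV_WILLNEED)


# Demo on a scratch file; the hint is advisory, so this is always safe.
with tempfile.TemporaryFile() as f:
    f.write(b"\0" * 8192 * 4)
    f.flush()
    prefetch_blocks(f.fileno(), [0, 2])
```

Because the call is purely advisory, issuing it for blocks already in cache costs almost nothing, which is why over-prefetching is relatively safe.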
- Experiment with multi-threaded backend for better I/O utilization
- This would allow a single query to make use of multiple I/O channels simultaneously. One idea is to create a background reader that can pre-fetch sequential and index scan pages needed by other backends. This could be expanded to allow concurrent reads from multiple devices in a partitioned table.
- Experiment with multi-threaded backend for better CPU utilization
- This would allow several CPUs to be used for a single query, such as for sorting or query execution.
TOAST
Miscellaneous Performance
- Use mmap() rather than SYSV shared memory, or use mmap() to write WAL files?
- This would remove the requirement for SYSV SHM but would introduce portability issues. Anonymous mmap (or mmap to /dev/zero) is required to prevent I/O overhead.
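For illustration, an anonymous shared mapping can be created without any SysV segment or backing file; Python's mmap module exposes the same primitive (passing -1 as the file descriptor is the anonymous form, equivalent to mapping /dev/zero):

```python
import mmap

# Anonymous mapping: no SysV shared memory segment and no backing file,
# so the kernel never schedules file I/O for it. This is the variant the
# item says is required to avoid I/O overhead.
region = mmap.mmap(-1, 8192)  # fd -1 requests anonymous memory

region[:5] = b"hello"
assert region[:5] == b"hello"

region.close()
```

The portability concern in the item is about platforms where anonymous MAP_SHARED semantics differ, not about the basic call shown here.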
- Consider mmap()'ing files into a backend?
- Doing I/O to large tables would consume a lot of address space or require frequent mapping/unmapping. Extending the file also causes mapping problems that might require mapping only individual pages, leading to thousands of mappings. Another problem is that there is no way to _prevent_ I/O to disk from the dirty shared buffers so changes could hit disk before WAL is written.
- Consider ways of storing rows more compactly on disk:
- Reduce the row header size?
- Consider reducing on-disk varlena length from four bytes to two because a heap row cannot be more than 64k in length
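The saving is exactly two bytes per varlena datum, since a 16-bit length covers every value that can appear in a sub-64k heap row. A minimal sketch of the two on-disk forms (the little-endian layout here is illustrative, not PostgreSQL's actual encoding):

```python
import struct

payload = b"x" * 100

# Current form (illustrative): a 4-byte length word, counting itself,
# precedes the datum.
four_byte = struct.pack("<I", 4 + len(payload)) + payload

# Proposed form: a 2-byte length suffices because a heap row cannot
# exceed 64k, saving two bytes per varlena datum.
two_byte = struct.pack("<H", 2 + len(payload)) + payload

assert len(four_byte) - len(two_byte) == 2
```

On a table with many short text or numeric columns, the per-datum saving compounds into noticeably more tuples per page.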
- Allow configuration of backend priorities via the operating system
- Though backend priorities make priority inversion during lock waits possible, research shows that this is not a huge problem.
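On Unix-like systems the operating-system mechanism would simply be process niceness; a hypothetical per-backend setting would amount to a call like this at backend start (raising one's own niceness needs no privileges, only lowering it does):

```python
import os

# Lower this process's scheduling priority by one nice level - the kind
# of OS-level knob the item proposes exposing per backend. (Sketch; how
# and when PostgreSQL would invoke it is the open design question.)
if hasattr(os, "nice"):
    new_niceness = os.nice(1)  # returns the process's new nice value
    assert new_niceness >= 1
```

The priority-inversion risk mentioned above arises when a niced backend holds a lock that a high-priority backend is waiting on.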
- Consider Cartesian joins when both relations are needed to form an indexscan qualification for a third relation
- Allow one transaction to see tuples using the snapshot of another transaction
- This would assist multiple backends in working together.
Miscellaneous Other
Source Code
- Add optional CRC checksum to heap and index pages
- One difficulty is how to prevent hint bit changes from affecting the computed CRC checksum.
- http://archives.postgresql.org/message-id/19934.1226601952%40sss.pgh.pa.us
- http://archives.postgresql.org/pgsql-hackers/2008-10/msg00002.php
- http://archives.postgresql.org/pgsql-hackers/2008-10/msg01028.php
- http://archives.postgresql.org/pgsql-hackers/2008-11/msg00524.php
- http://archives.postgresql.org/pgsql-hackers/2008-12/msg01101.php
- http://archives.postgresql.org/pgsql-hackers/2009-12/msg00011.php
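One approach discussed for the hint-bit difficulty is to mask the volatile bytes out of the page before computing the CRC, so an asynchronous hint-bit update cannot invalidate a checksum written earlier. A sketch of that idea, where the fixed byte offsets are hypothetical (real tuple hint bits live in each tuple's infomask and are not at fixed positions):

```python
import zlib


def page_checksum(page, hint_offsets):
    """CRC of a page with the hint-bit bytes masked out.

    hint_offsets lists byte positions that may change without WAL
    logging; zeroing them before checksumming keeps those updates from
    invalidating the stored CRC. (Sketch: real hint bits are per-tuple
    and locating them requires walking the page's line pointers.)
    """
    buf = bytearray(page)
    for off in hint_offsets:
        buf[off] = 0  # neutralize the volatile byte
    return zlib.crc32(bytes(buf))


page = bytearray(b"\0" * 128)
before = page_checksum(page, [12])
page[12] |= 0x04  # a hint bit flips after the checksum was taken
after = page_checksum(page, [12])
assert before == after  # masked CRC is unaffected by the hint-bit change
```

The cost is that corruption within the masked bytes goes undetected, which is the core trade-off debated in the threads above.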
- Improve detection of shared memory segments being used by others by checking the SysV shared memory field 'nattch'
- Fix system views like pg_stat_all_tables to use set-returning functions, rather than views of per-column functions
- [D] Allow table and index WITH options to be specified via hooks, for use with plugins like GiST index methods
Documentation
Windows
Wire Protocol Changes
Exotic Features
- Add pre-parsing phase that converts non-ISO syntax to supported syntax
- This could allow SQL written for other databases to run without modification.
- Add features of Oracle-style packages
- A package would be a schema with session-local variables, public/private functions, and initialization functions. It is also possible to implement these capabilities in any schema and not use a separate "packages" syntax at all.
- Consider allowing control of upper/lower case folding of unquoted identifiers
- Bringing PostgreSQL towards the standard regarding case folding
- Re: [SQL] Case Preservation disregarding case sensitivity?
- TODO Item: Consider allowing control of upper/lower case folding of unquoted identifiers
- Identifier case folding notes
- http://archives.postgresql.org/pgsql-hackers/2008-07/msg00415.php