https://wiki.postgresql.org/api.php?action=feedcontributions&user=Schmiddy&feedformat=atom
PostgreSQL wiki - User contributions [en] 2024-03-19T10:00:55Z MediaWiki 1.35.13
https://wiki.postgresql.org/index.php?title=What%27s_new_in_PostgreSQL_9.3&diff=20786 What's new in PostgreSQL 9.3 2013-09-18T14:49:14Z<p>Schmiddy: grammar</p>
<hr />
<div>This page contains an overview of PostgreSQL Version 9.3's features, including descriptions, testing and usage information, and links to blog posts containing further information. See also the [http://www.postgresql.org/docs/9.3/static/release-9-3.html Release Notes] and [[PostgreSQL 9.3 Open Items]].<br />
<br />
== Configuration directive 'include_dir' ==<br />
<br />
In addition to including separate configuration files via the 'include' directive, postgresql.conf now also provides the 'include_dir' directive which reads all files ending in ".conf" in the specified directory or directories.<br />
<br />
Directories can be specified either as an absolute path or relative from the location of the main configuration file. Directories will be read in the order they occur, while files will be read sorted by C locale rules. It is possible for included files to contain their own 'include_dir' directives. <br />
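As a sketch (the directory and file names here are illustrative, not taken from the documentation), the main postgresql.conf might contain:<br />
<br />
 include_dir = 'conf.d'<br />
<br />
with conf.d holding files such as 00-memory.conf, each containing ordinary configuration settings:<br />
<br />
 # conf.d/00-memory.conf<br />
 shared_buffers = '1GB'<br />
 work_mem = '32MB'<br />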
<br />
'''Links'''<br />
<br />
* [http://www.postgresql.org/docs/9.3/static/config-setting.html#CONFIG-INCLUDES Documentation]<br />
<br />
== COPY FREEZE for more efficient bulk loading ==<br />
<br />
To improve initial bulk loading of tables, a ''FREEZE'' parameter has been added to the COPY command to enable data to be copied with rows already frozen. See the documentation for usage and caveats.<br />
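As a sketch (table and file names are invented for illustration): since ''FREEZE'' requires that the target table was created or truncated within the current (sub)transaction, a typical load might look like this:<br />
<br />
 BEGIN;<br />
 TRUNCATE bulk_data;<br />
 COPY bulk_data FROM '/path/to/data.csv' WITH (FORMAT csv, FREEZE);<br />
 COMMIT;<br />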
<br />
'''Links'''<br />
* [http://www.postgresql.org/docs/9.3/static/sql-copy.html Documentation] - see the ''FREEZE'' parameter<br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-copy-freeze/ Postgres 9.3 feature highlight: COPY FREEZE]<br />
<br />
== Custom Background Workers ==<br />
<br />
This functionality enables modules to register themselves as "background worker processes", effectively operating as customised server processes. This is a powerful new feature with a wide variety of possible use cases, such as monitoring server activity, performing tasks at pre-defined intervals, customised logging etc.<br />
<br />
Background worker processes can attach to PostgreSQL's shared memory area and connect to databases internally; by linking to libpq they can also connect to the server in the same way as a regular client application. Background worker processes are written in C, and as server processes they have unrestricted access to all data and can potentially impact other server processes, meaning they represent a potential security / stability risk. Consequently background worker processes should be developed and deployed with appropriate caution.<br />
<br />
Providing an example would go beyond the scope of this article; please refer to the blogs linked below, which provide annotated sample code. The PostgreSQL source also contains a sample background worker process in contrib/worker_spi.<br />
<br />
'''Links'''<br />
<br />
* [http://www.postgresql.org/docs/9.3/static/bgworker.html Documentation]<br />
* [http://www.depesz.com/2012/12/07/waiting-for-9-3-background-worker-processes/ Background worker processes] <br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-handling-signals-with-custom-bgworkers/ Postgres 9.3 feature highlight: handling signals with custom bgworkers] <br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-custom-background-workers/ Custom background workers] <br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-hello-world-with-custom-bgworkers/ "Hello World" with custom bgworkers]<br />
* [http://sql-info.de/postgresql/notes/custom-background-worker-bgw-practical-example.html Custom Background Workers - a practical example]<br />
<br />
== Data Checksums ==<br />
<br />
It is now possible for PostgreSQL to checksum data pages and report corruption. This is a cluster-wide setting and cannot be applied to individual databases or objects. Also be aware that this facility may incur a noticeable performance penalty. This option must be enabled during initdb and cannot be changed (although there is a new GUC parameter "[http://www.postgresql.org/docs/9.3/static/runtime-config-developer.html#GUC-IGNORE-CHECKSUM-FAILURE ignore_checksum_failure]" which will force PostgreSQL to continue processing a transaction even if corruption is detected). <br />
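For example, a new cluster with checksums enabled might be initialised like this (the data directory path is illustrative):<br />
<br />
 initdb -k -D /usr/local/pgsql/data<br />
<br />
Should corruption later be detected, a superuser can set the aforementioned parameter to continue reading past damaged pages:<br />
<br />
 SET ignore_checksum_failure = on;<br />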
<br />
'''Links'''<br />
<br />
* Documentation<br />
** [http://www.postgresql.org/docs/9.3/static/app-initdb.html#APP-INITDB-DATA-CHECKSUMS initdb -k/--data-checksums]<br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-data-checksums/ Postgres 9.3 feature highlight: Data Checksums]<br />
<br />
== JSON: Additional functionality ==<br />
<br />
The [http://www.postgresql.org/docs/9.2/static/functions-json.html JSON datatype] and [http://www.postgresql.org/docs/9.2/static/functions-json.html two supporting functions] for converting rows and arrays were introduced in PostgreSQL 9.2. With PostgreSQL 9.3, dedicated JSON operators have been introduced and the number of functions has been expanded to 12, including JSON parsing support. The JSON parser has been exposed as an API for use by other modules, such as extensions.<br />
<br />
Additionally, the [http://www.postgresql.org/docs/9.3/static/hstore.html hstore] extension has gained two JSON-related functions, ''hstore_to_json(hstore)'' and ''hstore_to_json_loose(hstore)''. The former is used when an hstore value is cast to json.<br />
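Some illustrative queries using the new operators and the hstore conversion function (the values are invented for the example):<br />
<br />
 SELECT '{"name":"Alice","address":{"city":"Oslo"}}'::json ->> 'name';            -- returns 'Alice'<br />
 SELECT '{"name":"Alice","address":{"city":"Oslo"}}'::json #>> '{address,city}';  -- returns 'Oslo'<br />
 SELECT hstore_to_json('"a"=>"1","b"=>"2"'::hstore);<br />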
<br />
'''Links'''<br />
<br />
* Documentation<br />
** [http://www.postgresql.org/docs/9.3/static/datatype-json.html Documentation: JSON Datatype]<br />
** [http://www.postgresql.org/docs/9.3/static/functions-json.html Documentation: JSON Functions and Operators]<br />
* [http://www.depesz.com/2013/03/11/waiting-for-9-3-json-generation-improvements/ Waiting for 9.3 – JSON generation improvements] <br />
* [http://www.depesz.com/2013/03/30/waiting-for-9-3-add-new-json-processing-functions-and-parser-api/ Waiting for 9.3 – Add new JSON processing functions and parser API]<br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-json-data-generation/ Postgres 9.3 feature highlight: JSON data generation] <br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-json-operators/ Postgres 9.3 feature highlight: JSON operators] <br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-json-parsing-functions/ Postgres 9.3 feature highlight: JSON parsing functions]<br />
<br />
== LATERAL JOIN ==<br />
<br />
Put simply, a <tt>LATERAL JOIN</tt> enables a subquery appearing in the <tt>FROM</tt> clause to reference columns from preceding items in the <tt>FROM</tt> list.<br />
<br />
The following is a self-contained (if quite pointless) example of the kind of clause it is sometimes useful to be able to write:<br />
<br />
SELECT base.nr,<br />
multiples.multiple<br />
FROM (SELECT generate_series(1,10) AS nr) base<br />
JOIN (SELECT generate_series(1,10) AS b_nr, base.nr * 2 AS multiple) multiples<br />
ON multiples.b_nr = base.nr<br />
<br />
but which produces an error message like the following:<br />
<br />
<pre> LINE 4: JOIN (SELECT generate_series(1,10) AS b_nr, base.nr * 2 A...<br />
^<br />
HINT: There is an entry for table "base", but it cannot be referenced from this part of the query.</pre><br />
<br />
Using <tt>LATERAL JOIN</tt>, it's now possible for the second subquery to reference a value from the first:<br />
<br />
SELECT base.nr,<br />
multiples.multiple<br />
FROM (SELECT generate_series(1,10) AS nr) base,<br />
LATERAL (<br />
SELECT multiples.multiple FROM<br />
( SELECT generate_series(1,10) AS b_nr, base.nr * 2 AS multiple ) multiples<br />
WHERE multiples.b_nr = base.nr<br />
) multiples;<br />
<br />
Note that function calls can now directly reference columns from preceding <tt>FROM</tt> items, even without the <tt>LATERAL</tt> keyword. Example:<br />
<br />
CREATE FUNCTION multiply(INT, INT)<br />
RETURNS INT<br />
LANGUAGE SQL<br />
AS<br />
$$<br />
SELECT $1 * $2;<br />
$$;<br />
<br />
Query with function call in the <tt>FROM</tt> list:<br />
<br />
SELECT base.nr,<br />
multiple<br />
FROM (SELECT generate_series(1,10) AS nr) base,<br />
multiply(base.nr, 2) AS multiple;<br />
<br />
In previous versions, this query would generate an error like this:<br />
<br />
ERROR: function expression in FROM cannot refer to other relations of same query level<br />
LINE 4: multiply(base.nr, 2) AS multiple<br />
<br />
See the articles linked below for some more realistic examples.<br />
<br />
'''Links'''<br />
<br />
* [http://www.postgresql.org/docs/9.3/static/sql-select.html Documentation: SELECT] ''(see section <tt>LATERAL</tt>)''<br />
* [http://www.depesz.com/2012/08/19/waiting-for-9-3-implement-sql-standard-lateral-subqueries/ Waiting for 9.3: Implement SQL standard lateral subqueries]<br />
* [http://www.postgresonline.com/journal/archives/284-PostgreSQL-9.3-Lateral-Part-1-Use-with-HStore.html PostgreSQL 9.3 Lateral Part 1: Use with HStore] <br />
* [http://www.postgresonline.com/journal/archives/285-PostgreSQL-9.3-Lateral-Part2-The-Lateral-Left-Join.html PostgreSQL 9.3 Lateral Part 2: The Lateral Left Join]<br />
<br />
== Parallel pg_dump for faster backups ==<br />
<br />
The new ''-j njobs'' (''--jobs=njobs'') option enables pg_dump to dump '''njobs''' tables simultaneously, reducing the time it takes to dump a database. Example:<br />
<br />
pg_dump -U postgres -j4 -Fd -f /tmp/mydb-dump mydb<br />
<br />
This dumps the contents of database "mydb" to the directory "/tmp/mydb-dump" using four simultaneous connections.<br />
<br />
Caveats:<br />
* Parallel dumps can only be in directory format<br />
* Parallel dumps will place more load on the database, although total dump time should be shorter<br />
* pg_dump will open njobs + 1 connections to the database, so max_connections should be set appropriately<br />
* Requesting exclusive locks on database objects while running a parallel dump could cause the dump to fail<br />
* Parallel dumps from pre-9.2 servers need special attention<br />
<br />
An ad-hoc test of this feature on a 4.5GB database (which compresses to around 370MB as a dump) with different values of ''-j'' produced the following timings:<br />
<br />
* (''no -j''): 1m3s<br />
* -j2: 0m28s<br />
* -j3: 0m24s<br />
* -j4: 0m24s<br />
* -j5: 0m25s<br />
<br />
'''Links'''<br />
<br />
* [http://www.postgresql.org/docs/9.3/static/app-pgdump.html pg_dump documentation]<br />
* [http://www.depesz.com/2013/03/26/2646/ Waiting for 9.3 – Add parallel pg_dump option]<br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-parallel-pg_dump/ Postgres 9.3 feature highlight: parallel pg_dump]<br />
<br />
== 'pg_isready' server monitoring tool ==<br />
<br />
pg_isready is a standard client application which wraps libpq's PQping function. It accepts a libpq-style connection string and returns one of four exit statuses:<br />
<br />
* 0: server is accepting connections normally<br />
* 1: server is rejecting connections (for example during startup)<br />
* 2: server did not respond to the connection attempt<br />
* 3: no connection attempt was made (e.g. due to invalid connection parameters)<br />
<br />
Example usage:<br />
<br />
barwick@localhost:~$ pg_isready<br />
/tmp:5432 - accepting connections<br />
barwick@localhost:~$ pg_isready --quiet && echo "OK"<br />
OK<br />
barwick@localhost:~$ pg_isready -p5431 -h localhost<br />
localhost:5431 - accepting connections<br />
barwick@localhost:~$ pg_isready -h example.com<br />
example.com:5432 - no response<br />
<br />
'''Links'''<br />
<br />
* [http://www.postgresql.org/docs/9.3/static/app-pg-isready.html Documentation]<br />
* [http://www.depesz.com/2013/01/26/waiting-for-9-3-pg_isready/ pg_isready]<br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-server-monitoring-with-pg_isready/ Server monitoring with pg_isready]<br />
<br />
== Switch to Posix shared memory and mmap() ==<br />
<br />
In 9.3, PostgreSQL has switched from using SysV shared memory to using Posix shared memory and mmap for memory management. This allows easier installation and configuration of PostgreSQL, and means that except in unusual cases, system parameters such as SHMMAX and SHMALL no longer need to be adjusted. We need users to rigorously test and ensure that no memory management issues have been introduced by the change. <br />
<br />
'''Links'''<br />
<br />
* [http://www.postgresql.org/docs/9.3/static/kernel-resources.html#SYSVIPC Documentation]<br />
<br />
== Trigger Features ==<br />
=== Event Triggers ===<br />
<br />
Triggers can now be defined on DDL events (CREATE, ALTER, DROP).<br />
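A minimal sketch of an event trigger which logs the tag of each DDL command (the function and trigger names are invented):<br />
<br />
 CREATE FUNCTION log_ddl() RETURNS event_trigger<br />
 LANGUAGE plpgsql AS $$<br />
 BEGIN<br />
     RAISE NOTICE 'DDL command executed: %', TG_TAG;<br />
 END;<br />
 $$;<br />
 <br />
 CREATE EVENT TRIGGER log_ddl_command ON ddl_command_start<br />
     EXECUTE PROCEDURE log_ddl();<br />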
<br />
'''Links'''<br />
* Documentation:<br />
** [http://www.postgresql.org/docs/9.3/interactive/sql-createeventtrigger.html CREATE EVENT TRIGGER]<br />
** [http://www.postgresql.org/docs/9.3/interactive/event-trigger-matrix.html Event Trigger Firing Matrix]<br />
** [http://www.postgresql.org/docs/9.3/interactive/plpgsql-trigger.html#PLPGSQL-EVENT-TRIGGER Triggers on events]<br />
<br />
* [http://www.depesz.com/2012/07/29/waiting-for-9-3-event-triggers/ Waiting for 9.3 – Event triggers]<br />
<br />
== VIEW Features ==<br />
=== Materialized Views ===<br />
<br />
Materialized views are a special kind of view which cache the view's output as a physical table, rather than executing the underlying query on every access. Conceptually they are similar to "CREATE TABLE AS", but store the view definition so it can be easily refreshed.<br />
<br />
Note that materialized views cannot be auto-refreshed, refreshes are not incremental, and the view's contents cannot be manipulated directly. They will however be automatically populated by pg_restore (more precisely, pg_dump includes a "REFRESH MATERIALIZED VIEW" statement).<br />
<br />
'''Contrived example'''<br />
<br />
Create and populate a table with some arbitrary data:<br />
<br />
CREATE TABLE matview_test_table (<br />
id SERIAL PRIMARY KEY,<br />
ts TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
INSERT INTO matview_test_table VALUES (<br />
DEFAULT,<br />
((NOW() - '2 days'::INTERVAL) + (generate_series(1,1000) || ' seconds')::INTERVAL)::TIMESTAMPTZ<br />
);<br />
<br />
Create a materialized view which lists the 5 most recent entries:<br />
<br />
CREATE MATERIALIZED VIEW matview_test_view AS<br />
SELECT id, ts<br />
FROM matview_test_table<br />
ORDER BY id DESC <br />
LIMIT 5;<br />
<br />
postgres=# SELECT * from matview_test_view ;<br />
id | ts <br />
------+-------------------------------<br />
1000 | 2013-05-06 12:02:10.974711+09<br />
999 | 2013-05-06 12:02:09.974711+09<br />
998 | 2013-05-06 12:02:08.974711+09<br />
997 | 2013-05-06 12:02:07.974711+09<br />
996 | 2013-05-06 12:02:06.974711+09<br />
(5 rows)<br />
<br />
Add more data to the table:<br />
<br />
INSERT INTO matview_test_table VALUES (<br />
DEFAULT,<br />
((NOW() - '1 days'::INTERVAL) + (generate_series(1,1000) || ' seconds')::INTERVAL)::TIMESTAMPTZ<br />
);<br />
<br />
View output does not change:<br />
<br />
postgres=# SELECT * from matview_test_view ;<br />
id | ts <br />
------+-------------------------------<br />
1000 | 2013-05-06 12:02:10.974711+09<br />
999 | 2013-05-06 12:02:09.974711+09<br />
998 | 2013-05-06 12:02:08.974711+09<br />
997 | 2013-05-06 12:02:07.974711+09<br />
996 | 2013-05-06 12:02:06.974711+09<br />
(5 rows)<br />
<br />
Refresh the view to display the latest table entries:<br />
<br />
postgres=# REFRESH MATERIALIZED VIEW matview_test_view ;<br />
REFRESH MATERIALIZED VIEW<br />
postgres=# SELECT * from matview_test_view ;<br />
id | ts <br />
------+-------------------------------<br />
2001 | 2013-05-07 12:03:10.696626+09<br />
2000 | 2013-05-07 12:03:09.696626+09<br />
1999 | 2013-05-07 12:03:08.696626+09<br />
1998 | 2013-05-07 12:03:07.696626+09<br />
1997 | 2013-05-07 12:03:06.696626+09<br />
(5 rows)<br />
<br />
The links below contain more detailed information and examples.<br />
<br />
'''Links'''<br />
* Documentation:<br />
** [http://www.postgresql.org/docs/9.3/static/rules-materializedviews.html Overview]<br />
** [http://www.postgresql.org/docs/9.3/static/sql-creatematerializedview.html CREATE command]<br />
* [http://www.depesz.com/2013/03/04/waiting-for-9-3-add-a-materialized-view-relations/ Waiting for 9.3 – Add a materialized view relations]<br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-materialized-views/ Postgres 9.3 feature highlight: Materialized views]<br />
<br />
=== Recursive View Syntax ===<br />
<br />
The CREATE RECURSIVE VIEW syntax provides a shorthand way of formulating a recursive common table expression (CTE) as a view.<br />
<br />
Taking the example from the [http://www.postgresql.org/docs/current/static/queries-with.html#QUERIES-WITH-SELECT CTE documentation]:<br />
<br />
WITH RECURSIVE t(n) AS (<br />
VALUES (1)<br />
UNION ALL<br />
SELECT n+1 FROM t WHERE n < 100<br />
)<br />
SELECT * FROM t;<br />
<br />
This can be created as a recursive view as follows:<br />
<br />
CREATE RECURSIVE VIEW t(n) AS<br />
VALUES (1)<br />
UNION ALL<br />
SELECT n+1 FROM t WHERE n < 100;<br />
<br />
'''Links'''<br />
* [http://www.postgresql.org/docs/9.3/static/sql-createview.html Documentation]<br />
* [http://www.depesz.com/2013/03/04/waiting-for-9-3-add-create-recursive-view-syntax/ Waiting for 9.3 – Add CREATE RECURSIVE VIEW syntax]<br />
<br />
=== Updatable Views ===<br />
<br />
Simple views can now be updated in the same way as regular tables. The view can only reference one table (or another updatable view) and must not contain more complex constructs such as joins, aggregates or set operations. <br />
<br />
If the view has a WHERE condition, UPDATEs and DELETEs on the underlying table will be restricted to the rows it defines. However, an UPDATE may change a row so that it is no longer visible in the view, and an INSERT can potentially insert rows which do not satisfy the WHERE condition.<br />
<br />
More complex views can be made updatable as before using INSTEAD OF triggers or INSTEAD rules.<br />
<br />
Simple example using the following table and view:<br />
<code> <br />
CREATE TABLE postgres_versions (<br />
version VARCHAR(3) PRIMARY KEY,<br />
nickname TEXT NOT NULL<br />
);<br />
<br />
INSERT INTO postgres_versions VALUES<br />
('8.0', 'Excitable Element'),<br />
('8.1', 'Fishy Foreign Key'),<br />
('8.2', 'Grumpy Grant'),<br />
('8.3', 'Hysterical Hstore'),<br />
('8.4', 'Insane Index'),<br />
('9.0', 'Jumpy Join'),<br />
('9.1', 'Killer Key'),<br />
('9.2', 'Laconical Lexer'),<br />
('9.3', 'Morose Module');<br />
<br />
CREATE VIEW postgres_versions_9 AS<br />
SELECT version, nickname<br />
FROM postgres_versions<br />
WHERE version LIKE '9.%';<br />
</code><br />
<br />
<code> <br />
postgres=# SELECT * from postgres_versions_9;<br />
version | nickname <br />
---------+-----------------<br />
9.0 | Jumpy Join<br />
9.1 | Killer Key<br />
9.2 | Laconical Lexer<br />
9.3 | Morose Module<br />
(4 rows)<br />
<br />
postgres=# UPDATE postgres_versions_9 SET nickname='Maniac Master' WHERE version='9.3';<br />
UPDATE 1<br />
postgres=# SELECT * from postgres_versions_9;<br />
version | nickname <br />
---------+-----------------<br />
9.0 | Jumpy Join<br />
9.1 | Killer Key<br />
9.2 | Laconical Lexer<br />
9.3 | Maniac Master<br />
(4 rows)<br />
</code><br />
<br />
'''Links'''<br />
* [http://www.postgresql.org/docs/9.3/static/sql-createview.html#SQL-CREATEVIEW-UPDATABLE-VIEWS Documentation]<br />
* [http://www.depesz.com/2012/12/11/waiting-for-9-3-support-automatically-updatable-views/ Waiting for 9.3 – Support automatically-updatable views]<br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-auto-updatable-views/ Postgres 9.3 feature highlight: auto-updatable views]<br />
<br />
== Writeable Foreign Tables ==<br />
<br />
"Foreign Data Wrappers" (FDW) were introduced in PostgreSQL 9.1, providing a way of accessing external data sources from within PostgreSQL using SQL. The original implementation was read-only; 9.3 enables write access as well, provided the individual FDW drivers have been updated to support this. At the time of writing, only the Redis and PostgreSQL drivers have write support (''need to verify this'').<br />
<br />
See [[#postgres_fdw|below]] for more information on the PostgreSQL driver and a simple example.<br />
<br />
'''Links'''<br />
<br />
* [http://www.postgresql.org/docs/9.3/static/sql-createserver.html CREATE SERVER]<br />
* [http://www.postgresql.org/docs/9.3/static/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]<br />
* [http://www.postgresql.org/docs/9.3/static/fdwhandler.html Documentation: Writing A Foreign Data Wrapper]<br />
* [http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-writable-foreign-tables/ Postgres 9.3 feature highlight: writable foreign tables]<br />
* [http://www.depesz.com/2013/03/17/waiting-for-9-3-support-writable-foreign-tables/ Waiting for 9.3 – Support writable foreign tables] <br />
<br />
=== postgres_fdw ===<br />
<br />
A new contrib module, postgres_fdw, provides the eponymous foreign data wrapper for read/write access to remote PostgreSQL servers (or to another database on the local server).<br />
<br />
A simple usage example (connecting to a different database on the same server for ease of testing).<br />
<br />
1. Build the postgres_fdw contrib module<br />
<br />
cd contrib/postgres_fdw<br />
make install<br />
<br />
2. Install the module as an extension<br />
<br />
postgres=# CREATE EXTENSION postgres_fdw;<br />
CREATE EXTENSION<br />
<br />
3. Create a test "remote" database<br />
<br />
postgres=# CREATE DATABASE fdw_test;<br />
CREATE DATABASE<br />
postgres=# \c fdw_test<br />
You are now connected to database "fdw_test" as user "barwick".<br />
fdw_test=# CREATE TABLE world (greeting TEXT);<br />
CREATE TABLE<br />
<br />
4. Create the server, user and table mapping so that the local PostgreSQL server knows about the remote database:<br />
<br />
postgres=# CREATE SERVER postgres_fdw_test FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'fdw_test');<br />
CREATE SERVER<br />
postgres=# CREATE USER MAPPING FOR PUBLIC SERVER postgres_fdw_test OPTIONS (password <nowiki>''</nowiki>);<br />
CREATE USER MAPPING<br />
postgres=# CREATE FOREIGN TABLE other_world (greeting TEXT) SERVER postgres_fdw_test OPTIONS (table_name 'world');<br />
CREATE FOREIGN TABLE<br />
postgres=# \det<br />
List of foreign tables<br />
Schema | Table | Server <br />
--------+-------------+-------------------<br />
public | other_world | postgres_fdw_test<br />
(1 row)<br />
<br />
5. Manipulate the remote table as if it were a local one:<br />
<br />
postgres=# INSERT INTO other_world VALUES('Take me to your leader');<br />
INSERT 0 1 <br />
postgres=# \c fdw_test<br />
You are now connected to database "fdw_test" as user "barwick".<br />
fdw_test=# SELECT * FROM world;<br />
greeting <br />
------------------------<br />
Take me to your leader<br />
(1 row)<br />
<br />
Here's another example, where we link to the "account" and "branches" tables on a remote pgbench database:<br />
<br />
create extension postgres_fdw;<br />
create server remotesrv foreign data wrapper postgres_fdw options ( host '192.168.1.5', port '5433', dbname 'bench');<br />
create user mapping for current_user server remotesrv options ( user 'postgres', password 'password' );<br />
create foreign table remoteacct (aid int, bid int, abalance int, filler char(84)) <br />
server remotesrv options ( table_name 'pgbench_accounts' );<br />
create foreign table remotebranch ( bid int, bbalance int, filler char(88) ) <br />
server remotesrv options ( table_name 'pgbench_branches');<br />
<br />
Having set this up, we can query the remote server:<br />
<br />
explain select * from remotebranch join remoteacct using ( bid ) where bid = 5;<br />
QUERY PLAN<br />
----------------------------------------------------------------------------<br />
Nested Loop (cost=200.00..225.40 rows=1 width=712)<br />
-> Foreign Scan on remotebranch (cost=100.00..112.66 rows=1 width=364)<br />
-> Foreign Scan on remoteacct (cost=100.00..112.73 rows=1 width=352)<br />
<br />
Notice a couple of things: first, JOIN push-down to the remote server isn't implemented yet (wait for 9.4!). Second, we're not getting real estimates for the remote tables. This is fixable by telling Postgres to query the remote DB for EXPLAIN information:<br />
<br />
alter foreign table remotebranch options (add use_remote_estimate 'true');<br />
alter foreign table remoteacct options (add use_remote_estimate 'true');<br />
bench=# explain select * from remotebranch join remoteacct using ( bid ) where bid = 5;<br />
QUERY PLAN<br />
------------------------------------------------------------------------------<br />
Nested Loop (cost=200.42..7648.07 rows=99400 width=712)<br />
-> Foreign Scan on remotebranch (cost=100.00..101.14 rows=1 width=364)<br />
-> Foreign Scan on remoteacct (cost=100.42..6552.93 rows=99400 width=97)<br />
<br />
'''Links'''<br />
* [http://www.postgresql.org/docs/9.3/static/postgres-fdw.html Documentation]<br />
<br />
== Replication Improvements ==<br />
<br />
PostgreSQL's built-in binary replication has been improved in four ways: streaming-only remastering, fast failover, architecture-independent streaming, and pg_basebackup configuration setup.<br />
<br />
=== Streaming-Only Remastering ===<br />
<br />
"Remastering" is the process whereby a replica in a set of replicas becomes the new master for all of the other replicas. For example:<br />
<br />
# Master M1 is replicating to replicas R1, R2 and R3.<br />
# Master M1 needs to be taken down for a hardware upgrade.<br />
# The DBA promotes R1 to be the master. <br />
# R2 and R3 are reconfigured & restarted, and now replicate from R1<br />
<br />
That's remastering in a nutshell. It's even more useful in combination with cascading replication (introduced in 9.2).<br />
<br />
In prior versions of PostgreSQL, remastering required using WAL file archiving. Cascading replicas could not switch masters using streaming alone; they would have to be re-cloned. That restriction has now been lifted, allowing remastering from just the stream. This makes it much easier to set up large replication clusters; administrators no longer have to set up an online WAL archive if they don't need one for disaster recovery.<br />
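In practice, repointing R2 and R3 at the promoted replica is a matter of editing each one's recovery.conf and restarting; a sketch (hostname invented):<br />
<br />
 standby_mode = 'on'<br />
 primary_conninfo = 'host=r1.example.com port=5432 user=replication'<br />
 recovery_target_timeline = 'latest'<br />
<br />
Setting recovery_target_timeline to 'latest' allows the replica to follow the timeline switch which occurs when the new master is promoted, without consulting a WAL archive.<br />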
<br />
Incidentally, this also makes it possible to set up "cycles" where replication is going in a circle. Whether that's a feature or a bug depends on your perspective.<br />
<br />
Links:<br />
* [http://www.databasesoup.com/2013/01/cascading-replication-and-cycles.html Cascading Replication and Cycles]<br />
<br />
=== Fast Failover ===<br />
<br />
Allows replicas to be promoted in less than a second, permitting 99.999% uptime. More details TBD.<br />
<br />
=== Architecture-Independent Streaming ===<br />
<br />
Allows streaming of base backups (using pg_basebackup) and log archiving (using pg_receivexlog) between different OSes and hardware architectures. (Note that you still need the same architecture to restore the backups; this is useful, for example, with centralized backup servers.)<br />
<br />
=== pg_basebackup conf setup ===<br />
<br />
If you use the -R switch, pg_basebackup will create a simple (streaming-only) recovery.conf file in the newly cloned data directory. This means that you can immediately start the new database server without doing additional editing.<br />
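For example (host name and data directory are illustrative):<br />
<br />
 pg_basebackup -h master.example.com -D /var/lib/postgresql/9.3/standby -R -X stream<br />
<br />
The generated recovery.conf will contain standby_mode = 'on' and a primary_conninfo string matching the connection parameters used for the backup.<br />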
<br />
=Backward compatibility=<br />
<br />
These changes may incur regressions in your applications.<br />
<br />
== CREATE TABLE output ==<br />
<br />
CREATE TABLE will no longer output messages about implicit index and sequence creation unless the log level is set to DEBUG1.<br />
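To see these messages again, raise the client message level; the output should then look something like the following (exact wording may differ):<br />
<br />
 postgres=# SET client_min_messages = 'debug1';<br />
 SET<br />
 postgres=# CREATE TABLE t (id SERIAL PRIMARY KEY);<br />
 DEBUG:  CREATE TABLE will create implicit sequence "t_id_seq" for serial column "t.id"<br />
 DEBUG:  CREATE TABLE / PRIMARY KEY will create implicit index "t_pkey" for table "t"<br />
 CREATE TABLE<br />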
<br />
== Server settings ==<br />
<br />
* Parameter 'commit_delay' is restricted to superusers only<br />
* Parameter 'replication_timeout' has been renamed to 'wal_sender_timeout'<br />
* Parameter 'unix_socket_directory' has been replaced by 'unix_socket_directories'<br />
* In-memory sorts now use their full memory allocation; if work_mem was set on the basis of the pre-9.3 behavior, its value may need to be reviewed.<br />
<br />
== WAL filenames may end in FF ==<br />
<br />
WAL files will now be written in a continuous stream, rather than skipping the last 16MB segment every 4GB, meaning WAL filenames may end in FF. WAL backup or restore scripts may need to be adapted.</div>Schmiddy
https://wiki.postgresql.org/index.php?title=Todo&diff=19328 Todo 2013-04-04T23:13:45Z<p>Schmiddy: Add link for 'Allow processing of multiple -f (file) options'</p>
<hr />
<div><div style="margin: 1ex 1em; float: right;"><br />
__TOC__<br />
</div><br />
<br />
This list contains '''known PostgreSQL bugs and feature requests''' and we hope it is complete. If you would like to work on an item, please read the [[Developer FAQ]] first. There is also a [[Development_information|development information page]].<br />
<br />
* {{TodoPending}} - marks ordinary, incomplete items<br />
* {{TodoEasy}} - marks items that are easier to implement<br />
* {{TodoDone}} - marks changes that are done, and will appear in the PostgreSQL 9.3 release.<br />
<br />
For help on editing this list, please see [[Talk:Todo]]. <b>Please do not add items here without discussion on the mailing list.</b><br />
<br />
<b>For Developers:</b> Unfortunately this list does not contain all the information necessary for someone to start coding a feature. Some of these items might have become unnecessary since they were added --- others might be desirable but the implementation might be unclear. When selecting items listed below, be prepared to first discuss the value of the feature. Do not assume that you can select one, code it and then expect it to be committed. Always discuss design on Hackers list before starting to code. The flow should be:<br />
<br />
Desirability -> Design -> Implement -> Test -> Review -> Commit<br />
<br />
<div style="padding: 1ex 4em;"><br />
== Administration ==<br />
<br />
{{TodoItem<br />
|Allow administrators to cancel multi-statement idle transactions<br />
|This allows locks to be released, but it is complex to report the cancellation back to the client.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01340.php <nowiki>Cancelling idle in transaction state</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00441.php <nowiki>Re: Cancelling idle in transaction state</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Check for unreferenced table files created by transactions that were in-progress when the server terminated abruptly<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00096.php <nowiki>Removing unreferenced files</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Set proper permissions on non-system schemas during db creation<br />
|Currently all schemas are owned by the super-user because they are copied from the template1 database. However, since all objects are inherited from the template database, it is not clear that setting schemas to the db owner is correct.}}<br />
<br />
{{TodoItem<br />
|Allow log_min_messages to be specified on a per-module basis<br />
|This would allow administrators to see more detailed information from specific sections of the backend, e.g. checkpoints, autovacuum, etc. Another idea is to allow separate configuration files for each module, or allow arbitrary SET commands to be passed to them. See also [[Logging Brainstorm]].}}<br />
<br />
{{TodoItem<br />
|Simplify creation of partitioned tables<br />
|This would allow creation of partitioned tables without requiring creation of triggers or rules for INSERT/UPDATE/DELETE, and constraints for rapid partition selection. Options could include range and hash partition selection. See also [[Table partitioning]]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow custom variables to appear in pg_settings()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00850.php <nowiki>Re: count(*) performance improvement ideas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have custom variables be transaction-safe<br />
* {{MessageLink|4B577E9F.8000505@dunslane.net|Custom GUCs still a bit broken}}<br />
}}<br />
<br />
{{TodoItem<br />
|Implement the SQL-standard mechanism whereby REVOKE ROLE revokes only the privilege granted by the invoking role, and not those granted by other roles<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-05/msg00010.php <nowiki>Re: Grantor name gets lost when grantor role dropped</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent query cancel packets from being replayed by an attacker, especially when using SSL<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg00345.php <nowiki>Replay attack of query cancel</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide a way to query the log collector subprocess to determine the name of the currently active log file<br />
* [http://archives.postgresql.org/pgsql-general/2008-11/msg00418.php <nowiki>Current log files when rotating?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow simpler reporting of the unix domain socket directory and allow easier configuration of its default location<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg01555.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-10/msg01482.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow custom daemons to be automatically stopped/started along with the postmaster<br />
|This allows easier administration of daemons like user job schedulers or replication-related daemons.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01701.php <nowiki>Re: scheduler in core</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve logging of prepared transactions recovered during startup<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg00092.php <nowiki>&quot;recovering prepared transaction&quot; after server restart message</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Consider using POSIX shared memory to avoid System V shared memory kernel limits<br />
* [http://archives.postgresql.org/message-id/4DFA2673.3010009@enterprisedb.com <nowiki>POSIX shared memory patch status</nowiki>]<br />
}}<br />
<br />
=== Configuration files ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemDone<br />
|Change pg_ident.conf parsing to be the same as pg_hba.conf<br />
* http://archives.postgresql.org/pgsql-hackers/2011-06/msg02204.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow postgresql.conf file values to be changed via an SQL API, perhaps using SET GLOBAL<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00764.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-10/msg01509.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-11/msg00002.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider normalizing fractions in postgresql.conf, perhaps using '%'<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00550.php <nowiki>Fractions in GUC variables</nowiki>]<br />
}}<br />
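As an illustration of the idea (not PostgreSQL's actual GUC parser), a normalizer could accept both a bare fraction and a percentage spelled with '%':<br />

```python
def parse_fraction(value: str) -> float:
    """Accept either a plain fraction ('0.25') or a percentage ('25%')."""
    value = value.strip()
    if value.endswith("%"):
        return float(value[:-1]) / 100.0
    return float(value)
```

Either spelling would then normalize to the same internal value, which is the consistency this item asks for.<br />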
<br />
{{TodoItem<br />
|Allow Kerberos to disable stripping of realms so we can check the username@realm against multiple realms<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00009.php <nowiki>krb_match_realm patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve LDAP authentication configuration options<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01745.php <nowiki>Proposed Patch - LDAPS support for servers on port 636 w/o TLS</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add external tool to auto-tune some postgresql.conf parameters<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00000.php <nowiki>Re: Overhauling GUCS</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00033.php <nowiki>Simple postgresql.conf wizard</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add 'hostgss' pg_hba.conf option to allow GSS link-level encryption<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg01454.php <nowiki>Re: Plans for 8.4</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Process pg_hba.conf keywords as case-insensitive<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00432.php <nowiki>More robust pg_hba.conf parsing/error logging</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Create utility to compute accurate random_page_cost value<br />
* http://archives.postgresql.org/pgsql-performance/2011-04/msg00162.php<br />
* http://archives.postgresql.org/pgsql-performance/2011-04/msg00362.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow configuration files to be independently validated<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01831.php<br />
* http://archives.postgresql.org/message-id/12666.1310774573@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Allow postgresql.conf settings to be accepted by backends even if some settings are invalid for those backends<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00330.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00375.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow all backends to receive postgresql.conf setting changes at the same time<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00330.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00375.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow synchronous_standby_names to be disabled after communication failure with all synchronous standby servers exceeds some timeout<br />
|This also requires successful execution of a synchronous notification command.<br />
* http://archives.postgresql.org/pgsql-hackers/2012-07/msg00409.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Tablespaces ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow a database in tablespace t1 with tables created in tablespace t2 to be used as a template for a new database created with default tablespace t2<br />
|Currently all objects in the default database tablespace must have default tablespace specifications. This is because new databases are created by copying directories. If you mix default tablespace tables and tablespace-specified tables in the same directory, creating a new database from such a mixed directory would create a new database with tables that had incorrect explicit tablespaces. To fix this would require modifying pg_class in the newly copied database, which we don't currently do.}}<br />
<br />
{{TodoItem<br />
|Allow reporting of which objects are in which tablespaces<br />
|This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.}}<br />
<br />
{{TodoItem<br />
|Allow WAL replay of CREATE TABLESPACE to work when the directory structure on the recovery computer is different from the original}}<br />
<br />
{{TodoItem<br />
|Allow per-tablespace quotas}}<br />
<br />
{{TodoItem<br />
|Allow tablespaces on RAM-based partitions for unlogged tables<br />
* http://archives.postgresql.org/pgsql-advocacy/2011-05/msg00033.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow toast tables to be moved to a different tablespace<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-05/msg00980.php]<br />
* {{messageLink|CAFEQCbH756DyyAPQ1ykh3+b+kE1-EhWRww1WO_x5v38C-uLnUg@mail.gmail.com|patch : Allow toast tables to be moved to a different tablespace}} (issues remain)<br />
* [http://archives.postgresql.org/message-id/CAFEQCbEq07OopgE5xFYv2Q3eMq45hRSJkjCBO+kvpJq9NEVhow@mail.gmail.com Allow toast tables to be moved to a different tablespace]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Statistics Collector ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow statistics last vacuum/analyze execution times to be displayed without requiring track_counts to be enabled<br />
* [http://archives.postgresql.org/pgsql-docs/2007-04/msg00028.php <nowiki>row-level stats and last analyze time</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Clear table counters on TRUNCATE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00169.php <nowiki>Small TRUNCATE glitch</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== SSL ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow SSL authentication/encryption over unix domain sockets<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00924.php <nowiki>Re: Spoofing as the postmaster</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow SSL key file permission checks to be optionally disabled when sharing SSL keys with other applications<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00069.php <nowiki>BUG #3809: SSL &quot;unsafe&quot; private key permissions bug</nowiki>]<br />
}}<br />
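For context, the server refuses to load a private key file that is accessible to group or other. A Python sketch of that style of 0600 check (illustrative only; the item proposes making the real check optional when keys are shared with other applications):<br />

```python
import os
import stat

def key_file_is_private(path: str) -> bool:
    """True if the file grants no permissions to group or other,
    mirroring the kind of check applied to SSL private key files."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```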
<br />
{{TodoItem<br />
|Allow SSL CRL files to be re-read during configuration file reload, rather than requiring a server restart<br />
|Unlike SSL certificate (CRT) files, CRL (Certificate Revocation List) files are updated frequently.<br />
* [http://archives.postgresql.org/pgsql-general/2008-12/msg00832.php <nowiki>Automatic CRL reload</nowiki>]<br />
Alternatively or additionally, supporting OCSP (Online Certificate Status Protocol) would provide real-time revocation discovery without a reload.<br />
}}<br />
<br />
{{TodoItem<br />
| Allow automatic selection of SSL client certificates from a certificate store<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00406.php <nowiki>Allow multiple certificates or keys in the postgresql.crt/.key files</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Send the full certificate server chain to the client<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-12/msg00145.php BUG #5245: Full Server Certificate Chain Not Sent to client]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Point-In-Time Recovery (PITR) ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemDone<br />
|Create dump tool for write-ahead logs for use in determining transaction id for point-in-time recovery<br />
|This is useful for checking PITR recovery.}}<br />
<br />
{{TodoItem<br />
|Allow archive_mode to be changed without server restart?<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01655.php <nowiki>Enabling archive_mode without restart</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider avoiding WAL switching via archive_timeout if there has been no database activity<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01469.php <nowiki>archive_timeout behavior for no activity</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00395.php <nowiki>Re: archive_timeout behavior for no activity</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow base backup from standby to continue when the standby is promoted.<br />
* [http://archives.postgresql.org/pgsql-hackers/2012-10/msg00239.php <nowiki>Re: Promoting a standby during base backup</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Standby server mode ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
| Allow pg_xlogfile_name() to be used in recovery mode<br />
* [http://archives.postgresql.org/message-id/3f0b79eb1001190135vd9f62f1sa7868abc1ea61d12@mail.gmail.com <nowiki>Streaming replication and pg_xlogfile_name()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Prevent variables inherited from the server environment from being used to make streaming replication connections.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01011.php <nowiki>Re: Parameter name standby_mode</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Change walsender so that it applies per-role settings<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00642.php<br />
}}<br />
<br />
{{TodoItem<br />
| Restructure configuration parameters for standby mode<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01820.php<br />
}}<br />
<br />
{{TodoItem<br />
| Allow time-delayed application of logs on the standby<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00992.php<br />
}}<br />
<br />
{{TodoItem<br />
| Add -X parameter to pg_basebackup to specify a different directory for pg_xlog, like initdb<br />
}}<br />
<br />
{{TodoItem<br />
| Add a new "eager" synchronous mode that starts out synchronous but reverts to asynchronous after a failure timeout period<br />
|This would require some type of command to be executed to alert administrators of this change.<br />
* http://archives.postgresql.org/pgsql-hackers/2011-12/msg01224.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Data Types ==<br />
<br />
{{TodoItem<br />
|Fix data types where equality comparison is not intuitive, e.g. box<br />
* http://archives.postgresql.org/pgsql-hackers/2011-10/msg01643.php<br />
}}<br />
<br />
{{TodoItem<br />
|Add support for public SYNONYMs<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00519.php <nowiki>Proposal for SYNONYMS</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg02043.php<br />
* http://archives.postgresql.org/pgsql-general/2010-12/msg00139.php<br />
}}<br />
<br />
{{TodoItem<br />
|Add support for SQL-standard GENERATED/IDENTITY columns<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-07/msg00543.php <nowiki>Re: Three weeks left until feature freeze</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg00038.php <nowiki>GENERATED ... AS IDENTITY, Was: Re: Feature Freeze</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00344.php <nowiki>Behavior of GENERATED columns per SQL2003</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00076.php <nowiki>Re: [HACKERS] Behavior of GENERATED columns per SQL2003</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00604.php <nowiki>IDENTITY/GENERATED patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider placing all sequences in a single table, or create a system view<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2012-02/msg00258.php Removing special case OID generation]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a special data type for regular expressions<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg01067.php <nowiki>Why is there a tsquery data type?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce BIT data type overhead using short varlena headers<br />
* [http://archives.postgresql.org/pgsql-general/2007-12/msg00273.php <nowiki>storage size of &quot;bit&quot; data type..</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow renaming and deleting enumerated values from an existing enumerated data type<br />
}}<br />
<br />
{{TodoItem<br />
|Support scoped IPv6 addresses in the inet type<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-05/msg00111.php <nowiki>strange problem with ip6</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider improving performance of computing CHAR() value lengths<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00900.php <nowiki>char() overhead on read-only workloads not so insignifcant as the docs claim it is...</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01787.php <nowiki>Re: [PATCH] backend: compare word-at-a-time in bcTruelen</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add overlaps geometric operators that ignore point overlaps<br />
* http://archives.postgresql.org/pgsql-hackers/2010-03/msg00861.php<br />
}}<br />
<br />
{{TodoItem<br />
|Remove or improve rounding in geometric comparison operators<br />
* http://archives.postgresql.org/message-id/9804.1346187849@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
| Add IMMUTABLE column attribute<br />
* http://archives.postgresql.org/pgsql-hackers/2011-11/msg00623.php<br />
}}<br />
<br />
=== Domains ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow functions defined as casts to domains to be called during casting<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-05/msg00072.php <nowiki>bug? non working casts for domain</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-09/msg01681.php <nowiki>TODO: Fix CREATE CAST on DOMAINs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow values to be cast to domain types<br />
* [http://archives.postgresql.org/pgsql-hackers/2003-06/msg01206.php <nowiki>Domain casting still doesn't work right</nowiki>] <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00289.php <nowiki>domain casting?</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00812.php<br />
}}<br />
<br />
{{TodoItem<br />
|Make domains work better with polymorphic functions<br />
* [http://archives.postgresql.org/message-id/4887.1228700773@sss.pgh.pa.us Polymorphic types vs. domains]<br />
* [http://archives.postgresql.org/message-id/15535.1238774571@sss.pgh.pa.us some difficulties with fixing it]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Dates and Times ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow infinite intervals just like infinite timestamps<br />
* http://archives.postgresql.org/pgsql-hackers/2011-11/msg00076.php<br />
}}<br />
<br />
{{TodoItem<br />
|Determine how to represent date/time field extraction on infinite timestamps<br />
* [http://archives.postgresql.org/message-id/CA+mi_8bda-Fnev9iXeUbnqhVaCWzbYhHkWoxPQfBca9eDPpRMw@mail.gmail.com extract(epoch from infinity) is not 0]<br />
* [http://archives.postgresql.org/message-id/CADAkt-icuESH16uLOCXbR-dKpcvwtUJE4JWXnkdAjAAwP6j12g@mail.gmail.com converting between infinity timestamp and float8]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow TIMESTAMP WITH TIME ZONE to store the original timezone information, either zone name or offset from UTC<br />
|If the TIMESTAMP value is stored with a time zone name, interval computations should adjust based on the time zone rules. <br />
* [http://archives.postgresql.org/pgsql-hackers/2004-10/msg00705.php <nowiki>timestamp with time zone a la sql99</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have timestamp subtraction not call justify_hours()?<br />
* [http://archives.postgresql.org/pgsql-sql/2006-10/msg00059.php <nowiki>timestamp subtraction (was Re: formatting intervals with to_char)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve TIMESTAMP WITH TIME ZONE subtraction to be DST-aware<br />
|Currently, subtracting one date from another across a daylight saving time adjustment can return '1 day 1 hour', but adding that interval back to the first date yields a time one hour in the future. This happens because '25 hours' is adjusted to '1 day 1 hour', and '1 day' means the same wall-clock time on the next day, even when a daylight saving adjustment intervenes.}}<br />
<br />
{{TodoItem<br />
|Fix interval display to support values exceeding 2^31 hours}}<br />
<br />
{{TodoItem<br />
|Add overflow checking to timestamp and interval arithmetic}}<br />
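For illustration, timestamps and intervals are stored as 64-bit microsecond counts, so unchecked C addition can silently wrap. A Python sketch of the bound that checked arithmetic would enforce (the function name is illustrative, not the backend's):<br />

```python
PG_INT64_MAX = 2**63 - 1
PG_INT64_MIN = -(2**63)

def add_interval_checked(ts_usec: int, span_usec: int) -> int:
    """Add an interval to a timestamp (both in microseconds),
    raising instead of silently wrapping on 64-bit overflow."""
    result = ts_usec + span_usec
    if not (PG_INT64_MIN <= result <= PG_INT64_MAX):
        raise OverflowError("timestamp out of range")
    return result
```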
<br />
{{TodoItem<br />
|Add function to allow the creation of timestamps using parameters<br />
* http://archives.postgresql.org/pgsql-performance/2010-06/msg00232.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Arrays ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add support for arrays of domains<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00114.php <nowiki>Re: updated WIP: arrays of composites</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow single-byte header storage for array elements}}<br />
<br />
{{TodoItem<br />
|Add function to detect if an array is empty<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00475.php <nowiki>Re: array_length()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of empty arrays<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01033.php <nowiki>So what's an &quot;empty&quot; array anyway?</nowiki>]<br />
* http://archives.postgresql.org/pgsql-general/2012-07/msg00633.php<br />
* [http://www.postgresql.org/message-id/1182.1363387349@sss.pgh.pa.us <nowiki>Allow declaration of an empty array?</nowiki>]<br />
* [http://www.postgresql.org/message-id/CADxJZo0keVhSRzUnot2Y6g46tsP7f-eV28iEmBd3AtLjU-YTMA@mail.gmail.com Exorcise "zero-dimensional" arrays]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of NULLs in arrays<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-11/msg00009.php <nowiki>BUG #4509: array_cat's null behaviour is inconsistent</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg01040.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Binary Data ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Improve vacuum of large objects, like contrib/vacuumlo?}}<br />
<br />
{{TodoItem<br />
|Auto-delete large objects when referencing row is deleted<br />
|contrib/lo offers this functionality.}}<br />
<br />
{{TodoItem<br />
|Allow read/write into TOAST values like large objects<br />
|Writing might require the TOAST column to be stored EXTERNAL.<br />
* http://archives.postgresql.org/pgsql-hackers/2011-06/msg00049.php<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add API for 64-bit large object access<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00781.php <nowiki>64-bit API for large objects</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01790.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== MONEY Data Type ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add locale-aware MONEY type, and support multiple currencies<br />
* [http://archives.postgresql.org/pgsql-general/2005-08/msg01432.php <nowiki>A real currency type</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01181.php <nowiki>Money type todos?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|MONEY values dump in a locale-specific format, making it difficult to restore them to a system with a different locale}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Text Search ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow dictionaries to change the token that is passed on to later dictionaries<br />
* [http://archives.postgresql.org/pgsql-patches/2007-11/msg00081.php <nowiki>a tsearch2 (8.2.4) dictionary that only filters out stopwords</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a function-based API for '@@' searches<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00511.php <nowiki>Simplifying Text Search</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve text search error messages<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00966.php <nowiki>Poorly designed tsearch NOTICEs</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg01146.php <nowiki>Re: Poorly designed tsearch NOTICEs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider changing error to warning for strings larger than one megabyte<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-02/msg00190.php <nowiki>BUG #3975: tsearch2 index should not bomb out of 1Mb limit</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-03/msg00062.php <nowiki>Re: [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|tsearch and tsdicts regression tests fail in Turkish locale on glibc<br />
* [http://archives.postgresql.org/message-id/49749645.5070801@gmx.net tsearch with Turkish locale]<br />
}}<br />
<br />
{{TodoItem<br />
|tsquery negator operator treated as part of lexeme<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-06/msg00346.php BUG #4887: inclusion operator (@>) on tsqeries behaves not conforming to documentation]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of dash and plus signs in email address user names, and perhaps improve URL parsing<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00772.php<br />
* [http://archives.postgresql.org/message-id/E1Ri8il-0008Ct-9p@wrigleys.postgresql.org tsearch does not recognize all valid emails]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve default parser, to more easily allow adding new tokens<br />
* http://archives.postgresql.org/message-id/23485.1297727826@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Add additional support functions<br />
* http://archives.postgresql.org/pgsql-hackers/2011-06/msg00319.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== XML ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow XML arrays to be cast to other data types<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00981.php <nowiki>proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00231.php <nowiki>Re: proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00471.php <nowiki>Re: proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add XML Schema validation and xmlvalidate functions (SQL:2008)}}<br />
<br />
{{TodoItem<br />
|Add xmlvalidatedtd variant to support validating against a DTD?}}<br />
<br />
{{TodoItem<br />
|Relax-NG validation; libxml2 supports this already}}<br />
<br />
{{TodoItem<br />
|Allow reliable XML operation with non-UTF8 server encodings (xpath(), in particular, is known not to work)<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-01/msg00135.php <nowiki>BUG #4622: xpath only work in utf-8 server encoding</nowiki>] <br />
* http://archives.postgresql.org/message-id/4110.1238973350@sss.pgh.pa.us}}<br />
<br />
{{TodoItem<br />
|Add functions from SQL:2006: XMLDOCUMENT, XMLCAST, XMLTEXT}}<br />
<br />
{{TodoItem<br />
|Add XMLNAMESPACES support in XMLELEMENT and elsewhere}}<br />
<br />
{{TodoItem<br />
|Move XSLT from contrib/xml2 to a more reasonable location<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00539.php<br />
}}<br />
<br />
{{TodoItem<br />
|Report errors returned by the XSLT library<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00562.php<br />
}}<br />
<br />
{{TodoItem<br />
|Improve the XSLT parameter passing API<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00416.php<br />
}}<br />
<br />
{{TodoItem<br />
|XML Canonical: Convert XML documents to canonical form to compare them. libxml2 has support for this.}}<br />
<br />
{{TodoItem<br />
|Add pretty-printed XML output option<br />
|Parse a document and serialize it back in some indented form. libxml2 might support this.}}<br />
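As a point of comparison, Python's standard library already performs this kind of parse-and-reserialize round trip; something similar could presumably be layered over libxml2:<br />

```python
from xml.dom import minidom

# Parse a compact document, then serialize it back with indentation.
doc = minidom.parseString("<root><a/><b/></root>")
pretty = doc.toprettyxml(indent="  ")
```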
<br />
{{TodoItem<br />
|Add XMLQUERY (from the SQL/XML standard)}}<br />
<br />
{{TodoItem<br />
|Allow XML shredding<br />
|In some cases shredding could be the better option (e.g. when there is no need to keep XML documents in their entirety, or when existing tools understand only relational data). This would be a separate module implementing an annotated-schema decomposition technique, similar to DB2 and SQL Server functionality.}}<br />
<br />
{{TodoItem<br />
|Fix nested or repeated xpath() calls, which apparently mess up namespaces [http://archives.postgresql.org/pgsql-bugs/2008-03/msg00097.php] [http://archives.postgresql.org/pgsql-bugs/2008-03/msg00144.php] [http://archives.postgresql.org/pgsql-general/2008-03/msg00295.php] [http://archives.postgresql.org/pgsql-bugs/2008-07/msg00054.php] [http://archives.postgresql.org/message-id/004f01c90e91$138e9d10$3aabd730$@anstett@iaas.uni-stuttgart.de]}}<br />
<br />
{{TodoItem<br />
|XPath: Adding the <x> at the root causes problems [http://archives.postgresql.org/pgsql-bugs/2008-05/msg00184.php] [http://archives.postgresql.org/pgsql-bugs/2008-07/msg00054.php] [http://archives.postgresql.org/pgsql-general/2008-07/msg00613.php]}}<br />
<br />
{{TodoItem<br />
|xpath_table needs to be implemented/implementable to get rid of contrib/xml2 [http://archives.postgresql.org/pgsql-general/2008-05/msg00823.php]}}<br />
<br />
{{TodoItem<br />
|xpath_table is pretty broken anyway [http://archives.postgresql.org/pgsql-hackers/2010-02/msg02424.php]}}<br />
<br />
{{TodoItem<br />
|better handling of XPath data types [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00616.php] [http://archives.postgresql.org/message-id/004a01c90e90$4b986d90$e2c948b0$@anstett@iaas.uni-stuttgart.de]}}<br />
<br />
{{TodoItem<br />
|Improve handling of PIs and DTDs in xmlconcat() [http://archives.postgresql.org/message-id/200904211211.n3LCB09p008988@wwwmaster.postgresql.org]}}<br />
<br />
{{TodoItem<br />
|Restructure XML and /contrib/xml2 functionality<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02314.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg00017.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Functions ==<br />
<br />
{{TodoItem<br />
|Allow INET subnet comparisons using non-constants to be indexed}}<br />
<br />
{{TodoItem<br />
|Add an INET overlaps operator, for use by exclusion constraints <br />
* http://archives.postgresql.org/pgsql-hackers/2010-03/msg00845.php<br />
}}<br />
<br />
{{TodoItem<br />
|Enforce typmod for function inputs, function results and parameters for spi_prepare'd statements called from PLs<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg01403.php <nowiki>Re: BUG #2917: spi_prepare doesn't accept typename aliases</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg01160.php <nowiki>RFC for adding typmods to functions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix IS OF so it matches the ISO specification, and add documentation<br />
* [http://archives.postgresql.org/pgsql-patches/2003-08/msg00060.php <nowiki>Re: [HACKERS] IS OF</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00060.php <nowiki>ToDo: add documentation for operator IS OF</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Implement Boyer-Moore searching in LIKE queries<br />
* {{messageLink|27645.1220635769@sss.pgh.pa.us|TODO item: Implement Boyer-Moore searching (First time hacker)}}<br />
}}<br />
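For context, the Boyer-Moore family gains its speed by skipping ahead on mismatches instead of advancing one character at a time. A minimal sketch of the simpler Horspool variant (illustrative only, not the patch under discussion):<br />

```python
def bmh_find(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text,
    or -1, using the Boyer-Moore-Horspool bad-character heuristic."""
    m = len(pattern)
    if m == 0:
        return 0
    # On a mismatch, skip by the distance from the last occurrence of
    # the aligned text character to the end of the pattern.
    skip = {ch: m - i - 1 for i, ch in enumerate(pattern[:-1])}
    i = m - 1
    while i < len(text):
        j, k = m - 1, i
        while j >= 0 and text[k] == pattern[j]:
            j -= 1
            k -= 1
        if j < 0:
            return k + 1
        i += skip.get(text[i], m)
    return -1
```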
<br />
{{TodoItem<br />
|Prevent malicious functions from being executed with the permissions of unsuspecting users<br />
|Index functions are safe, so VACUUM and ANALYZE are safe too. Triggers, CHECK and DEFAULT expressions, and rules are still vulnerable. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00268.php <nowiki>Some notes about the index-functions security vulnerability</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce memory usage of aggregates in set returning functions<br />
* [http://archives.postgresql.org/pgsql-performance/2008-01/msg00031.php <nowiki>Re: Performance of aggregates over set-returning functions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix /contrib/ltree operator<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-11/msg00044.php <nowiki>BUG #3720: wrong results at using ltree</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix /contrib/btree_gist's implementation of inet indexing<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-10/msg00099.php <nowiki>BUG #5705: btree_gist: Index on inet changes query result</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|<nowiki>Fix inconsistent precedence of =, &gt;, and &lt; compared to &lt;&gt;, &gt;=, and &lt;=</nowiki><br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00145.php <nowiki>BUG #3822: Nonstandard precedence for comparison operators</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix regular expression bug when using complex back-references<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00000.php <nowiki>BUG #3645: regular expression back references seem broken</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have /contrib/dblink reuse unnamed connections<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00895.php <nowiki>dblink un-named connection doesn't get re-used</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve formatting of pg_get_viewdef() output<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg01648.php <nowiki>pg_get_viewdef formattiing</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01885.php <nowiki>Re: pretty print viewdefs</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-12/msg00906.php reprise: pretty print viewdefs]<br />
}}<br />
<br />
{{TodoItem<br />
|Add function to dump pg_depend information cleanly<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00226.php <nowiki>Elementary dependency look-up</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add function to allow easier transaction id comparisons<br />
* http://archives.postgresql.org/pgsql-hackers/2011-11/msg00786.php<br />
}}<br />
<br />
=== Character Formatting ===<br />
<br />
{{TodoSubsection}}<br />
{{TodoItem<br />
|Allow to_date() and to_timestamp() to accept localized month names}}<br />
<br />
{{TodoItem<br />
|Add missing parameter handling in to_char()<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg00948.php <nowiki>Re: to_char and i18n</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Throw an error from to_char() instead of printing a string of "#" when a number doesn't fit in the desired output format.<br />
* discussed in [http://archives.postgresql.org/message-id/37ed240d0907290836w42187222n18664dfcbcb445b1@mail.gmail.com "to_char, support for EEEE format"]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow to_char() on interval values to accumulate the highest unit requested<br />
|2= Some special format flag would be required to request such accumulation. Such functionality could also be added to EXTRACT. Prevent accumulation that crosses the month/day boundary because of the uneven number of days in a month.<br />
* to_char(INTERVAL '1 hour 5 minutes', 'MI') =&gt; 65<br />
* to_char(INTERVAL '43 hours 20 minutes', 'MI' ) =&gt; 2600<br />
* to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') =&gt; 0:1:19:20<br />
* to_char(INTERVAL '3 years 5 months','MM') =&gt; 41<br />
}}<br />
<br />
{{TodoItem<br />
|Fix to_number() handling for values not matching the format string<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg01447.php <nowiki>Re: numeric_to_number() function skipping some digits</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Multi-Language Support ==<br />
<br />
{{TodoItem<br />
|Add NCHAR (as distinguished from ordinary varchar)}}<br />
<br />
{{TodoItem<br />
|Add a cares-about-collation column to pg_proc, so that unresolved-collation errors can be thrown at parse time<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01520.php <nowiki>Open issues for collations</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Integrate collations with text search configurations<br />
* [http://archives.postgresql.org/message-id/28887.1303579034@sss.pgh.pa.us <nowiki>Some TODO items for collations</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Integrate collations with to_char() and related functions<br />
* [http://archives.postgresql.org/message-id/28887.1303579034@sss.pgh.pa.us <nowiki>Some TODO items for collations</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support collation-sensitive equality and hashing functions<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-06/msg00472.php <nowiki> contrib/citext versus collations</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a LOCALE option to CREATE DATABASE, as a shorthand<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00119.php <nowiki> Re: 8.4 open items list</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support multiple simultaneous character sets, per SQL:2008}}<br />
<br />
{{TodoItem<br />
|Improve UTF8 combined character handling?}}<br />
<br />
{{TodoItem<br />
|Add octet_length_server() and octet_length_client()}}<br />
<br />
{{TodoItem<br />
|Make octet_length_client() the same as octet_length()?}}<br />
<br />
{{TodoItem<br />
|Fix problems with wrong runtime encoding conversion for NLS message files}}<br />
<br />
{{TodoItem<br />
|Add URL to more complete multi-byte regression tests<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-07/msg00272.php <nowiki>Multi-byte and client side character encoding tests for copy command..</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix contrib/fuzzystrmatch to work with multibyte encodings<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-04/msg00047.php <nowiki> soundex function returns UTF-16 characters</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00138.php <nowiki> dmetaphone woes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Change memory allocation for multi-byte functions so memory is allocated inside conversion functions<br />
|Currently we preallocate memory based on worst-case usage.}}<br />
<br />
{{TodoItem<br />
|Add ability to use case-insensitive regular expressions on multi-byte characters<br />
|Currently it works for UTF-8, but not for other multi-byte encodings.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00433.php <nowiki>Regexps vs. locale</nowiki>]<br />
* {{MessageLink|20091201210024.B1393753FB7@cvs.postgresql.org|A partial solution for UTF-8}}<br />
}}<br />
<br />
{{TodoItem<br />
|Improve encoding of connection startup messages sent to the client<br />
|Currently some authentication error messages are sent in the server encoding<br />
* [http://archives.postgresql.org/pgsql-general/2008-12/msg00801.php <nowiki>encoding of PostgreSQL messages</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-general/2009-01/msg00005.php <nowiki>Re: encoding of PostgreSQL messages</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|More sensible support for Unicode combining characters, normal forms<br />
* http://archives.postgresql.org/message-id/200904141532.44618.peter_e@gmx.net<br />
}}<br />
<br />
== Views and Rules ==<br />
{{TodoItemDone<br />
|Automatically create rules on views so they are updateable, per SQL:2008<br />
|We can only auto-create rules for simple views. For more complex cases users will still have to write rules manually.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00586.php <nowiki>Proposal for updatable views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-08/msg00255.php <nowiki>Updatable views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg01746.php <nowiki>Re: [COMMITTERS] pgsql: Automatic view update rules Bernd Helmle</nowiki>]<br />
* http://wiki.postgresql.org/wiki/Updatable_views<br />
* http://archives.postgresql.org/pgsql-hackers/2012-07/msg00035.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-08/msg00303.php<br />
}}<br />
{{TodoItem<br />
|Add the functionality of the WITH CHECK OPTION clause to CREATE VIEW<br />
}}<br />
{{TodoItem<br />
|Allow VIEW/RULE recompilation when the underlying tables change<br />
|This is both difficult and controversial.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01723.php Re: About "Allow VIEW/RULE recompilation when the underlying tables change"]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01724.php Re: About "Allow VIEW/RULE recompilation when the underlying tables change2"]<br />
* [http://archives.postgresql.org/message-id/CACk%3DU9NFSzWrEba8G5dZ%3DTZLy3_hx3QXGyCcKVWT%3D4iA1FjMuA@mail.gmail.com VIEW still referring to old name of field]<br />
}}<br />
{{TodoItem<br />
|Make it possible to use RETURNING together with conditional DO INSTEAD rules, such as for partitioning setups<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00577.php <nowiki>RETURNING and DO INSTEAD ... Intentional or not?</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add materialized views<br />
}}<br />
<br />
{{TodoItem<br />
|Improve ability to modify views via ALTER TABLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00691.php <nowiki>Re: idea: storing view source in system catalogs</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg01410.php <nowiki>modifying views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg00300.php <nowiki>Re: patch: Add columns via CREATE OR REPLACE VIEW</nowiki>]<br />
}}<br />
<br />
== SQL Commands ==<br />
<br />
{{TodoItem<br />
|Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT<br />
* [http://dissipatedheat.com/2011/11/10/how-not-to-write-a-patch-for-postgresql/ How not to write this patch.]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve type determination of unknown (NULL or quoted literal) result columns for UNION/INTERSECT/EXCEPT<br />
* [http://archives.postgresql.org/message-id/9799.1302719551@sss.pgh.pa.us <nowiki>UNION construct type cast gives poor error message</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php <nowiki>WIP: grouping sets support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00466.php <nowiki>Implementation of GROUPING SETS (T431: Extended grouping capabilities)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow prepared transactions with temporary tables created and dropped in the same transaction, and when an ON COMMIT DELETE ROWS temporary table is accessed<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00047.php <nowiki>Re: &quot;could not open relation 1663/16384/16584: No such file or directory&quot; in a specific combination of transactions with temp tables</nowiki>]<br />
* [http://archives.postgresql.org/message-id/492543D5.9050904@enterprisedb.com A suggestion on how to implement this]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a GUC variable to warn about non-standard SQL usage in queries}}<br />
<br />
{{TodoItem<br />
|Add SQL-standard MERGE/REPLACE/UPSERT command<br />
|MERGE is typically used to merge two tables. A REPLACE or UPSERT command performs an UPDATE or, on failure, an INSERT. See [[SQL MERGE]] for notes on the implementation details.<br />
}}<br />
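<br />
As a sketch of the requested SQL-standard syntax (table and column names are invented for illustration; none of this is accepted by current PostgreSQL):<br />
```sql
-- SQL-standard MERGE: update matching rows, insert the rest.
MERGE INTO inventory AS t
USING staged_changes AS s
ON t.item_id = s.item_id
WHEN MATCHED THEN
    UPDATE SET qty = t.qty + s.qty
WHEN NOT MATCHED THEN
    INSERT (item_id, qty) VALUES (s.item_id, s.qty);
```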
<br />
{{TodoItem<br />
|Add NOVICE output level for helpful messages<br />
|For example, have it warn about unjoined tables. This could also control automatic sequence/index creation messages.<br />
}}<br />
<br />
{{TodoItem<br />
|Allow NOTIFY in rules involving conditionals}}<br />
<br />
{{TodoItem<br />
|Allow EXPLAIN to identify tables that were skipped because of constraint_exclusion<br />
}}<br />
<br />
{{TodoItem<br />
|Simplify dropping roles that have objects in several databases}}<br />
<br />
{{TodoItem<br />
|Allow the count returned by SELECT, etc to be represented as an int64 to allow a higher range of values}}<br />
<br />
{{TodoItem<br />
|Add support for WITH RECURSIVE ... CYCLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00291.php <nowiki>WITH RECURSIVE ... CYCLE in vanilla SQL: issues with arrays of rows</nowiki>]}}<br />
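<br />
The SQL-standard CYCLE clause being requested would look roughly like this (identifiers invented for illustration; not accepted at the time of this item):<br />
```sql
WITH RECURSIVE search_graph(id, link) AS (
    SELECT g.id, g.link FROM graph g
  UNION ALL
    SELECT g.id, g.link
    FROM graph g JOIN search_graph sg ON g.id = sg.link
)
-- mark rows that revisit an already-seen id instead of recursing forever
CYCLE id SET is_cycle TO true DEFAULT false USING path
SELECT * FROM search_graph;
```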
<br />
{{TodoItem<br />
|Add DEFAULT .. AS OWNER so permission checks are done as the table owner<br />
|This would be useful for SERIAL nextval() calls and CHECK constraints.}}<br />
<br />
{{TodoItem<br />
|Allow DISTINCT to work in multiple-argument aggregate calls}}<br />
<br />
{{TodoItem<br />
|Add comments on system tables/columns using the information in catalogs.sgml<br />
|Ideally the information would be pulled from the SGML file automatically.}}<br />
<br />
{{TodoItem<br />
|Prevent the specification of conflicting transaction read/write options<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00684.php <nowiki>Re: SET TRANSACTION and SQL Standard</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Support LATERAL subqueries<br />
|Lateral subqueries can reference columns of tables defined outside the subquery at the same level, i.e. ''laterally''.<br />
For example, a LATERAL subquery in a FROM clause could reference tables defined in the same FROM clause.<br />
Currently only the columns of tables defined ''above'' subqueries are recognized.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00292.php <nowiki>LATERAL</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00991.php <nowiki>Re: LATERAL</nowiki>]<br />
* [http://archives.postgresql.org/message-id/4F5AA202.9020906@gmail.com lateral function as a subquery - WIP patch]<br />
}}<br />
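<br />
With this implemented, a FROM-clause subquery can refer sideways to an earlier FROM item, for example (names invented for illustration):<br />
```sql
-- For each manufacturer, fetch its three most recent products.
SELECT m.name, recent.*
FROM manufacturers m,
     LATERAL (SELECT p.product_name, p.release_date
              FROM products p
              WHERE p.manufacturer_id = m.id
              ORDER BY p.release_date DESC
              LIMIT 3) AS recent;
```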
<br />
{{TodoItemDone<br />
|Prevent temporary tables created with ON COMMIT DELETE ROWS from repeatedly truncating the table on every commit if the table is already empty<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg00842.php<br />
* http://archives.postgresql.org/pgsql-performance/2010-03/msg00392.php<br />
* http://archives.postgresql.org/pgsql-performance/2010-04/msg00046.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow DELETE and UPDATE to be used with LIMIT and ORDER BY<br />
* http://archives.postgresql.org/pgadmin-hackers/2010-04/msg00078.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg01997.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00021.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow PREPARE of cursors}}<br />
<br />
{{TodoItem<br />
|Have DISCARD PLANS discard plans cached by functions<br />
|DISCARD ALL should do the same.<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00431.php<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid multiple-evaluation of BETWEEN and IN arguments containing volatile expressions<br />
* http://archives.postgresql.org/message-id/4D95B605.2020709@enterprisedb.com<br />
}}<br />
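<br />
The problem can be seen from how BETWEEN is currently expanded; a volatile left-hand expression is evaluated once per comparison (a sketch of the behavior, not a proposed fix):<br />
```sql
-- This:
SELECT random() BETWEEN 0.25 AND 0.75;
-- currently behaves like this, with two independent draws of random():
SELECT random() >= 0.25 AND random() <= 0.75;
```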
<br />
{{TodoItem<br />
|Fix nested CASE-WHEN constructs<br />
* http://archives.postgresql.org/message-id/4DDCEEB8.50602@enterprisedb.com<br />
}}<br />
<br />
=== CREATE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow CREATE TABLE AS to determine column lengths for complex expressions like SELECT col1 || col2}}<br />
<br />
{{TodoItem<br />
|Have WITH CONSTRAINTS also create constraint indexes<br />
* [http://archives.postgresql.org/pgsql-patches/2007-04/msg00149.php <nowiki>Re: CREATE TABLE LIKE INCLUDING INDEXES support</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Move NOT NULL constraint information to pg_constraint<br />
|Currently NOT NULL constraints are stored in pg_attribute without any designation of their origins, e.g. primary keys. One manifest problem is that dropping a PRIMARY KEY constraint does not remove the NOT NULL constraint designation. Another issue is that we should probably force NOT NULL to be propagated from parent tables to children, just as CHECK constraints are. (But then does dropping PRIMARY KEY affect children?)<br />
* http://archives.postgresql.org/message-id/19768.1238680878@sss.pgh.pa.us<br />
* http://archives.postgresql.org/message-id/200909181005.n8IA5Ris061239@wwwmaster.postgresql.org<br />
* http://archives.postgresql.org/pgsql-hackers/2011-07/msg01223.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-07/msg00358.php<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent concurrent CREATE TABLE from sometimes returning a cryptic error message<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00169.php <nowiki>BUG #3692: Conflicting create table statements throw unexpected error</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add CREATE SCHEMA ... LIKE that copies a schema}}<br />
<br />
{{TodoItem<br />
|Fix CREATE OR REPLACE FUNCTION to not leave objects depending on the function in inconsistent state<br />
* [http://archives.postgresql.org/pgsql-general/2008-08/msg00985.php indexes on functions and create or replace function]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow temporary tables to exist as empty by default in all sessions<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00006.php <nowiki>what is difference between LOCAL and GLOBAL TEMP TABLES in PostgreSQL</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg01329.php <nowiki>idea: global temp tables</nowiki>]<br />
* [http://archives.postgresql.org//pgsql-hackers/2009-05/msg00016.php <nowiki>Re: idea: global temp tables</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg01098.php <nowiki>global temporary tables</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2012-04/msg01148.php Temporary tables under hot standby]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow the creation of "distinct" types<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01647.php <nowiki>Distinct types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider analyzing temporary tables when they are first used in a query<br />
|Autovacuum cannot analyze or vacuum temporary tables.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00416.php <nowiki>autovacuum and temp tables support</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow an unlogged table to be changed to logged<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00315.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00437.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00323.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-06/msg00237.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== UPDATE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|<nowiki>Allow UPDATE tab SET ROW (col, ...) = (SELECT...)</nowiki><br />
* [http://archives.postgresql.org/pgsql-hackers/2006-07/msg01308.php <nowiki>Re: [PATCHES] extension for sql update</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00865.php <nowiki>UPDATE using sub selects</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-04/msg00315.php <nowiki>UPDATE using sub selects</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-03/msg00237.php <nowiki>Re: UPDATE using sub selects</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Research self-referential UPDATEs that see inconsistent row versions in read-committed mode<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00507.php <nowiki>Concurrently updating an updatable view</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00016.php <nowiki>Re: Do we need a TODO? (was Re: Concurrently updating anupdatable view)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve performance of EvalPlanQual mechanism that rechecks already-updated rows<br />
|This is related to the previous item, which questions whether it even has the right semantics.<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-09/msg00045.php <nowiki>BUG #4401: concurrent updates to a table blocks one update indefinitely</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-07/msg00302.php <nowiki>BUG #4945: Parallel update(s) gone wild</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ALTER ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have ALTER TABLE RENAME of a SERIAL column rename the sequence<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
* [http://archives.postgresql.org/message-id/CADLWmXUV4LbLhMZL8rYMhCy72aZZLB5BSARPQVgoX0BrxA0FFg@mail.gmail.com renaming implicit sequences]<br />
}}<br />
<br />
{{TodoItem<br />
|Have ALTER SEQUENCE RENAME rename the sequence name stored in the sequence table<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-09/msg00092.php <nowiki>BUG #3619: Renaming sequence does not update its 'sequence_name' field</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00007.php <nowiki>Re: BUG #3619: Renaming sequence does not update its 'sequence_name' field</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ALTER DOMAIN to modify the underlying data type}}<br />
<br />
{{TodoItem<br />
|Allow ALTER TABLESPACE to move the tablespace to different directories}}<br />
<br />
{{TodoItem<br />
|Allow moving system tables to other tablespaces, where possible<br />
|Currently non-global system tables must be in the default database tablespace. Global system tables can never be moved.}}<br />
<br />
{{TodoItem<br />
|Have ALTER INDEX update the name of a constraint using that index}}<br />
<br />
{{TodoItem<br />
|Allow column display reordering by recording a display, storage, and permanent id for every column?<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00782.php <nowiki>Re: column ordering, was Re: [PATCHES] Enums patch v2</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg01029.php <nowiki>Column reordering in pg_dump</nowiki>]<br />
* http://archives.postgresql.org/message-id/1324412114-sup-9608@alvh.no-ip.org<br />
}}<br />
<br />
{{TodoItem<br />
|Allow deactivating (and reactivating) indexes via ALTER TABLE<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg01191.php<br />
}}<br />
<br />
{{TodoItem<br />
|Add ALTER OPERATOR ... RENAME<br />
|Needs to consider the effects of changing operator precedence.<br />
* [http://archives.postgresql.org/message-id/1322948781.26266.9.camel@vanquo.pezone.net Missing rename support]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add ALTER TABLE ... RENAME RULE<br />
* [http://archives.postgresql.org/message-id/1322948781.26266.9.camel@vanquo.pezone.net Missing rename support]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== CLUSTER ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Automatically maintain clustering on a table<br />
|This might require some background daemon to maintain clustering during periods of low usage. It might also require tables to be only partially filled for easier reorganization. Another idea would be to create a merged heap/index data file so an index lookup would automatically access the heap data too. A third idea would be to store heap rows in hashed groups, perhaps using a user-supplied hash function.<br />
* [http://archives.postgresql.org/pgsql-performance/2004-08/msg00350.php <nowiki>Equivalent praxis to CLUSTERED INDEX?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00155.php <nowiki>Re: Grouped Index Tuples</nowiki>]<br />
* http://community.enterprisedb.com/git/<br />
* [http://archives.postgresql.org/pgsql-performance/2009-10/msg00346.php <nowiki>Re: maintain_cluster_order_v5.patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Allow CLUSTER to be used on a partial index<br />
* http://www.postgresql.org/message-id/CAMkU%3D1zYwoHHsqJ8wfK3GdG_t_a6t4RK-GFDSKymQ0EGP%3DtypA@mail.gmail.com<br />
}} <br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== COPY ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow COPY to report error lines and continue<br />
|This requires the use of a savepoint before each COPY line is processed, with ROLLBACK on COPY failure. <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00572.php <nowiki>Re: VLDB Features</nowiki>]<br />
}}<br />
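<br />
The per-line savepoint scheme described above would amount to a client-driven loop like the following sketch (table name invented for illustration):<br />
```sql
BEGIN;
-- for each parsed input line:
SAVEPOINT copy_line;
INSERT INTO target_table VALUES ('field1', 'field2');
-- if the INSERT raised an error, report the line and recover with:
ROLLBACK TO SAVEPOINT copy_line;
-- otherwise discard the savepoint:
RELEASE SAVEPOINT copy_line;
-- when all lines are done:
COMMIT;
```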
<br />
{{TodoItem<br />
|Allow COPY FROM to create index entries in bulk<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00811.php <nowiki>Batch update of indexes on data loading</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY in CSV mode to control whether a quoted zero-length string is treated as NULL<br />
|Currently this is always treated as a zero-length string, which generates an error when loading into an integer column.<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00905.php <nowiki>Re: [PATCHES] allow CSV quote in NULL</nowiki>]<br />
}}<br />
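<br />
A sketch of the behavior in question (table and column names invented for illustration):<br />
```sql
-- With NULL '', an unquoted empty field becomes NULL, but a quoted
-- one is always a zero-length string, so the second line fails for
-- an integer column; the request is an option to treat "" as NULL.
COPY items (id, qty) FROM stdin WITH (FORMAT csv, NULL '');
1,
2,""
\.
```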
<br />
{{TodoItem<br />
|Improve COPY performance<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00954.php <nowiki>Re: 8.3 / 8.2.6 restore comparison</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01882.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY to report errors sooner<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01169.php <nowiki>Timely reporting of COPY errors</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY to handle other number formats<br />
|For example, the German notation. The best approach would be something like WITH DECIMAL ','.<br />
}}<br />
<br />
{{TodoItem<br />
|Allow a stalled COPY to exit if the backend is terminated<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-04/msg00067.php <nowiki>Re: possible bug not in open items</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== GRANT/REVOKE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow SERIAL sequences to inherit permissions from the base table?}}<br />
<br />
{{TodoItem<br />
|Allow dropping of a role that has connection rights<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00736.php <nowiki>DROP ROLE dependency tracking ...</nowiki>]<br />
}}<br />
{{TodoEndSubsection}}<br />
<br />
=== DECLARE CURSOR ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Prevent DROP TABLE from dropping a table referenced by its own open cursor?}}<br />
<br />
{{TodoItem<br />
|Provide some guarantees about the behavior of cursors that invoke volatile functions<br />
* [http://archives.postgresql.org/message-id/20997.1244563664@sss.pgh.pa.us Re: Cursor with hold emits the same row more than once across commits in 8.3.7]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== INSERT ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow INSERT/UPDATE of the system-generated oid value for a row}}<br />
<br />
{{TodoItemDone<br />
|In rules, allow VALUES() to contain a mixture of 'old' and 'new' references}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== SHOW/SET ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM ANALYZE, and CLUSTER}}<br />
<br />
{{TodoItem<br />
|Rationalize the discrepancy between settings that use values in bytes and SHOW that returns the object count<br />
* [http://archives.postgresql.org/pgsql-docs/2008-07/msg00007.php <nowiki>Re: [ADMIN] shared_buffers and shmmax</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ANALYZE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage}}<br />
<br />
{{TodoItem<br />
|Have EXPLAIN ANALYZE report rows as floating-point numbers<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg01363.php <nowiki>explain analyze rows=%.0f</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00108.php <nowiki>Re: explain analyze rows=%.0f</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve how ANALYZE computes in-doubt tuples<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00771.php <nowiki>VACUUM/ANALYZE counting of in-doubt tuples</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Window Functions ===<br />
See {{messageLink|357.1230492361@sss.pgh.pa.us|TODO items for window functions}}.<br />
{{TodoSubsection}}<br />
{{TodoItem<br />
|Support creation of user-defined window functions<br />
|We have the ability to create new window functions written in C. Is it<br />
worth the effort to create an API that would let them be written in PL/pgSQL, etc.?}}<br />
<br />
{{TodoItem<br />
|Implement full support for window framing clauses<br />
|In addition to the clauses described as implemented in the [http://developer.postgresql.org/pgdocs/postgres/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS latest documentation], the following clauses are not yet supported.<br />
* RANGE BETWEEN ... PRECEDING/FOLLOWING<br />
* EXCLUDE<br />
}}<br />
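<br />
Examples of the missing clauses, per the SQL-standard syntax (table and column names invented; not accepted at the time of this item):<br />
```sql
-- RANGE with an offset PRECEDING/FOLLOWING:
SELECT sum(amount) OVER (ORDER BY paid_at
                         RANGE BETWEEN INTERVAL '1 hour' PRECEDING
                               AND CURRENT ROW)
FROM payments;

-- Frame exclusion:
SELECT avg(amount) OVER (ORDER BY paid_at
                         ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
                         EXCLUDE CURRENT ROW)
FROM payments;
```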
<br />
{{TodoItem<br />
|Investigate tuplestore performance issues<br />
|The tuplestore_in_memory() thing is just a band-aid; we ought to solve it properly. tuplestore_advance seems like a weak spot as well.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00152.php <nowiki>tuplestore potential performance problem</nowiki>]<br />
}}<br />
<br />
{{TodoItem|Do we really need so much duplicated code between Agg and WindowAgg?}}<br />
<br />
{{TodoItem<br />
|Teach planner to evaluate multiple windows in the optimal order<br />
|Currently windows are always evaluated in the query-specified order.<br />
* http://archives.postgresql.org/message-id/3CDAD71E9D70417290FCF66F0178D1E1@amd64<br />
}}<br />
<br />
{{TodoItem<br />
|Implement DISTINCT clause in window aggregates<br />
|Some proprietary RDBMSs have implemented it already, so it helps with porting from those.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Integrity Constraints ==<br />
=== Keys ===<br />
<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Improve deferrable unique constraints for cases with many conflicts<br />
|The current implementation fires a trigger for each potentially conflicting row. This might not scale well for an update that changes many key values at once.<br />
}}<br />
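<br />
A case with many simultaneous conflicts, for illustration (names invented):<br />
```sql
CREATE TABLE seats (seat_no int UNIQUE DEFERRABLE INITIALLY DEFERRED);
-- Shifting every key produces a transient conflict for almost every
-- row, so one recheck trigger fires per potentially conflicting row:
UPDATE seats SET seat_no = seat_no + 1;
```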
<br />
{{TodoEndSubsection}}<br />
<br />
=== Referential Integrity ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add MATCH PARTIAL referential integrity}}<br />
<br />
{{TodoItem<br />
|Change foreign key constraint for array -&gt; element to mean element in array?<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-10/msg01814.php <nowiki>foreign keys for array/period contains relationships</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix problem when cascading referential triggers make changes on cascaded tables, seeing the tables in an intermediate state<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00174.php <nowiki>Re: [PATCHES] Work-in-progress referential action trigger timing</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Are ri_KeysEqual checks in the RI enforcement triggers still necessary?<br />
* [http://archives.postgresql.org/pgsql-performance/2005-10/msg00458.php <nowiki>Re: Effects of cascading references in foreign keys</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Optimize referential integrity checks involving null values<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00744.php <nowiki>Can't ri_KeysEqual() consider two nulls as equal?</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Check Constraints ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Run check constraints only when affected columns are changed<br />
* http://archives.postgresql.org/message-id/1326055327.15293.13.camel@vanquo.pezone.net<br />
}}<br />
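<br />
For illustration (names invented), the CHECK below is currently re-evaluated even when an UPDATE leaves the constrained column untouched:<br />
```sql
CREATE TABLE orders (
    qty  integer CHECK (qty > 0),
    note text
);
-- Only note changes, yet the CHECK on qty is still run for each row:
UPDATE orders SET note = 'shipped';
```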
<br />
{{TodoEndSubsection}}<br />
<br />
== Server-Side Languages ==<br />
<br />
{{TodoItem<br />
|Add support for polymorphic arguments and return types to languages other than PL/PgSQL}}<br />
<br />
{{TodoItem<br />
|Add support for OUT and INOUT parameters to languages other than PL/PgSQL}}<br />
<br />
{{TodoItem<br />
|Add more fine-grained specification of functions taking arbitrary data types<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00367.php <nowiki>RfD: more powerful &quot;any&quot; types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Implement stored procedures<br />
|This might involve the control of transaction state and the return of multiple result sets<br />
* [http://archives.postgresql.org/pgsql-general/2008-10/msg00454.php <nowiki>PL/pgSQL stored procedure returning multiple result sets (SELECTs)?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01375.php <nowiki>Proposal: real procedures again (8.4)</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00542.php<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg01149.php <nowiki>Gathering specs and discussion on feature (post 9.1)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow holdable cursors in SPI}}<br />
<br />
{{TodoItemEasy<br />
|Add SPI_gettypmod() to return a field's typemod from a TupleDesc<br />
* http://archives.postgresql.org/pgsql-hackers/2005-11/msg00250.php<br />
}}<br />
<br />
=== SQL-Language Functions ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Rethink query plan caching and timing of parse analysis within SQL-language functions<br />
|They should work more like plpgsql functions do ...<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-05/msg00078.php <nowiki>Re: BUG #6019: invalid cached plan on inherited table</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/pgSQL ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]}}<br />
<br />
{{TodoItem<br />
|<nowiki>Allow listing of record column names, and access to record columns via variables, e.g. columns := r.(*), tval2 := r.(colname)</nowiki><br />
* [http://archives.postgresql.org/pgsql-patches/2005-07/msg00458.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-05/msg00302.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00031.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow row and record variables to be set to NULL constants, and allow NULL tests on such variables<br />
|Because a row is not scalar, do not allow assignment from NULL-valued scalars.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00070.php <nowiki>NULL and plpgsql rows</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider keeping separate cached copies when search_path changes<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg01009.php <nowiki>pl/pgsql Plan Invalidation and search_path</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of NULL row values vs. NULL rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg01758.php <nowiki>Null row vs. row of nulls in plpgsql</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg01973.php<br />
}}<br />
<br />
{{TodoItem<br />
|Improve PERFORM handling of WITH queries or document limitation<br />
* http://archives.postgresql.org/pgsql-bugs/2011-03/msg00309.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Perl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow regex operations in plperl using UTF8 characters in non-UTF8 encoded databases}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Python ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Develop a trusted variant of PL/Python.}}<br />
<br />
{{TodoItem<br />
|Create a new restricted execution class that will allow passing function arguments in as locals. Passing them as globals means functions cannot be called recursively.<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-02/msg01468.php <nowiki>Re: pl/python do not delete function arguments</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a DB-API compliant interface on top of the SPI interface<br />
* http://petereisentraut.blogspot.com/2011/11/plpydbapi-db-api-for-plpython.html<br />
}}<br />
<br />
{{TodoItem<br />
|For functions returning a setof record with a composite type, cache the I/O functions for the composite type<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg02007.php<br />
}}<br />
<br />
{{TodoItem<br />
|Fix loss of information during conversion of numeric type to Python float}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Tcl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add table function support}}<br />
<br />
{{TodoItem<br />
|Check encoding validity of values passed back to Postgres in function returns, trigger tuple changes, and SPI calls.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Clients ==<br />
<br />
{{TodoItem<br />
|Add a function like pg_get_indexdef() that reports more detailed index information<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00166.php <nowiki>BUG #3829: Wrong index reporting from pgAdmin III (v1.8.0 rev 6766-6767)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Split out pg_resetxlog output into pre- and post-sections<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg02040.php<br />
}}<br />
<br />
=== pg_ctl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Improve pg_ctl's detection of running postmasters<br />
* http://archives.postgresql.org/pgsql-hackers/2011-06/msg00000.php<br />
* http://archives.postgresql.org/pgsql-committers/2011-06/msg00001.php<br />
}}<br />
<br />
{{TodoItem<br />
|Add additional shutdown modes, and change the default?<br />
* http://archives.postgresql.org/pgsql-hackers/2012-04/msg01283.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== psql ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have psql \ds show all sequences and their settings<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00916.php <nowiki>Re: TODO item: Have psql show current values for a sequence</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00401.php <nowiki>Quick patch: Display sequence owner</nowiki>]<br />
}}<br />
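For context (sequence name and settings are illustrative): \ds lists sequences but not their settings, which today must be read by querying each sequence relation directly:<br />
```sql
CREATE SEQUENCE my_seq INCREMENT BY 2 MAXVALUE 1000;

-- \ds shows that my_seq exists, but not these settings:
SELECT last_value, increment_by, max_value, is_cycled FROM my_seq;
```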
<br />
{{TodoItem<br />
|Move psql backslash database information into the backend, use mnemonic commands?<br />
|This would allow non-psql clients to pull the same information out of the database as psql. <br />
* [http://archives.postgresql.org/pgsql-hackers/2004-01/msg00191.php <nowiki>Re: psql \d option list overloaded</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Make psql's \d commands more consistent in their handling of schemas<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00014.php <nowiki>Re: psql and schemas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Make psql's \d commands distinguish default privileges from no privileges<br />
|ACL displays were visibly different for the two cases before we "improved" them by using array_to_string.<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-05/msg00082.php <nowiki>BUG #6021: There is no difference between default and empty access privileges with \dp</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consistently display privilege information for all objects in psql}}<br />
<br />
{{TodoItemEasy<br />
|\s without arguments (display history) fails with libedit, and does not use the pager either<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-06/msg00114.php <nowiki> psql \s not working - OS X</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a \set variable to control whether \s displays line numbers<br />
|Another option is to add \# which lists line numbers, and allows command execution.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00255.php <nowiki>Re: psql possible TODO</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Include the symbolic SQLSTATE name in verbose error reports<br />
* [http://archives.postgresql.org/pgsql-general/2007-09/msg00438.php <nowiki>Re: Checking is TSearch2 query is valid</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add prompt escape to display the client and server versions<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00310.php <nowiki>WIP patch for TODO Item: Add prompt escape to display the client and server versions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add option to wrap column values at whitespace boundaries, rather than chopping them at a fixed width.<br />
|Currently, the &quot;wrapped&quot; format chops values into fixed widths. Perhaps the word wrapping could use the same algorithm documented in the W3C specification. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00404.php <nowiki>Re: psql wrapped format default for backslash-d commands</nowiki>]<br />
* http://www.w3.org/TR/CSS21/tables.html#auto-table-layout}}<br />
<br />
{{TodoItem<br />
|Support the ReST table output format<br />
|Details about the ReST format: http://docutils.sourceforge.net/rst.html#reference-documentation<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg01007.php <nowiki>Proposal: new border setting in psql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00518.php <nowiki>Re: Proposal: new border setting in psql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00609.php <nowiki>Re: Proposal: new border setting in psql</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add option to print advice for people familiar with other databases<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01845.php <nowiki>MySQL-ism help patch for psql</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ability to edit views with \ev<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00023.php <nowiki>Adding \ev view editor?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix FETCH_COUNT to handle SELECT ... INTO and WITH queries<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01565.php<br />
* http://archives.postgresql.org/pgsql-bugs/2010-05/msg00192.php<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent psql from sending remaining single-line multi-statement queries after reconnecting<br />
* http://archives.postgresql.org/pgsql-bugs/2010-05/msg00159.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01283.php<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Add \i option to bring in the specified file as a quoted literal<br />
|This would be useful for creating functions, among other things. Details still need to be worked out.<br />
* http://archives.postgresql.org/pgsql-bugs/2011-02/msg00016.php<br />
* http://archives.postgresql.org/pgsql-bugs/2011-02/msg00020.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having psql -c read .psqlrc, for consistency<br />
|psql -f already reads .psqlrc<br />
}}<br />
<br />
{{TodoItem<br />
|Allow processing of multiple -f (file) options<br />
* http://www.postgresql.org/message-id/AANLkTikFpzrTRl6392GhatQdwlCWQTXFdSMxh5CP51iv@mail.gmail.com<br />
}}<br />
<br />
{{TodoItem<br />
|Improve line drawing characters<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00386.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider improving the continuation prompt<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg01772.php<br />
}}<br />
<br />
{{TodoItem<br />
|Improve speed of tab completion by using LIKE<br />
* http://www.postgresql.org/message-id/20121012060345.GA29214@toroid.org<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== pg_dump / pg_restore ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|<nowiki>Add full object name to the tag field, e.g. for operators we need '=(integer, integer)', instead of just '='.</nowiki>}}<br />
<br />
{{TodoItemEasy<br />
|Modify pg_dump to create skeleton views for reload (which are then updated via CREATE OR REPLACE VIEW) when views have circular dependencies. This should eliminate the need for the CREATE RULE "_RETURN" hack currently used to address this issue. Thread and additional information here:<br />
* [http://www.postgresql.org/message-id/25554.1360895028@sss.pgh.pa.us Description of change]<br />
}}<br />
<br />
{{TodoItem<br />
|Add pg_dumpall custom format dumps?<br />
* [http://archives.postgresql.org/pgsql-general/2010-05/msg00509.php pg_dumpall custom format]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid using platform-dependent locale names in pg_dumpall output<br />
|Using native locale names puts roadblocks in the way of porting a dump to another platform. One possible solution is to get<br />
CREATE DATABASE to accept some agreed-on set of locale names and fix them up to meet the platform's requirements.<br />
* http://archives.postgresql.org/message-id/21396.1241716688@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Allow selection of individual object(s) of all types, not just tables}}<br />
<br />
{{TodoItem<br />
|In a selective dump, allow dumping of an object and all its dependencies}}<br />
<br />
{{TodoItem<br />
|Add options like pg_restore -l and -L to pg_dump}}<br />
<br />
{{TodoItemDone<br />
|Add support for multiple pg_restore -t options, like pg_dump}}<br />
<br />
{{TodoItem<br />
|Stop dumping CASCADE on DROP TYPE commands in clean mode}}<br />
<br />
{{TodoItem<br />
|Allow pg_dump --clean to drop roles that own objects or have privileges<br />
|tgl says: if this is about pg_dumpall, it's done as of 8.4. If it's really about pg_dump, what does it mean? pg_dump has no business dropping roles.}}<br />
<br />
{{TodoItem<br />
|Allow pg_restore to load different parts of the COPY data for a single table simultaneously}}<br />
<br />
{{TodoItem<br />
|Remove support for dumping from pre-7.3 servers<br />
|In 7.3 and later, we can get accurate dependency information from the server. pg_dump still contains a lot of crufty code<br />
to try to deal with the lack of dependency info in older servers, but the value of maintaining that code is diminishing.}}<br />
<br />
{{TodoItem<br />
|Refactor handling of database attributes between pg_dump and pg_dumpall<br />
|Currently only pg_dumpall emits database attributes, such as ALTER DATABASE SET commands and database-level GRANTs.<br />
Many people wish that pg_dump would do that. One proposal is to let pg_dump issue such commands if the -C switch was used,<br />
but it's unclear whether that will satisfy the demand.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg01031.php <nowiki>ALTER DATABASE vs pg_dump</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-05/msg00010.php summary of the issues]<br />
}}<br />
<br />
{{TodoItem<br />
|Change pg_dump so that a comment on the dumped database is applied to the loaded database, even if the database has a different name.<br />
|This will require new backend syntax, perhaps COMMENT ON CURRENT DATABASE. This is related to the previous item.}}<br />
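A sketch of the problem and the proposed syntax (database name and comment are illustrative; the CURRENT DATABASE form does not yet exist):<br />
```sql
-- pg_dump today must embed the source database's name, which breaks when
-- the dump is restored into a database with a different name:
COMMENT ON DATABASE olddb IS 'order-processing database';

-- Hypothetical syntax this item proposes, which would survive a rename:
-- COMMENT ON CURRENT DATABASE IS 'order-processing database';
```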
<br />
{{TodoItem<br />
|Allow parallel restore of tar dumps<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-02/msg01154.php <nowiki>Re: parallel restore</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Preserve sparse storage of large objects over dump/restore<br />
* [http://archives.postgresql.org/message-id/18789.1349750451@sss.pgh.pa.us <nowiki>TODO item: teach pg_dump about sparsely-stored large objects</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ecpg ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Docs<br />
|Document differences between ecpg and the SQL standard and information about the Informix-compatibility module.}}<br />
<br />
{{TodoItem<br />
|Solve cardinality &gt; 1 for input descriptors / variables?}}<br />
<br />
{{TodoItem<br />
|Add a semantic check level, e.g. check if a table really exists}}<br />
<br />
{{TodoItem<br />
|Fix handling of DB attributes that are arrays}}<br />
<br />
{{TodoItem<br />
|Fix nested C comments}}<br />
<br />
{{TodoItemEasy<br />
|sqlwarn[6] should be 'W' if a PRECISION or SCALE value is specified}}<br />
<br />
{{TodoItem<br />
|Make SET CONNECTION thread-aware, non-standard?}}<br />
<br />
{{TodoItem<br />
|Allow multidimensional arrays}}<br />
<br />
{{TodoItem<br />
|Implement COPY FROM STDIN}} <br />
<br />
{{TodoItem<br />
|Provide a way to specify size of a bytea parameter<br />
* [http://archives.postgresql.org/message-id/200906192131.n5JLVoMo044178@wwwmaster.postgresql.org <nowiki>BUG #4866: ECPG and BYTEA</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Fix small memory leaks in ecpg<br />
|Memory leaks in a short-running application like ecpg are not really a problem, but they make debugging more complicated}} <br />
<br />
{{TodoItem<br />
|Allow reuse of cursor name variables<br />
* [http://archives.postgresql.org/message-id/20100329113435.GA3430@feivel.credativ.lan <nowiki>Problems with variable cursorname in ecpg</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== libpq ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Prevent PQfnumber() from lowercasing unquoted column names<br />
|PQfnumber() should never have been doing lowercasing, but historically it has, so we need a way to prevent it}}<br />
<br />
{{TodoItem<br />
|Consider disallowing multiple queries in PQexec() as an additional barrier to SQL injection attacks<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg00184.php <nowiki>Re: InitPostgres and flatfiles question</nowiki>]<br />
}}<br />
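The attack this barrier would block: PQexec() accepts multiple semicolon-separated statements in one call, so injected text can smuggle in a second command (table and values are illustrative):<br />
```sql
-- An application naively builds:
--   "SELECT * FROM accounts WHERE name = '" + input + "'"
-- With input  x'; DROP TABLE accounts; --  a single PQexec() call runs:
SELECT * FROM accounts WHERE name = 'x'; DROP TABLE accounts; --'
```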
<br />
{{TodoItem<br />
|Add PQexecf() that allows complex parameter substitution<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01803.php <nowiki>Last minute mini-proposal (I know, know) for PQexecf()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add SQLSTATE and severity to errors generated within libpq itself<br />
* [http://archives.postgresql.org/pgsql-interfaces/2007-11/msg00015.php <nowiki>v8.1: Error severity on libpq PGconn*</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01425.php<br />
}}<br />
<br />
{{TodoItem<br />
|Add support for interface/ipaddress binding to libpq<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01811.php <nowiki>SR/libpq - outbound interface/ipaddress binding</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== HTTP ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow access to the database via HTTP<br />
|See [[HTTP_API]]}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Triggers ==<br />
<br />
{{TodoItem<br />
|Improve storage of deferred trigger queue<br />
|Right now all deferred trigger information is stored in backend memory. This could exhaust memory for very large trigger queues. This item involves spilling large queues to files, or processing the triggers in bulk, e.g. with a join or a bitmap. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00876.php <nowiki>Re: BUG #4204: COPY to table with FK has memory leak</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00464.php <nowiki>Scaling up deferred unique checks and the after trigger queue</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2011-08/msg00023.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow triggers to be disabled in only the current session.<br />
|This is currently possible by starting a multi-statement transaction, modifying the system tables, performing the desired SQL, restoring the system tables, and committing the transaction. ALTER TABLE ... TRIGGER requires a table lock so it is not ideal for this usage.}}<br />
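For comparison, the documented per-table approach, which takes the table lock mentioned above (table and trigger names are illustrative):<br />
```sql
BEGIN;
ALTER TABLE orders DISABLE TRIGGER audit_trg;  -- requires a table lock
-- ... bulk work with the trigger suppressed ...
ALTER TABLE orders ENABLE TRIGGER audit_trg;
COMMIT;
```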
<br />
{{TodoItem<br />
|With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY<br />
|If the dump is known to be valid, allow foreign keys to be added without revalidating the data.}}<br />
<br />
{{TodoItem<br />
|Allow statement-level triggers to access modified rows}}<br />
<br />
{{TodoItem<br />
|When statement-level triggers are defined on a parent table, have them fire only on the parent table, and fire child table triggers only where appropriate<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg01883.php <nowiki>Statement-level triggers and inheritance</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add event triggers<br />
}}<br />
<br />
{{TodoItem<br />
|Tighten trigger permission checks<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00564.php <nowiki>Security leak with trigger functions?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow BEFORE INSERT triggers on views<br />
* [http://archives.postgresql.org/pgsql-general/2007-02/msg01466.php <nowiki>Re: Why can't I put a BEFORE EACH ROW trigger on a view?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add database and transaction-level triggers<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00451.php <nowiki>Proposal for db level triggers</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00620.php <nowiki>triggers on prepare, commit, rollback... ?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce locking requirements for creating a trigger<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00635.php <nowiki>Re: Change lock requirements for adding a trigger</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid requirement for "AFTER" trigger functions to return a value<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02384.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow creation of inline triggers<br />
* http://archives.postgresql.org/pgsql-hackers/2012-02/msg00708.php<br />
}}<br />
<br />
== Inheritance ==<br />
<br />
{{TodoItem<br />
|Allow inherited tables to inherit indexes, UNIQUE constraints, and primary/foreign keys<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-05/msg00285.php <nowiki>Partitioning/inherited tables vs FKs</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00039.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00305.php<br />
}}<br />
<br />
{{TodoItem<br />
|Honor UNIQUE INDEX on base column in INSERTs/UPDATEs on inherited table, e.g. INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail<br />
|The main difficulty with this item is the problem of creating an index that can span multiple tables.}}<br />
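The behavior this item would change can be seen with (names are illustrative):<br />
```sql
CREATE TABLE parent (id int UNIQUE);
CREATE TABLE child () INHERITS (parent);

INSERT INTO parent VALUES (1);
INSERT INTO child  VALUES (1);  -- succeeds today: the unique index covers
                                -- only parent, so the duplicate is not caught
```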
<br />
{{TodoItem<br />
|Determine whether ALTER TABLE / SET SCHEMA should work on inheritance hierarchies (and thus support ONLY). If yes, implement it.}}<br />
<br />
{{TodoItem<br />
|ALTER TABLE variants sometimes support recursion and sometimes do not, but this is poorly documented or not documented at all, and the ONLY marker is then silently ignored. Clarify the documentation, and reject ONLY if it is not supported.}}<br />
<br />
== Indexes ==<br />
<br />
{{TodoItem<br />
|Prevent index uniqueness checks when UPDATE does not modify the column<br />
|Uniqueness (index) checks are done on an indexed column during UPDATE even if the UPDATE does not modify that column.<br />
However, HOT already short-circuits this in common cases, so more work might not be helpful.<br />
* http://www.postgresql.org/message-id/CA+TgmoZOyaTanfEvNUdiHBCuu9Zh0JVP1e_UTVbx6Rvj9vFC9Q@mail.gmail.com<br />
}}<br />
<br />
{{TodoItem<br />
|Allow the creation of on-disk bitmap indexes which can be quickly combined with other bitmap indexes<br />
|Such indexes could be more compact if there are only a few distinct values, and they can also be compressed. Keeping such indexes updated can be costly.<br />
* [http://archives.postgresql.org/pgsql-patches/2005-07/msg00512.php <nowiki>Re: Bitmap index AM</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg01107.php <nowiki>Bitmap index thoughts</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00265.php <nowiki>Stream bitmaps</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01214.php <nowiki>Re: Bitmapscan changes - Requesting further feedback</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00013.php <nowiki>Updated bitmap index patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00741.php <nowiki>Reviewing new index types (was Re: [PATCHES] Updated bitmap indexpatch)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01023.php <nowiki>Bitmap Indexes: request for feedback</nowiki>]<br />
* http://archives.postgresql.org/message-id/800923.27831.qm@web29010.mail.ird.yahoo.com <br />
}}<br />
<br />
{{TodoItem<br />
|Allow accurate statistics to be collected on indexes with more than one column or expression indexes, perhaps using per-index statistics<br />
* [http://archives.postgresql.org/pgsql-performance/2006-10/msg00222.php <nowiki>Re: Simple join optimized badly?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01131.php <nowiki>Stats for multi-column indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00741.php <nowiki>Cross-column statistics revisited</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg01431.php <nowiki>Multi-Dimensional Histograms</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00913.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg02179.php <br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00459.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02054.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg01731.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg00894.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-09/msg00679.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having a larger statistics target for indexed columns and expression indexes. <br />
}}<br />
<br />
{{TodoItem<br />
|Consider smaller indexes that record a range of values per heap page, rather than having one index entry for every heap row<br />
|This is useful if the heap is clustered by the indexed values. <br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00341.php <nowiki>Grouped Index Tuples</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg01264.php <nowiki>Grouped Index Tuples</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00465.php <nowiki>Grouped Index Tuples / Clustered Indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-03/msg00163.php <nowiki>Bitmapscan changes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00014.php <nowiki>Re: GIT patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00487.php <nowiki>Re: Index Tuple Compression Approach?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01589.php <nowiki>Re: Index AM change proposals, redux</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add REINDEX CONCURRENTLY, like CREATE INDEX CONCURRENTLY<br />
|This is difficult because you must upgrade to an exclusive table lock to replace the existing index file. CREATE INDEX CONCURRENTLY does not have this complication. This would allow index compaction without downtime. <br />
* [http://archives.postgresql.org/pgsql-performance/2007-08/msg00289.php <nowiki>Re: When/if to Reindex</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2012-09/msg00911.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-10/msg00128.php<br />
}}<br />
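Until such a command exists, the usual low-downtime workaround is a manual swap (index names are illustrative; this does not cover indexes backing constraints), which the TODO item would fold into a single command:<br />
```sql
-- Build a replacement index without blocking writes:
CREATE INDEX CONCURRENTLY idx_orders_new ON orders (customer_id);
-- Drop the bloated original (DROP INDEX CONCURRENTLY exists as of 9.2):
DROP INDEX CONCURRENTLY idx_orders_old;
ALTER INDEX idx_orders_new RENAME TO idx_orders_old;
```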
<br />
{{TodoItem<br />
|Allow multiple indexes to be created concurrently, ideally via a single heap scan<br />
|pg_restore allows parallel index builds, but it is done via subprocesses, and there is no SQL interface for this.<br />
CLUSTER could definitely benefit from this.<br />
* http://archives.postgresql.org/pgsql-performance/2011-04/msg00093.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider sorting entries before inserting into btree index<br />
* [http://archives.postgresql.org/pgsql-general/2008-01/msg01010.php <nowiki>Re: ATTN: Clodaldo was Performance problem. Could it be related to 8.3-beta4?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow creation of an index that can do comparisons to test if a value is between two column values<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00757.php <nowiki>Proposal: temporal extension &quot;period&quot; data type</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider using "effective_io_concurrency" for index scans<br />
|Currently only bitmap scans use this, which might be fine because most multi-row index scans use bitmap scans.<br />
* [http://www.postgresql.org/message-id/CAGTBQpZzf70n0PYJ%3DVQLd+jb3wJGo%3D2TXmY+SkJD6G_vjC5QNg@mail.gmail.com Prefetch index pages for B-Tree index scans]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix problem with btree page splits during checkpoints<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00052.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-09/msg00184.php<br />
}}<br />
<br />
{{TodoItem<br />
|[http://archives.postgresql.org/pgsql-hackers/2012-05/msg00669.php Support amgettuple() in GIN (useful for exclusion constraints)]<br />
}}<br />
<br />
{{TodoItem<br />
| Allow "loose" or "skip" scans on btree indexes in which the first column has low cardinality<br />
* http://archives.postgresql.org/pgsql-performance/2012-08/msg00159.php<br />
}}<br />
<br />
{{TodoItem<br />
| Make the planner's "special index operator" mechanism extensible<br />
* http://www.postgresql.org/message-id/27270.1364700924@sss.pgh.pa.us<br />
}}<br />
<br />
<br />
=== GiST ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add more GiST index support for geometric data types}}<br />
<br />
{{TodoItem<br />
|Allow GiST indexes to create certain complex index types, like digital trees (see Aoki)}}<br />
<br />
{{TodoItem<br />
|Fix performance issues in contrib/seg and contrib/cube GiST support<br />
* [http://archives.postgresql.org/message-id/alpine.DEB.2.00.0904161633160.4053@aragorn.flymine.org GiST index performance]<br />
* [http://archives.postgresql.org/message-id/alpine.DEB.2.00.0904221704470.22330@aragorn.flymine.org draft patch]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-05/msg00069.php <nowiki>Re: GiST index performance</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-06/msg00068.php <nowiki>GiST index performance</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|[http://archives.postgresql.org/message-id/4DC8D284-05CF-4E3D-9670-AC9A32C37A36@justatheory.com GiST index support for arrays]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Hash ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add UNIQUE capability to hash indexes}}<br />
<br />
{{TodoItem<br />
|Add hash WAL logging for crash recovery<br />
* http://archives.postgresql.org/pgsql-performance/2011-09/msg00196.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow multi-column hash indexes}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Sorting ==<br />
<br />
{{TodoItem<br />
|Consider whether duplicate keys should be sorted by block/offset<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00558.php <nowiki>Remove hacks for old bad qsort() implementations?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider being smarter about memory and external files used during sorts<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg01101.php <nowiki>Sorting Improvements for 8.4</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00045.php <nowiki>Re: Sorting Improvements for 8.4</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider detoasting keys before sorting}}<br />
<br />
{{TodoItem<br />
|Allow sorts to use more available memory<br />
* http://archives.postgresql.org/pgsql-hackers/2007-11/msg01026.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01123.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg01957.php<br />
}}<br />
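<br />
For context on what "memory used during sorts" means here, a minimal external merge sort in the style of tuplesort — hold at most a work_mem-style budget of values in memory, spill sorted runs to temporary files, merge at the end — can be sketched as follows (illustrative Python, not the server's implementation):<br />
<br />
```python
import heapq
import tempfile

def external_sort(values, work_mem=4):
    """Sort an iterable of ints while holding at most `work_mem` values
    in memory at once; overflow is written out as sorted runs on disk
    and merged at the end (roughly what tuplesort does when it spills)."""
    runs, buf = [], []
    for v in values:
        buf.append(v)
        if len(buf) >= work_mem:       # memory budget exhausted:
            buf.sort()                 # spill a sorted run to disk
            f = tempfile.TemporaryFile(mode="w+")
            f.writelines(f"{x}\n" for x in buf)
            f.seek(0)
            runs.append(f)
            buf = []
    buf.sort()                         # final in-memory run
    streams = [(int(line) for line in f) for f in runs] + [iter(buf)]
    return list(heapq.merge(*streams))

print(external_sort([5, 3, 8, 1, 9, 2, 7], work_mem=3))
# → [1, 2, 3, 5, 7, 8, 9]
```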
<br />
== Fsync ==<br />
<br />
{{TodoItem<br />
|Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options and whether fsync does anything<br />
|Ideally this requires a separate test program, like contrib/pg_test_fsync, that can be run at initdb time or optionally later.<br />
}}<br />
<br />
{{TodoItem<br />
|Consider sorting writes during checkpoint<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00541.php <nowiki>Sorted writes in checkpoint</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00050.php <nowiki>Re: Sorting writes during checkpoint</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg02012.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg00278.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-01/msg00493.php<br />
}}<br />
<br />
== Cache Usage ==<br />
<br />
{{TodoItem<br />
|Provide a way to calculate an &quot;estimated COUNT(*)&quot;<br />
|Perhaps by using the optimizer's cardinality estimates or random sampling.<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php <nowiki>Re: Improving count(*)</nowiki>]<br />
* http://wiki.postgresql.org/wiki/Slow_Counting<br />
}}<br />
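<br />
The optimizer-estimate approach mentioned above can be sketched as follows: take the tuple density recorded at the last VACUUM/ANALYZE (reltuples over relpages in pg_class) and scale it by the table's current size in pages. This Python function is a hypothetical illustration of that extrapolation, not an actual server function:<br />
<br />
```python
def estimated_row_count(reltuples, relpages, current_pages):
    """Planner-style row estimate: scale the tuple count recorded at
    the last VACUUM/ANALYZE by the table's current size in pages."""
    if relpages == 0:
        return 0
    density = reltuples / relpages     # tuples per page at last analyze
    return int(density * current_pages)

# table had 10,000 rows in 100 pages at last analyze; it has since
# grown to 120 pages
print(estimated_row_count(10_000, 100, 120))  # → 12000
```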
<br />
{{TodoItem<br />
|Consider automatic caching of statements at various levels:<br />
* Parsed query tree<br />
* Query execution plan<br />
* Query results <br />
<br />
:<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00823.php <nowiki>Cached Query Plans (was: global prepared statements)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider increasing internal areas (NUM_CLOG_BUFFERS) when shared buffers is increased<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-10/msg01419.php <nowiki>Re: slru.c race condition (was Re: TRAP: FailedAssertion(&quot;!((itemid)-&gt;lp_flags &amp; 0x01)&quot;,)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00030.php <nowiki>clog_buffers to 64 in 8.3?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2007-08/msg00024.php <nowiki>CLOG Patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider decreasing the amount of memory used by PrivateRefCount<br />
|<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg00797.php <nowiki>PrivateRefCount (for 8.3)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg00752.php <nowiki>Re: PrivateRefCount (for 8.3)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider allowing higher priority queries to have referenced buffer cache pages stay in memory longer<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00562.php <nowiki>Re: How to keep a table in memory?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve cache lookup speed for sessions accessing many relations<br />
* http://archives.postgresql.org/pgsql-hackers/2012-11/msg00356.php<br />
}}<br />
<br />
== Vacuum ==<br />
<br />
{{TodoItem<br />
|Auto-fill the free space map by scanning the buffer cache or by checking pages written by the background writer<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-02/msg01125.php <nowiki>Dead Space Map</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00011.php <nowiki>Re: Automatic free space map filling</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow concurrent inserts to use recently created pages rather than creating new ones<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg00853.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having single-page pruning update the visibility map<br />
* <nowiki>https://commitfest.postgresql.org/action/patch_view?id=75</nowiki><br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg02344.php <nowiki>Re: visibility maps and heap_prune</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve tracking of total relation tuple counts now that vacuum doesn't always scan the whole heap<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00531.php Partial vacuum versus pg_class.reltuples]<br />
}}<br />
<br />
{{TodoItem<br />
|Bias FSM towards returning free space near the beginning of the heap file, in hopes that empty pages at the end can be truncated by VACUUM<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg01124.php <nowiki>FSM search modes</nowiki>]<br />
}}<br />
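<br />
The proposed FSM search mode can be sketched as a first-fit scan from the start of the relation: always hand out the lowest-numbered page with enough free space, so that fully-empty pages accumulate at the end where VACUUM can truncate them. Purely illustrative Python:<br />
<br />
```python
def find_page(free_space_map, needed):
    """Return the lowest page number with at least `needed` bytes free,
    or None if the relation must be extended."""
    # free_space_map: list of free bytes per page, indexed by page number
    for pageno, free in enumerate(free_space_map):
        if free >= needed:
            return pageno
    return None

fsm = [10, 200, 50, 400]
print(find_page(fsm, 100))  # → 1, the earliest page with enough room
```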
<br />
{{TodoItem<br />
|Consider a more compact data representation for dead tuple locations within VACUUM<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00143.php <nowiki>Re: Have vacuum emit a warning when it runs out of maintenance_work_mem</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide more information in order to improve user-side estimates of dead space bloat in relations<br />
* [http://archives.postgresql.org/pgsql-general/2009-05/msg01039.php <nowiki>Re: Bloated Table</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve locking behaviour of vacuum during trailing page truncation<br />
* http://archives.postgresql.org/pgsql-bugs/2011-03/msg00319.php<br />
* http://archives.postgresql.org/message-id/4D8DF88E.7080205@Yahoo.com<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce the number of table scans performed by vacuum<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg01119.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-06/msg00605.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-07/msg00624.php<br />
}}<br />
<br />
{{TodoItem<br />
|Vacuum GIN indexes in physical order rather than logical order<br />
* http://archives.postgresql.org/pgsql-hackers/2012-04/msg00443.php<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid creation of the free space map for small tables<br />
* http://archives.postgresql.org/pgsql-hackers/2011-11/msg01751.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-08/msg00552.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-08/msg00615.php<br />
}}<br />
<br />
=== Auto-vacuum ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|Issue log message to suggest VACUUM FULL if a table is nearly empty?}}<br />
<br />
{{TodoItem<br />
|Prevent long-lived temporary tables from causing frozen-xid advancement starvation<br />
|The problem is that autovacuum cannot vacuum them to set frozen xids; only the session that created them can do that. <br />
* [http://archives.postgresql.org/pgsql-general/2007-06/msg01645.php <nowiki>Re: AutoVacuum Behaviour Question</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent autovacuum from running if an old transaction is still running from the last vacuum<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00899.php <nowiki>Re: Autovacuum and OldestXmin</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have autoanalyze of parent tables occur when child tables are modified<br />
* http://archives.postgresql.org/pgsql-performance/2010-06/msg00137.php<br />
* http://archives.postgresql.org/pgsql-performance/2010-10/msg00271.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow visibility map all-visible bits to be set even when an auto-ANALYZE is running<br />
* http://archives.postgresql.org/pgsql-hackers/2012-01/msg00356.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow parallel cores to be used by vacuumdb<br />
* [http://archives.postgresql.org/message-id/4F10A728.7090403@agliodbs.com vacuumdb -j]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve autovacuum tuning<br />
* http://www.postgresql.org/message-id/5078AD6B.8060802@agliodbs.com<br />
* http://www.postgresql.org/message-id/20130124215715.GE4528@alvh.no-ip.org<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Locking ==<br />
<br />
{{TodoItem<br />
|Fix priority ordering of read and write light-weight locks<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00893.php <nowiki>lwlocks and starvation</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00905.php <nowiki>Re: lwlocks and starvation</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix problem when multiple subtransactions of the same outer transaction hold different types of locks, and one subtransaction aborts<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg01011.php <nowiki>FOR SHARE vs FOR UPDATE locks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00001.php <nowiki>Re: FOR SHARE vs FOR UPDATE locks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00435.php <nowiki>Re: [PATCHES] [pgsql-patches] Phantom Command IDs, updated patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00773.php <nowiki>Re: savepoints and upgrading locks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow UPDATEs that modify only non-referential-integrity columns to avoid conflicting with referential integrity locks<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00073.php <nowiki>Referential Integrity and SHARE locks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add idle_in_transaction_timeout GUC so locks are not held for long periods of time}}<br />
<br />
{{TodoItem<br />
|Improve deadlock detection when a page cleaning lock conflicts with a shared buffer that is pinned<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-01/msg00138.php <nowiki>BUG #3883: Autovacuum deadlock with truncate?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00873.php <nowiki>Thoughts about bug #3883</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-committers/2008-01/msg00365.php <nowiki>Re: pgsql: Add checks to TRUNCATE, CLUSTER, and REINDEX to prevent</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Detect deadlocks involving LockBufferForCleanup()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00873.php <nowiki>Thoughts about bug #3883</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow finer control over who is cancelled in a deadlock<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01727.php<br />
}}<br />
<br />
{{TodoItemDone<br />
|Consider a lock timeout parameter<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00485.php <nowiki>SELECT ... FOR UPDATE [WAIT integer | NOWAIT] for 8.5</nowiki>]<br />
}}<br />
<br />
== Startup Time Improvements ==<br />
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for backend creation<br />
|This would avoid the overhead associated with process creation. Most operating systems have trivial process creation time compared to database startup overhead, but a few operating systems (Win32, Solaris) might benefit from threading. Also explore the idea of a single session using multiple threads to execute a statement faster.}}<br />
<br />
{{TodoItem<br />
|Allow backends to change their database without restart<br />
|This allows for faster server startup.<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00843.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00336.php<br />
}}<br />
<br />
== Write-Ahead Log ==<br />
<br />
{{TodoItem<br />
|Eliminate need to write full pages to WAL before page modification<br />
|Currently, to protect against partial disk page writes, we write full page images to WAL before they are modified so we can correct any partial page writes during recovery. These pages can also be eliminated from point-in-time archive files. <br />
* [http://archives.postgresql.org/pgsql-hackers/2002-06/msg00655.php <nowiki>Re: Index Scans become Seq Scans after VACUUM ANALYSE</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg01191.php<br />
* [http://archives.postgresql.org/message-id/20120105061916.GB21048@fetter.org WIP double writes]<br />
* [http://archives.postgresql.org/message-id/4EFC449F02000025000441CD@gw.wicourts.gov double writes]<br />
* [http://archives.postgresql.org/message-id/20120110214344.GB21106@fetter.org Double-write with Fast Checksums]<br />
* [http://archives.postgresql.org/message-id/1962493974.656458.1327703514780.JavaMail.root@zimbra-prod-mbox-4.vmware.com double writes using "double-write buffer" approach]<br />
* http://archives.postgresql.org/pgsql-hackers/2012-10/msg01463.php<br />
}}<br />
<br />
{{TodoItem<br />
|When full page writes are off, write CRC to WAL and check file system blocks on recovery<br />
|If CRC check fails during recovery, remember the page in case a later CRC for that page properly matches. The difficulty is that hint bits are not WAL logged, meaning a valid page might not match the earlier CRC.}}<br />
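<br />
The recovery-time check described above can be sketched as follows: WAL carries a CRC per touched page; during recovery each on-disk page is verified, and a mismatch is only provisional because a later CRC for the same page may match (hint-bit-only changes are not WAL-logged). Illustrative Python pseudocode, not the server's WAL format:<br />
<br />
```python
import zlib

def check_pages(wal_crcs, disk_pages):
    """Replay CRC records in WAL order; a page stays suspect only if
    no later CRC record for it matches the on-disk contents."""
    suspect = set()
    for pageno, expected in wal_crcs:
        actual = zlib.crc32(disk_pages[pageno])
        if actual != expected:
            suspect.add(pageno)        # provisional mismatch
        else:
            suspect.discard(pageno)    # a later CRC matched after all
    return suspect

pages = {0: b"hello", 1: b"world"}
wal = [(0, zlib.crc32(b"stale")),      # mismatch...
       (0, zlib.crc32(b"hello")),      # ...cleared by this later match
       (1, zlib.crc32(b"xxxxx"))]      # never matches
print(check_pages(wal, pages))         # → {1}
```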
<br />
{{TodoItem<br />
|Write full pages during file system write and not when the page is modified in the buffer cache<br />
|This allows most full page writes to happen in the background writer. It might cause problems for applying WAL on recovery into a partially-written page, but later the full page will be replaced from WAL.<br />
* [http://archives.postgresql.org/message-id/CAGvK12UST-tPhyLrSLuSpwFxZbAO79yYrhV2xaLmS2MkUxNUVQ@mail.gmail.com Page Checksums + Double Writes]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce WAL traffic so only modified values are written rather than entire rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01589.php <nowiki>Reduction in WAL for UPDATEs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow WAL information to recover corrupted pg_controldata<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php <nowiki>Re: [HACKERS] pg_resetxlog -r flag</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Find a way to reduce rotational delay when repeatedly writing last WAL page<br />
|Currently fsync of WAL requires the disk platter to perform a full rotation to fsync again. One idea is to write the WAL to different offsets that might reduce the rotational delay. <br />
* [http://archives.postgresql.org/pgsql-hackers/2002-11/msg00483.php <nowiki>500 tpsQL + WAL log implementation</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Speed WAL recovery by allowing more than one page to be prefetched<br />
|This should be done utilizing the same infrastructure used for prefetching in general to avoid introducing complex error-prone code in WAL replay. <br />
* [http://archives.postgresql.org/pgsql-general/2007-12/msg00683.php <nowiki>Slow PITR restore</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00497.php <nowiki>Re: [GENERAL] Slow PITR restore</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg01279.php <nowiki>Read-ahead and parallelism in redo recovery</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve WAL concurrency by increasing lock granularity<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00556.php <nowiki>Reworking WAL locking</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Be more aggressive about creating WAL files<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01325.php <nowiki>Re: PANIC caused by open_sync on Linux</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-07/msg01075.php <nowiki>PreallocXlogFiles</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-04/msg00556.php <nowiki>WAL/PITR additional items</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have resource managers report the duration of their status changes<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01468.php <nowiki>Recovery of Multi-stage WAL actions</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Move pgfoundry's xlogdump to /contrib and have it rely more closely on the WAL backend code<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00035.php <nowiki>xlogdump</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Close deleted WAL files held open in *nix by long-lived read-only backends<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg01754.php <nowiki>Deleted WAL files held open by backends in Linux</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00060.php <nowiki>Re: Deleted WAL files held open by backends in Linux</nowiki>]<br />
}}<br />
<br />
== Optimizer / Executor ==<br />
<br />
{{TodoItem<br />
|Improve selectivity functions for geometric operators}}<br />
<br />
{{TodoItem<br />
|Consider increasing the default values of from_collapse_limit, join_collapse_limit, and/or geqo_threshold<br />
* [http://archives.postgresql.org/message-id/4136ffa0905210551u22eeb31bn5655dbe7c9a3aed5@mail.gmail.com from_collapse_limit vs. geqo_threshold]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve ability to display optimizer analysis using OPTIMIZER_DEBUG<br />
* http://archives.postgresql.org/pgsql-hackers/2012-08/msg00597.php<br />
}}<br />
<br />
{{TodoItem<br />
|Log statements where the optimizer row estimates were dramatically different from the number of rows actually found?}}<br />
<br />
{{TodoItem<br />
|Consider compressed annealing to search for query plans<br />
|This might replace GEQO.<br />
* http://archives.postgresql.org/message-id/15658.1241278636%40sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Improve use of expression indexes for ORDER BY <br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01553.php <nowiki>Resjunk sort columns, Heikki's index-only quals patch, and bug #5000</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Modify the planner to better estimate caching effects<br />
* http://archives.postgresql.org/pgsql-performance/2010-11/msg00117.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow shared buffer cache contents to affect index cost computations<br />
* http://archives.postgresql.org/pgsql-hackers/2011-06/msg01140.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow the CTE (Common Table Expression) optimization fence to be optionally disabled<br />
* http://archives.postgresql.org/pgsql-hackers/2012-09/msg00700.php<br />
* http://archives.postgresql.org/pgsql-performance/2012-11/msg00161.php<br />
}}<br />
<br />
=== Hashing ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Consider using a hash for joining to a large IN (VALUES ...) list<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00450.php <nowiki>Planning large IN lists</nowiki>]<br />
}}<br />
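<br />
The idea above — for a large IN (VALUES ...) list, hash the values once and probe per row instead of testing each row against the list linearly — can be sketched like this (illustrative only):<br />
<br />
```python
def filter_with_hash(rows, in_list, key=lambda r: r):
    """Keep rows whose key appears in in_list, using a one-time hash
    build (O(len(in_list))) and an O(1) probe per row, instead of an
    O(len(in_list)) linear scan per row."""
    values = set(in_list)
    return [r for r in rows if key(r) in values]

rows = [1, 5, 7, 9, 12]
print(filter_with_hash(rows, [5, 9, 100]))  # → [5, 9]
```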
<br />
{{TodoItem<br />
|Allow single batch hash joins to preserve outer pathkeys<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg00806.php Re: Potential Join Performance Issue]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|"lazy" hash tables - look up only the tuples that are actually requested<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid building the same hash table more than once during the same query<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid hashing for distinct and then re-hashing for hash join<br />
* [http://archives.postgresql.org/message-id/4136ffa0902191346g62081081v8607f0b92c206f0a@mail.gmail.com Re: Fixing Grittner's planner issues]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Background Writer ==<br />
<br />
{{TodoItem<br />
|Consider having the background writer update the transaction status hint bits before writing out the page<br />
|Implementing this requires the background writer to have access to system catalogs and the transaction status log.}}<br />
<br />
{{TodoItem<br />
|Consider adding buffers the background writer finds reusable to the free list <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00781.php <nowiki>Background LRU Writer/free list</nowiki>]<br />
* [http://archives.postgresql.org/message-id/CA+U5nMKtvyDcV4zTr7bq7t6cA2nBfLxCJ8tQgVBnc5ddRPO+Bg@mail.gmail.com our buffer replacement strategy is kind of lame]<br />
}}<br />
<br />
{{TodoItem<br />
|Automatically tune bgwriter_delay based on activity rather than using a fixed interval<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00781.php <nowiki>Background LRU Writer/free list</nowiki>]<br />
* [http://archives.postgresql.org/message-id/CA+U5nMKtvyDcV4zTr7bq7t6cA2nBfLxCJ8tQgVBnc5ddRPO+Bg@mail.gmail.com our buffer replacement strategy is kind of lame]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider whether increasing BM_MAX_USAGE_COUNT improves performance<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg01007.php <nowiki>Bgwriter LRU cleaning: we've been going at this all wrong</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Test to see if calling PreallocXlogFiles() from the background writer will help with WAL segment creation latency<br />
* [http://archives.postgresql.org/pgsql-patches/2007-06/msg00340.php <nowiki>Re: Load Distributed Checkpoints, final patch</nowiki>]<br />
}}<br />
<br />
== Concurrent Use of Resources ==<br />
<br />
{{TodoItem<br />
|Do async I/O for faster random read-ahead of data<br />
|Async I/O allows multiple I/O requests to be sent to the disk with results coming back asynchronously.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00820.php <nowiki>Asynchronous I/O Support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2007-09/msg00255.php <nowiki>Re: random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00027.php <nowiki>There's random access and then there's random access</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-01/msg00170.php <nowiki>Bitmap index scan preread using posix_fadvise (Was: There's random access and then there's random access)</nowiki>]<br />
The above patch is already applied as of 8.4, but it still remains to figure out how to handle plain indexscans effectively.<br />
* [http://archives.postgresql.org//pgsql-hackers/2009-01/msg00806.php Problems with the patch submitted for posix_fadvise in index scans]<br />
}}<br />
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for better I/O utilization<br />
|This would allow a single query to make use of multiple I/O channels simultaneously. One idea is to create a background reader that can pre-fetch sequential and index scan pages needed by other backends. This could be expanded to allow concurrent reads from multiple devices in a partitioned table.<br />
* http://archives.postgresql.org/pgsql-performance/2011-02/msg00123.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-10/msg01139.php<br />
}}<br />
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for better CPU utilization<br />
|This would allow several CPUs to be used for a single query, such as for sorting or query execution.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00945.php <nowiki>Multi CPU Queries - Feedback and/or suggestions wanted!</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|SMP scalability improvements<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00439.php <nowiki>Straightforward changes for increased SMP scalability</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00206.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00361.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
}}<br />
<br />
== TOAST ==<br />
<br />
{{TodoItem<br />
|Allow user configuration of TOAST thresholds<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00213.php <nowiki>Re: Proposed adjustments in MaxTupleSize and toastthresholds</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00082.php <nowiki>pg_lzcompress strategy parameters</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce unnecessary cases of deTOASTing<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00895.php <nowiki>Re: [PATCHES] Eliminate more detoast copies for packed varlenas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce costs of repeat de-TOASTing of values<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg01096.php <nowiki>WIP patch: reducing overhead for repeat de-TOASTing</nowiki>]<br />
}}<br />
<br />
== Monitoring ==<br />
{{TodoItem<br />
|Expand pg_stat_activity for easier integration with monitoring tools<br />
* http://archives.postgresql.org/message-id/4DFA13A5.2060200@2ndQuadrant.com<br />
}}<br />
<br />
{{TodoItem<br />
|Add column to pg_stat_activity that shows the progress of long-running commands like CREATE INDEX and VACUUM<br />
* [http://archives.postgresql.org/pgsql-patches/2008-04/msg00203.php <nowiki>EXPLAIN progress info</nowiki>]<br />
* The CLUSTER/VACUUM FULL implementation would also be useful to track this way<br />
}}<br />
<br />
{{TodoItem<br />
|Have pg_stat_activity display query strings in the correct client encoding<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00131.php <nowiki>pg_stats queries versus per-database encodings</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Expose pg_controldata via an SQL interface<br />
|Helpful for monitoring replicated databases<br />
* http://archives.postgresql.org/message-id/4B901D73.8030003@agliodbs.com<br />
* [http://archives.postgresql.org/message-id/4B959D7A.6010907@joeconway.com initial patch]<br />
}}<br />
<br />
{{TodoItem<br />
| Add entry creation timestamp column to pg_stat_replication<br />
* http://archives.postgresql.org/pgsql-hackers/2011-08/msg00694.php<br />
}}<br />
<br />
{{TodoItem<br />
| Allow reporting of stalls due to wal_buffer wrap-around<br />
* http://archives.postgresql.org/pgsql-hackers/2012-02/msg00826.php<br />
}}<br />
<br />
{{TodoItem<br />
| Restructure pg_stat_database columns tup_returned and tup_fetched to return meaningful values<br />
* http://www.postgresql.org/message-id/20121012060345.GA29214@toroid.org<br />
}}<br />
<br />
== Miscellaneous Performance ==<br />
<br />
{{TodoItem<br />
|Use mmap() rather than SYSV for shared buffers?<br />
|This would remove the requirement for SYSV SHM but would introduce portability issues. Anonymous mmap (or mmap to /dev/zero) is required to prevent I/O overhead. We could also consider mmap() for writing WAL.<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00750.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00756.php<br />
}}<br />
<br />
{{TodoItem<br />
|Rather than mmap()-ing individual 8k pages, consider mmap()-ing entire files into a backend?<br />
|Doing I/O to large tables would consume a lot of address space or require frequent mapping/unmapping. Extending the file also causes mapping problems that might require mapping only individual pages, leading to thousands of mappings. Another problem is that there is no way to _prevent_ I/O to disk from the dirty shared buffers so changes could hit disk before WAL is written.<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01239.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider ways of storing rows more compactly on disk:<br />
* Reduce the row header size?<br />
* Consider reducing on-disk varlena length from four bytes to two because a heap row cannot be more than 64k in length}}<br />
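<br />
The varlena point above comes down to header arithmetic: a 2-byte length header is enough for any value that fits in a heap row (under 64k), versus the 4-byte header. A minimal sketch of the saving, using a hypothetical encoding rather than the actual varlena format:<br />
<br />
```python
import struct

def pack2(payload: bytes) -> bytes:
    """Length-prefixed value with a 2-byte header (max 64k payload)."""
    assert len(payload) < 1 << 16
    return struct.pack("<H", len(payload)) + payload

def pack4(payload: bytes) -> bytes:
    """Same value with a 4-byte length header."""
    return struct.pack("<I", len(payload)) + payload

v = b"hello"
print(len(pack4(v)) - len(pack2(v)))  # → 2 bytes saved per value
```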
<br />
{{TodoItem<br />
|Consider transaction start/end performance improvements<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00948.php <nowiki>Reducing Transaction Start/End Contention</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00361.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow configuration of backend priorities via the operating system<br />
|Though backend priorities make priority inversion during lock waits possible, research shows that this is not a huge problem.<br />
* [http://archives.postgresql.org/pgsql-general/2007-02/msg00493.php <nowiki>Priorities for users or queries?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider increasing the minimum allowed number of shared buffers<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-02/msg00157.php <nowiki>Re: [PATCH] Don't bail with legitimate -N/-B options</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider if CommandCounterIncrement() can avoid its AcceptInvalidationMessages() call<br />
* [http://archives.postgresql.org/pgsql-committers/2007-11/msg00585.php <nowiki>pgsql: Avoid incrementing the CommandCounter when</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider Cartesian joins when both relations are needed to form an indexscan qualification for a third relation<br />
* [http://archives.postgresql.org/pgsql-performance/2007-12/msg00090.php <nowiki>Re: TB-sized databases</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider not storing a NULL bitmap on disk if all the NULLs are trailing<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00624.php <nowiki>Proposal for Null Bitmap Optimization(for Trailing NULLs)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-12/msg00109.php <nowiki>Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)</nowiki>]<br />
}}<br />
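<br />
The optimization above can be sketched as follows: if every NULL in a row is trailing, the on-disk NULL bitmap can be omitted entirely and the reader infers NULL for any attribute past the stored ones. Illustrative Python, not the actual heap tuple format:<br />
<br />
```python
def store(row):
    """Strip trailing NULLs; emit a bitmap only if NULLs remain."""
    n = len(row)
    while n and row[n - 1] is None:
        n -= 1
    if all(v is not None for v in row[:n]):
        return row[:n], None           # no bitmap needed on disk
    return row, [v is not None for v in row]

def load(stored, bitmap, natts):
    """Rebuild the full row; missing trailing attributes are NULL."""
    if bitmap is None:
        return list(stored) + [None] * (natts - len(stored))
    return list(stored)

row = [1, "x", None, None]
stored, bitmap = store(row)
print(stored, bitmap)            # → [1, 'x'] None  (bitmap omitted)
print(load(stored, bitmap, 4))   # → [1, 'x', None, None]
```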
<br />
{{TodoItem<br />
|Sort large UPDATE/DELETEs so it is done in heap order<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg01119.php <nowiki>Possible future performance improvement: sort updates/deletes by ctid</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider decreasing the I/O caused by updating tuple hint bits<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00847.php <nowiki>Hint Bits and Write I/O</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00199.php <nowiki>Re: [HACKERS] Hint Bits and Write I/O</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00695.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00792.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg01063.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01408.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01453.php<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid the requirement of freezing pages that are infrequently modified <br />
|If all rows on a page are visible, it is possible to set a bit in the visibility map (once the visibility map is 100% reliable) and not need to freeze the page, avoiding a page rewrite<br />
* http://archives.postgresql.org/message-id/4BF701CF.2090205@agliodbs.com<br />
* http://archives.postgresql.org/pgsql-hackers/2010-06/msg00082.php<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid reading in b-tree pages when replaying vacuum records in hot standby mode<br />
* [http://archives.postgresql.org/message-id/1272571938.4161.14739.camel@ebony <nowiki>Hot Standby tuning for btree_xlog_vacuum()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Restructure truncation logic to be more resistant to failure<br />
|This also involves not writing dirty buffers for a truncated or dropped relation<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01032.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider adding logic to extend large tables by more than 8k at a time<br />
|This would reduce file system fragmentation<br />
* http://archives.postgresql.org/pgsql-bugs/2011-03/msg00337.php<br />
}}<br />
<br />
== Miscellaneous Other ==<br />
<br />
{{TodoItem<br />
|Deal with encoding issues for filenames in the server filesystem<br />
* {{MessageLink|20090413184335.39BE.52131E4D@oss.ntt.co.jp|a proposed patch here}}<br />
* {{MessageLink|8484.1244655656@sss.pgh.pa.us|some issues about it here}}<br />
* {{MessageLink|20100107103740.97A5.52131E4D@oss.ntt.co.jp|Windows-specific patch here}}<br />
}}<br />
<br />
{{TodoItem<br />
|Deal with encoding issues in the output of localeconv()<br />
* [http://archives.postgresql.org/message-id/40c6d9160904210658y590377cfw6dbbecb53d2b8be0@mail.gmail.com bug report]<br />
* [http://archives.postgresql.org/message-id/49EF8DA0.90008@tpf.co.jp draft patch]<br />
* [http://archives.postgresql.org/message-id/21710.1243620986@sss.pgh.pa.us review of patch]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide schema name and other fields available from SQL GET DIAGNOSTICS in error reports<br />
* [http://archives.postgresql.org/message-id/dcc563d10810211907n3c59a920ia9eb7cd2a6d5ea58@mail.gmail.com <nowiki>How to get schema name which violates fk constraint</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg00846.php <nowiki>patch - Report the schema along table name in a referential failure error message</nowiki>]<br />
* {{MessageLink|3191.1263306359@sss.pgh.pa.us|Re: NOT NULL violation and error-message}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg00213.php <nowiki>the case for machine-readable error fields</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
| Provide [http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html#LIBPQ-CONNECT-FALLBACK-APPLICATION-NAME fallback_application_name] in contrib/pgbench, oid2name, and dblink.<br />
* {{MessageLink|w2g9837222c1004070216u3bc46b3ahbddfdffdbfb46212@mail.gmail.com|fallback_application_name and pgbench}}<br />
}}<br />
<br />
{{TodoItem<br />
|Add 64-bit support to /contrib/pgbench<br />
* http://archives.postgresql.org/pgsql-hackers/2010-07/msg00153.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg00705.php<br />
}}<br />
<br />
== Source Code ==<br />
<br />
{{TodoItemEasy<br />
|Remove warnings created by -Wcast-align}}<br />
<br />
{{TodoItem<br />
|Move platform-specific ps status display info from ps_status.c to ports}}<br />
<br />
{{TodoItemDone<br />
|Add optional CRC checksum to heap and index pages<br />
|One difficulty is how to prevent hint bit changes from affecting the computed CRC checksum.<br />
* http://archives.postgresql.org/message-id/19934.1226601952%40sss.pgh.pa.us<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00002.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01028.php <nowiki>double-buffering page writes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00524.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01101.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00011.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00249.php<br />
* http://archives.postgresql.org/message-id/20111221215913.GA4536@fetter.org<br />
* http://archives.postgresql.org/message-id/CA+U5nMJzQyxcObkpNAf1SYTX-gO_Mom3O9JXHnGpxRo1kXJ7ww@mail.gmail.com<br />
* http://archives.postgresql.org/pgsql-hackers/2012-01/msg00128.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-01/msg00113.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-02/msg00172.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-03/msg00001.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-03/msg00188.php<br />
* http://www.postgresql.org/message-id/1352422901.31259.28.camel@sussancws0025<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a faster CRC32 algorithm<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01112.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow cross-compiling by generating the zic database on the target system}}<br />
<br />
{{TodoItem<br />
|Improve NLS maintenance of libpgport messages linked onto applications}}<br />
<br />
{{TodoItem<br />
|Use UTF8 encoding for NLS messages so all server encodings can read them properly}}<br />
<br />
{{TodoItem<br />
|Allow creation of universal binaries for Darwin<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00884.php <nowiki>Getting to universal binaries for Darwin</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider GnuTLS if OpenSSL license becomes a problem<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg00892.php<br />
* [http://archives.postgresql.org/pgsql-patches/2006-05/msg00040.php <nowiki>[PATCH] Add support for GnuTLS</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg01213.php <nowiki>TODO: GNU TLS</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider making NAMEDATALEN more configurable in future releases}}<br />
<br />
{{TodoItem<br />
|Research use of signals and sleep wake ups<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00003.php <nowiki>Restartable signals 'n all that</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow C++ code to more easily access backend code<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00302.php <nowiki>Mostly Harmless: Welcoming our C++ friends</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider simplifying how memory context resets handle child contexts<br />
* [http://archives.postgresql.org/pgsql-patches/2007-08/msg00067.php <nowiki>Re: Memory leak in nodeAgg</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Create three versions of libpgport to simplify client code<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00154.php <nowiki>8.4 TODO item: make src/port support libpq and ecpg directly</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve detection of shared memory segments being used by others by checking the SysV shared memory field 'nattch'<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00656.php <nowiki>postgresql in FreeBSD jails: proposal</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00673.php <nowiki>Re: postgresql in FreeBSD jails: proposal</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Implement the non-threaded Avahi service discovery protocol<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00939.php <nowiki>Re: [PATCHES] Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-02/msg00097.php <nowiki>Re: Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg01211.php <nowiki>Re: [PATCHES] Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-04/msg00001.php <nowiki>Re: [HACKERS] Avahi support for Postgresql</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce data row alignment requirements on some 64-bit systems<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00369.php <nowiki>[WIP] Reduce alignment requirements on 64-bit systems.</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Restructure TOAST internal storage format for greater flexibility<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00049.php <nowiki>Re: PG_PAGE_LAYOUT_VERSION 5 - time for change</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Add regression tests for pg_dump/restore<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01967.php <nowiki>"make install-check-pg_dump" target in src/regress</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Research different memory allocation methods for lists<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg01467.php <br />
}}<br />
<br />
{{TodoItem<br />
| Consider removing the attribute options cache<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg00039.php<br />
}}<br />
<br />
{{TodoItem<br />
| Restructure /contrib section<br />
* http://archives.postgresql.org/pgsql-hackers/2011-06/msg00705.php<br />
}}<br />
<br />
{{TodoItem<br />
| Consider adding explicit huge page support<br />
* http://archives.postgresql.org/pgsql-hackers/2012-07/msg00123.php<br />
}}<br />
<br />
=== /contrib/pg_upgrade ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Handle large object comments<br />
|This is difficult to do because the large object doesn't exist when --schema-only is loaded.<br />
}}<br />
<br />
{{TodoItem<br />
|Consider using pg_depend for checking object usage in version.c<br />
}}<br />
<br />
{{TodoItem<br />
|If reindex is necessary, allow it to be done in parallel with pg_dump custom format<br />
}}<br />
<br />
{{TodoItem<br />
|Migrate pg_statistic by dumping it out as a flat file, so analyze is not necessary<br />
|pg_class.oid is not preserved so schema.tablename must be used.<br />
* [http://archives.postgresql.org/message-id/CAAZKuFaWdLkK8eozSAooZBets9y_mfo2HS6urPAKXEPbd-JLCA@mail.gmail.com pg_upgrade and statistics]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve testing, perhaps using the buildfarm<br />
|The buildfarm has access to multiple versions of PostgreSQL.<br />
}}<br />
<br />
{{TodoItem<br />
|Create machine-readable output of pg_controldata<br />
|This would avoid parsing its output. The problem is we need pg_controldata output from both the old and new clusters so we would need to support both formats.<br />
}}<br />
<br />
{{TodoItem<br />
|Find cleaner way to start/stop dedicated servers for upgrades<br />
* http://archives.postgresql.org/pgsql-hackers/2012-08/msg00275.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a way to run pg_upgrade on standby servers<br />
* http://archives.postgresql.org/pgsql-hackers/2012-07/msg00453.php<br />
* http://archives.postgresql.org/pgsql-hackers/2012-09/msg00056.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Windows ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Remove configure.in check for link failure when cause is found}}<br />
<br />
{{TodoItem<br />
|Remove readdir() errno patch when runtime/mingwex/dirent.c rev 1.4 is released}}<br />
<br />
{{TodoItem<br />
|Allow psql to use readline once non-US code pages work with backslashes}}<br />
<br />
{{TodoItem<br />
|Fix problem with shared memory on the Win32 Terminal Server}}<br />
<br />
{{TodoItem<br />
|Improve signal handling<br />
* [http://archives.postgresql.org/pgsql-patches/2005-06/msg00027.php <nowiki>Simplify Win32 Signaling code</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Convert MSVC build system to remove most batch files<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00961.php <nowiki>MSVC build system</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support pgxs when using MSVC}}<br />
<br />
{{TodoItem<br />
|Fix MSVC NLS support, like for to_char()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00485.php <nowiki>NLS on MSVC strikes back!</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-02/msg00038.php <nowiki>Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Find a correct rint() substitute on Windows<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00808.php <nowiki>Minor bug in src/port/rint.c</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix global namespace issues when using multiple terminal server sessions<br />
* [http://archives.postgresql.org/message-id/48F3BFCC.8030107@dunslane.net problems with Windows global namespace]}}<br />
<br />
{{TodoItem<br />
|Change from the current autoconf/gmake build system to cmake<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01869.php <nowiki>About CMake (was Re: [COMMITTERS] pgsql: Append major version number and for libraries soname major)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve consistency of path separator usage<br />
* http://archives.postgresql.org/message-id/49C0BDC5.4010002@hagander.net<br />
}}<br />
<br />
{{TodoItem<br />
|Fix cross-compiling on Windows<br />
* http://archives.postgresql.org/pgsql-bugs/2010-10/msg00110.php<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce file statistics overhead on directory reads<br />
* http://www.postgresql.org/message-id/1338325561.82125.YahooMailNeo@web39304.mail.mud.yahoo.com<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Wire Protocol Changes ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow dynamic character set handling}}<br />
<br />
{{TodoItem<br />
|Let the client indicate character encoding of database names, user names, and passwords<br />
* http://www.postgresql.org/message-id/16160.1360540050@sss.pgh.pa.us}}<br />
<br />
{{TodoItem<br />
|Add decoded type, length, precision}}<br />
<br />
{{TodoItem<br />
|Mark result columns as known-not-null when possible<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01029.php <nowiki>Adding nullable indicator to Describe</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide more control over planner treatment of statements being prepared}}<br />
<br />
{{TodoItem<br />
|Use compression<br />
|If SSL is used, hopefully avoid the overhead of key negotiation and encryption<br />
* http://archives.postgresql.org/pgsql-hackers/2012-06/msg00793.php<br />
}}<br />
<br />
{{TodoItem<br />
|Update clients to use data types, typmod, schema.table.column names of result sets using new statement protocol}}<br />
<br />
{{TodoItem<br />
|Set protocol for wire format negotiation<br />
* [http://archives.postgresql.org/message-id/CACMqXCKkGrGXxQhjHCKCe0B8hn6sTt-1sdgHZOSGQMxrusOsQA@mail.gmail.com GUC_REPORT for protocol tunables]<br />
}}<br />
<br />
{{TodoItem<br />
|Make sure upgrading to a 4.1 protocol version will actually work smoothly<br />
* [http://archives.postgresql.org/message-id/28307.1318255008@sss.pgh.pa.us Re: libpq, PQdescribePrepared -> PQftype, PQfmod, no PQnullable]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Documentation ==<br />
<br />
{{TodoItemEasy <br />
| Add contrib functions to the index<br />
* Add the functions and GUCs in the contrib modules to [http://www.postgresql.org/docs/current/static/sql-createindex.html the documentation index]: [http://archives.postgresql.org/message-id/50A2E173.6030404@2ndQuadrant.com per list discussion]<br />
}}<br />
<br />
{{TodoItem<br />
|Convert single quotes to apostrophes in the PDF documentation<br />
* [http://archives.postgresql.org/pgsql-docs/2007-12/msg00059.php <nowiki>SGML docs and pdf single-quotes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide a manpage for postgresql.conf<br />
* {{messageLink|20080819194311.GH4428@alvh.no-ip.org|A smaller default postgresql.conf}}<br />
* {{messageLink|200808211910.37524.peter_e@gmx.net|A smaller default postgresql.conf}}<br />
}}<br />
<br />
{{TodoItem<br />
|Change the manpage-generating toolchain to use the new XML-based docbook2x tools<br />
* {{messageLink|200808211910.37524.peter_e@gmx.net|A smaller default postgresql.conf}}<br />
}}<br />
<br />
{{TodoItem<br />
|Consider changing documentation format from SGML to XML<br />
* [http://archives.postgresql.org/pgsql-docs/2006-12/msg00152.php <nowiki>Re: Authoring Tools WAS: Switching to XML</nowiki>]<br />
* http://archives.postgresql.org/pgsql-docs/2011-04/msg00020.php<br />
* http://wiki.postgresql.org/wiki/Switching_PostgreSQL_documentation_from_SGML_to_XML<br />
}}<br />
<br />
{{TodoItem<br />
|Document support for N<nowiki>' '</nowiki> national character string literals, if it matches the SQL standard<br />
* http://archives.postgresql.org/message-id/1275895438.1849.1.camel@fsopti579.F-Secure.com<br />
}}<br />
<br />
{{TodoItem<br />
|Add diagrams to the documentation<br />
* http://archives.postgresql.org/pgsql-docs/2010-07/msg00001.php<br />
}}<br />
<br />
== Exotic Features ==<br />
<br />
{{TodoItem<br />
|Add pre-parsing phase that converts non-ISO syntax to supported syntax<br />
|This could allow SQL written for other databases to run without modification.}}<br />
<br />
{{TodoItem<br />
|Allow plug-in modules to emulate features from other databases}}<br />
<br />
{{TodoItem<br />
|Add features of Oracle-style packages<br />
|A package would be a schema with session-local variables, public/private functions, and initialization functions. It is also possible to implement these capabilities in any schema and not use a separate &quot;packages&quot; syntax at all.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg00384.php <nowiki>proposal for PL packages for 8.3.</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider allowing control of upper/lower case folding of unquoted identifiers<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php <nowiki>Bringing PostgreSQL torwards the standard regarding case folding</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg01527.php <nowiki>Re: [SQL] Case Preservation disregarding case sensitivity?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00849.php <nowiki>TODO Item: Consider allowing control of upper/lower case folding of unquoted, identifiers</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00415.php <nowiki>Identifier case folding notes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add autonomous transactions<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php <nowiki>autonomous transactions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Give query progress indication<br />
* [[Query progress indication]]<br />
}}<br />
<br />
{{TodoItem<br />
|Rethink our type system<br />
* [[Rethinking datatypes]]<br />
}}<br />
<br />
== Features We Do ''Not'' Want ==<br />
<br />
The following features have been discussed ad nauseam on the PostgreSQL mailing lists, and the consensus has been that the project is not interested in them. As such, if you are going to bring them up as potential features, you will want to be familiar with all of the arguments against them that have been made over the years. If you decide to work on such features anyway, you should be aware that you face a higher-than-normal barrier to getting the project to accept them.<br />
<br />
{{TodoItem<br />
|All backends running as threads in a single process (not wanted)<br />
|This eliminates the process protection we get from the current setup. Thread creation is usually the same overhead as process creation on modern systems, so it seems unwise to use a pure threaded model, and MySQL and DB2 have demonstrated that threads introduce as many issues as they solve. Threading specific operations such as I/O, seq scans, and connection management has been discussed and will probably be implemented to enable specific performance features. Moving to a threaded engine would also require halting all other work on PostgreSQL for one to two years.}}<br />
<br />
{{TodoItem<br />
|"Oracle-style" optimizer hints (not wanted)<br />
|Optimizer hints, as implemented in Oracle and other RDBMSes, are used to work around problems in the optimizer and introduce upgrade and maintenance issues. We would rather have such problems reported and fixed. We have discussed a more sophisticated system of per-class cost adjustment instead, but a specification remains to be developed. See [[OptimizerHintsDiscussion|Optimizer Hints Discussion]] for further information.}}<br />
<br />
{{TodoItem<br />
|Embedded server (not wanted)<br />
|While PostgreSQL clients run fine in limited-resource environments, the server requires multiple processes and a stable pool of resources to run reliably and efficiently. Stripping down the PostgreSQL server to run in the same process address space as the client application would add too much complexity and too many failure modes. Besides, several very mature embedded SQL databases are already available.<br />
<br />
{{TodoItem<br />
|Obfuscated function source code (not wanted)<br />
|Obfuscating function source code has minimal protective benefits because anyone with super-user access can find a way to view the code. At the same time, it would greatly complicate backups and other administrative tasks. To prevent non-super-users from viewing function source code, remove SELECT permission on pg_proc.<br />
* [http://archives.postgresql.org/pgsql-general/2008-09/msg00668.php <nowiki>Obfuscated stored procedures (was Re: Oracle and Postgresql)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Indeterminate behavior for the GROUP BY clause (not wanted)<br />
|At least one other database product allows a query to name result columns that are not covered by the GROUP BY clause; the server is then free to return any value from within each group, so the results are not predictable. This is not viewed as a desirable feature. PostgreSQL 9.1 allows result columns that are not referenced by GROUP BY if a primary key of the same table is referenced in GROUP BY.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-03/msg00297.php <nowiki>Re: SQL compatibility reminder: MySQL vs PostgreSQL</nowiki>]<br />
}}<br />
<br />
</div><br />
<br />
[[Category:Todo]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Streaming_Replication&diff=18798Streaming Replication2013-01-05T17:44:55Z<p>Schmiddy: SPAM undo revision 18716 by User:Davidbooker2012</p>
<hr />
<div>'''Streaming Replication''' (SR) provides the capability to continuously ship and<br />
apply the [http://www.postgresql.org/docs/current/static/wal.html WAL XLOG] records to some number of standby servers in order to keep them current.<br />
<br />
This feature was added to PostgreSQL 9.0. The discussion below is a developer-oriented one that contains some out-of-date information. Users of this feature should consult the documentation for the feature or a setup tutorial instead:<br />
<br />
* [[Binary Replication Tutorial]] provides an introduction to using this replication feature.<br />
* [http://www.postgresql.org/docs/9.1/static/warm-standby.html 9.1 Replication Documentation]<br />
* [http://www.postgresql.org/docs/9.0/static/warm-standby.html 9.0 Replication Documentation]<br />
<br />
= Project =<br />
SR was developed for inclusion in PostgreSQL 9.0 by NTT OSS Center. The lead developer is [mailto:masao.fujii@gmail.com Masao Fujii]. [http://www.pgcon.org/2008/schedule/events/76.en.html Synchronous Log Shipping Replication Presentation] introduces the early design of the feature.<br />
<br />
= Usage =<br />
== Users Overview ==<br />
* '''Log-shipping'''<br />
** XLOG records generated in the primary are periodically shipped to the standby via the network.<br />
** In the existing warm standby, only records in a filled file are shipped, which is referred to as file-based log-shipping. In SR, XLOG records in a partially-filled XLOG file are shipped too, implementing record-based log-shipping. This means the window for data loss in SR is usually smaller than in warm standby, unless the warm standby was also configured for record-based shipping (which is complicated to set up).<br />
** The content of the XLOG files written to the standby is exactly the same as on the primary. The shipped XLOG files can be used for a normal recovery and PITR.<br />
* '''Multiple standbys'''<br />
** More than one standby can establish a connection to the primary for SR. XLOG records are concurrently shipped to all these standbys. The delay/death of a standby does not harm log-shipping to other standbys.<br />
** The maximum number of standbys can be specified as a GUC variable.<br />
* '''Continuous recovery'''<br />
** The standby continuously replays XLOG records shipped without using pg_standby.<br />
** Shipped XLOG records are replayed as soon as possible, without waiting until the XLOG file has been filled. The combination of [[Hot Standby]] and SR makes the latest data inserted into the primary visible on the standby almost immediately.<br />
** The standby periodically removes old XLOG files which are no longer needed for recovery, to prevent excessive disk usage.<br />
* '''Setup'''<br />
** The start of log-shipping does not interfere with any query processing on the primary.<br />
** The standby can be started in various conditions.<br />
*** If there are XLOG files in the archive directory and restore_command is supplied, those files are replayed first. Then the standby requests from the primary the XLOG records following the last applied one. This prevents XLOG files already present on the standby from being shipped again. Similarly, XLOG files in pg_xlog are also replayed before starting log-shipping.<br />
*** If there are no XLOG files on the standby, the standby requests the XLOG records following the starting XLOG location of recovery (the redo starting location).<br />
* '''Connection settings and authentication'''<br />
** A user can configure the same settings as a normal connection to a connection for SR (e.g., keepalive, pg_hba.conf).<br />
* '''Activation'''<br />
** The standby can keep waiting for activation as long as a user likes. This prevents the standby from being automatically brought up by failure of recovery or network outage.<br />
* '''Progress report'''<br />
** The primary and standby report the progress of log-shipping in PS display.<br />
* '''Graceful shutdown'''<br />
** When smart/fast shutdown is requested, the primary waits to exit until XLOG records have been sent to the standby, up to the shutdown checkpoint record.<br />
<br />
== Restrictions ==<br />
* '''Synchronous log-shipping'''<br />
** By default, SR operates in an asynchronous manner, so the commit command might return "success" to a client before the corresponding XLOG records are shipped to the standby. To enable synchronous replication, see [http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION Synchronous Replication]<br />
* '''Replication beyond timeline'''<br />
** A user has to get a fresh backup whenever making the old standby catch up.<br />
* '''Clustering'''<br />
** Postgres doesn't provide any clustering feature.<br />
<br />
== How to Use ==<br />
* '''1.''' Install postgres in the primary and standby server as usual. This requires only ''configure'', ''make'' and ''make install''.<br />
* '''2.''' Create the initial database cluster in the primary server as usual, using ''initdb''.<br />
* '''3.''' Set up connections and authentication so that the standby server can successfully connect to the ''replication'' pseudo-database on the primary.<br />
$ $EDITOR postgresql.conf<br />
<br />
listen_addresses = '192.168.0.10'<br />
<br />
$ $EDITOR pg_hba.conf<br />
<br />
# The standby server must have superuser access privileges.<br />
host replication postgres 192.168.0.20/22 trust<br />
* '''4.''' Set up the streaming replication related parameters on the primary server.<br />
$ $EDITOR postgresql.conf<br />
<br />
# To enable read-only queries on a standby server, wal_level must be set to<br />
# "hot_standby". But you can choose "archive" if you never connect to the<br />
# server in standby mode.<br />
wal_level = hot_standby<br />
<br />
# Set the maximum number of concurrent connections from the standby servers.<br />
max_wal_senders = 5<br />
<br />
# To prevent the primary server from removing the WAL segments required for<br />
# the standby server before shipping them, set the minimum number of segments<br />
# retained in the pg_xlog directory. At least wal_keep_segments should be<br />
# larger than the number of segments generated between the beginning of<br />
# online-backup and the startup of streaming replication. If you enable WAL<br />
# archiving to an archive directory accessible from the standby, this may<br />
# not be necessary.<br />
wal_keep_segments = 32<br />
<br />
# Enable WAL archiving on the primary to an archive directory accessible from<br />
# the standby. If wal_keep_segments is a high enough number to retain the WAL<br />
# segments required for the standby server, this is not necessary.<br />
archive_mode = on<br />
archive_command = 'cp %p /path_to/archive/%f'<br />
* '''5.''' Start postgres on the primary server.<br />
* '''6.''' Make a base backup by copying the primary server's data directory to the standby server.<br />
$ psql -c "SELECT pg_start_backup('label', true)"<br />
$ rsync -a ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid<br />
$ psql -c "SELECT pg_stop_backup()"<br />
* '''7.''' Set up replication-related parameters, connections and authentication in the standby server like the primary, so that the standby might work as a primary after failover.<br />
* '''8.''' Enable read-only queries on the standby server. But if wal_level is ''archive'' on the primary, leave hot_standby unchanged (i.e., off).<br />
$ $EDITOR postgresql.conf<br />
<br />
hot_standby = on<br />
* '''9.''' Create a recovery command file in the standby server; the following parameters are required for streaming replication.<br />
$ $EDITOR recovery.conf<br />
# Note that recovery.conf must be in $PGDATA directory.<br />
<br />
# Specifies whether to start the server as a standby. In streaming replication,<br />
# this parameter must be set to on.<br />
standby_mode = 'on'<br />
<br />
# Specifies a connection string which is used for the standby server to connect<br />
# with the primary.<br />
primary_conninfo = 'host=192.168.0.10 port=5432 user=postgres'<br />
<br />
# Specifies a trigger file whose presence should cause streaming replication to<br />
# end (i.e., failover).<br />
trigger_file = '/path_to/trigger'<br />
<br />
# Specifies a command to load archive segments from the WAL archive. If<br />
# wal_keep_segments is a high enough number to retain the WAL segments<br />
# required for the standby server, this may not be necessary. But<br />
# a large workload can cause segments to be recycled before the standby<br />
# is fully synchronized, requiring you to start again from a new base backup.<br />
restore_command = 'cp /path_to/archive/%f "%p"'<br />
* '''10.''' Start postgres in the standby server. It will start streaming replication.<br />
* '''11.''' You can calculate the replication lag by comparing the current WAL write location on the primary with the last WAL location received/replayed by the standby. They can be retrieved using ''pg_current_xlog_location'' on the primary and ''pg_last_xlog_receive_location''/''pg_last_xlog_replay_location'' on the standby, respectively.<br />
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)<br />
pg_current_xlog_location <br />
--------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_receive_location <br />
-------------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_replay_location <br />
------------------------------<br />
0/2000000<br />
(1 row)<br />
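The byte lag between two of these LSNs can also be computed by hand. The sketch below assumes the pre-9.3 WAL layout, in which each 4GB "logical log" skips its final 16MB segment (hence the 0xFF000000 multiplier); on PostgreSQL 9.2 and later the server-side function ''pg_xlog_location_diff'' can do this for you.<br />

```shell
#!/bin/bash
# Convert an LSN such as "0/2000000" into an absolute byte position.
# Assumes the pre-9.3 WAL layout: each logical log covers 0xFF000000
# bytes, because the last 16MB segment of every 4GB file is skipped.
lsn_to_bytes() {
    logid=${1%%/*}
    offset=${1##*/}
    echo $(( 0x$logid * 0xFF000000 + 0x$offset ))
}

# Byte lag between two LSNs (primary location minus standby location).
lsn_lag() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

lsn_lag 0/3000020 0/2000000    # prints 16777248
```

Feed it ''pg_current_xlog_location'' from the primary and ''pg_last_xlog_replay_location'' from the standby to get the replay lag in bytes.<br />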
* '''12.''' You can also check the progress of streaming replication by using ''ps'' command.<br />
# The displayed LSNs indicate the byte position that the standby server has<br />
# written up to in the xlogs.<br />
[primary] $ ps -ef | grep sender<br />
postgres 6879 6831 0 10:31 ? 00:00:00 postgres: wal sender process postgres 127.0.0.1(44663) streaming 0/2000000<br />
<br />
[standby] $ ps -ef | grep receiver<br />
postgres 6878 6872 1 10:31 ? 00:00:01 postgres: wal receiver process streaming 0/2000000<br />
* How to do failover<br />
** Create the trigger file in the standby after the primary fails.<br />
* How to stop the primary or the standby server<br />
** Shut it down as usual (''pg_ctl stop'').<br />
* How to restart streaming replication after failover<br />
** Repeat the operations from the '''6th''' step: take a fresh backup, adjust the configuration, and start the original primary as the standby. The new primary doesn't need to be stopped during these operations.<br />
* How to restart streaming replication after the standby fails<br />
** Restart postgres on the standby server after eliminating the cause of failure.<br />
* How to disconnect the standby from the primary<br />
** Create the trigger file in the standby while the primary is running. The standby will then exit recovery and come up as a stand-alone server.<br />
* How to re-synchronize the stand-alone standby after isolation<br />
** Shut down the standby as usual, and repeat the operations from the '''6th''' step.<br />
* If you have more than one slave, promoting one will break the other(s). Update their recovery.conf settings to point to the new master, set recovery_target_timeline to 'latest', scp/rsync the pg_xlog directory, and restart the slave.<br />
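The re-pointing described in the last point might look like the following recovery.conf fragment on each remaining slave (the address is an example; substitute the newly promoted master's):<br />

```
# recovery.conf on a remaining slave, after another slave was promoted
standby_mode = 'on'
# point at the newly promoted master
primary_conninfo = 'host=192.168.0.20 port=5432 user=postgres'
# follow the new timeline created at promotion
recovery_target_timeline = 'latest'
```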
<br />
= Todo =<br />
== v9.0 ==<br />
<br />
Moved to [[PostgreSQL_9.0_Open_Items]]<br />
<br />
=== Committed ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01455.php Retrying from archive and some refactoring around Read/FetchRecord().] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00395.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg02601.php SR wrongly treats the WAL-boundary.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00396.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01715.php Adjust SR for some later changes about wal-skipping.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00399.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00024.php VACUUM FULL unexpectedly writes an XLOG UNLOGGED record.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00038.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01754.php Add a message type header.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00037.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php Documentation: Add a new "Replication" chapter.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00115.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00350.php Failed assertion during recovery of partial WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00124.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00712.php A PANIC error might occur in the standby because of a partially-filled archived WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00137.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00330.php Improve the standby messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00140.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php pq_getbyte_if_available() is not working because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00198.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01488.php Walsender might emit unfit messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00239.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01236.php Streaming replication on win32, still broken.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00270.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00992.php Create new section for recovery.conf.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00295.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01824.php Assertion failure in walreceiver.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00356.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01717.php Forbid a startup of walsender during recovery, and emit a suitable message? Or allow walsender to be started also during recovery?] - [http://archives.postgresql.org/message-id/20100316090955.9A5107541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php How do we clean down the archive without using pg_standby?] - [http://archives.postgresql.org/message-id/20100318091718.BC14D7541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01510.php File-based log shipping without pg_standby doesn't replay the WAL files in pg_xlog.] - [http://archives.postgresql.org/pgsql-committers/2010-03/msg00356.php commit]<br />
<br />
== v9.1 ==<br />
=== Synchronization capability ===<br />
* Introduce the replication mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'' and ''fsync''.<br />
** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
** ''recv'' or ''fsync'' makes transaction commit wait for XLOG to be received or fsynced by the standby, respectively.<br />
** (''apply'' makes transaction commit wait for XLOG to be replayed by the standby. This mode will be supported in v9.2 or later)<br />
** The replication mode is specified in recovery.conf of the standby as well as other parameters for replication.<br />
*** The startup process reads the replication mode from recovery.conf and shares it with walreceiver via a new shared-memory variable.<br />
*** Walreceiver also shares it with walsender by using the replication handshake message (the existing protocol needs to be extended).<br />
** Based on the replication mode, walreceiver sends a reply to the primary indicating that replication is done up to the specified location.<br />
*** In async, walreceiver doesn't need to send any reply other than end-of-replication message.<br />
*** In recv or fsync, walreceiver sends the reply just after receiving or flushing XLOG, respectively.<br />
*** New message type for the reply needs to be defined. The reply is sent as CopyData message.<br />
** Walreceiver writes all the outstanding XLOG to disk before shutting down.<br />
** Walsender receives the reply from the standby, updates the location of the last record replicated, and announces completion of replication.<br />
*** New shared-memory variable to keep that location is required.<br />
** When processing the commit command, backend waits for XLOG to be replicated to only the standbys which are in the recv or fsync replication mode.<br />
*** Also smart shutdown waits for XLOG of shutdown checkpoint to be replicated.<br />
* Required optimization<br />
** Walsender should send outstanding XLOG without waiting for wal_sender_delay.<br />
*** When processing the commit command, backend signals walsender to send outstanding XLOG immediately.<br />
** Backend should exit the wait loop as soon as the reply arrives at the primary.<br />
*** When receiving the reply, walsender signals backends to wake from sleep and determine whether to exit the wait loop by checking the location of the last XLOG replicated.<br />
*** Only backends waiting for XLOG to be replicated up to the location contained in the reply are sent the signal.<br />
** Walsender waits for the signal from backends and the reply from the standby at the same time, by using select/poll.<br />
** Walsender reads XLOG from not only disk but also shared memory (wal buffers).<br />
** Walreceiver should flush XLOG file only when XLOG file is switched or the related page is flushed.<br />
*** When startup process or bgwriter flushes the buffer page, it checks whether the related XLOG has already been flushed via shared memory (location of the last XLOG flushed).<br />
*** It flushes the buffer page, if XLOG file has already been flushed.<br />
*** It signals walreceiver to flush XLOG file immediately and waits for the flush to complete, if XLOG file has not been flushed yet.<br />
** While the standby is catching up with the primary, those servers should ignore the replication mode and perform asynchronous replication.<br />
*** After those servers have almost gotten into synchronization, they perform replication based on the specified replication mode.<br />
*** New replication states like 'catching-up', 'sync', etc need to be defined, and the state machine for them is required on both servers.<br />
*** Current replication state can be monitored on both servers via SQL.<br />
* Required timeout<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG is replicated to the standby.<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
<br />
== Future release ==<br />
* '''Synchronization capability'''<br />
** Introduce the synchronization mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'', ''fsync'' and ''apply''.<br />
*** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
*** ''recv'', ''fsync'' and ''apply'' make transaction commit wait for XLOG records to be received, fsynced and applied on the standby, respectively.<br />
** Change walsender to be able to read XLOG from not only the disk but also shared memory.<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG records are replicated to the standby.<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
* '''Monitoring'''<br />
** Provide the capability to check the progress and gap of streaming replication via one query. A collaboration of HS and SR is necessary to provide that capability on the standby side.<br />
** Provide the capability to check if the specified replication is in progress via a query. More detailed status information might also be necessary, e.g., whether the standby is still catching up, has already gotten into sync, and so on.<br />
** Change the stats collector to collect the statistics information about replication, e.g., average delay of replication time.<br />
** Develop the tool to calculate the latest XLOG position from XLOG files. This is necessary to check the gap of replication after the server fails.<br />
** Also develop the tool to extract the user-readable contents from XLOG files. This is necessary to see the contents of the gap, and manually restore them.<br />
* '''Easy to Use'''<br />
** Introduce the parameters like:<br />
*** replication_halt_timeout - replication will halt if no data has been sent for this much time.<br />
*** replication_halt_segments - replication will halt if the number of WAL files in pg_xlog exceeds this threshold.<br />
*** These parameters allow us to avoid disk overflow.<br />
** Add new feature which transfers also base backup via the direct connection between the primary and the standby.<br />
** Add new hooks like walsender_hook and walreceiver_hook to cooperate with the add-on program for compression like pglesslog.<br />
** Provide a graceful termination of replication via a query on the primary. On the standby, a trigger file mechanism already provides that capability.<br />
** Support replication beyond timeline. The timeline history files need to be shipped from the primary to the standby.<br />
* '''Robustness'''<br />
** Support keepalive in libpq. This is useful for a client and the standby to detect a failure of the primary immediately.<br />
** [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php New privilege for replication.]<br />
*** Currently superuser privilege is required when the standby connects to the primary. But there is a complaint that we should add a new privilege for replication and use it instead of superuser, because the current approach is not good for security.<br />
* '''Miscellaneous'''<br />
** Standalone walreceiver tool, which connects to the primary, continuously receives and writes XLOG records, independently from postgres server.<br />
** Cascade streaming replication. Allow walsender to send XLOG to another standby during recovery.<br />
** WAL archiving during recovery.<br />
<br />
[[Category:Replication]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Synchronous_replication&diff=18797Synchronous replication2013-01-05T17:37:03Z<p>Schmiddy: remove doubled title</p>
<hr />
<div>Synchronous replication is available starting in PostgreSQL 9.1 by enabling the [http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html#GUC-SYNCHRONOUS-STANDBY-NAMES synchronous_standby_names] parameter. It includes user-controlled durability specified on the master using the [http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT synchronous_commit] parameter. The design also provides high throughput by allowing concurrent processes to handle the WAL stream. <br />
<br />
== Design Notes ==<br />
See also [[Synchronous Replication 9/2010 Proposal]], though those notes pertain to a patch different than what has been committed.<br />
<br />
[[Category:Replication]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Synchronous_replication&diff=18796Synchronous replication2013-01-05T17:36:18Z<p>Schmiddy: Remove the notes describing Simon's 9/2010 patch, and put a link to the wiki page where they have been moved.</p>
<hr />
<div>= Synchronous Replication =<br />
Synchronous replication is available starting in PostgreSQL 9.1 by enabling the [http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html#GUC-SYNCHRONOUS-STANDBY-NAMES synchronous_standby_names] parameter. It includes user-controlled durability specified on the master using the [http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT synchronous_commit] parameter. The design also provides high throughput by allowing concurrent processes to handle the WAL stream. <br />
<br />
== Design Notes ==<br />
See also [[Synchronous Replication 9/2010 Proposal]], though those notes pertain to a patch different than what has been committed.<br />
<br />
[[Category:Replication]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Synchronous_Replication_9/2010_Proposal&diff=18795Synchronous Replication 9/2010 Proposal2013-01-05T17:21:21Z<p>Schmiddy: Import from Synchronous Replication</p>
<hr />
<div>= PAGE STATUS =<br />
This page serves as documentation for a Synchronous Replication [http://archives.postgresql.org/message-id/1284486530.1952.3976.camel@ebony patch] posted in September 2010 by Simon Riggs. Note, this proposal is somewhat different than the version which ended up being committed: see [[Synchronous replication]] for more details.<br />
<br />
=WHAT'S DIFFERENT ABOUT THIS PATCH?=<br />
<br />
The implementation in 9.1 includes several innovations, beyond [http://wiki.postgresql.org/wiki/Streaming_Replication Fujii Masao's work] providing an earlier synchronous replication implementation for PostgreSQL 9.0:<br />
<br />
* Low complexity of code on Standby<br />
* User control: All decisions to wait take place on master, allowing fine-grained control of synchronous replication. Max replication level can also be set on the standby.<br />
* Low bandwidth: A very small response packet size, with no increase in the number of responses when the system is under high load, means very little additional bandwidth is required<br />
* Performance: Standby processes work concurrently to give good overall throughput on standby and minimal latency in all modes. 4 performance options don't interfere with each other, so offer different levels of performance/durability alongside each other.<br />
<br />
These are major wins for PostgreSQL project over and above the basic sync rep feature.<br />
<br />
=SYNCHRONOUS REPLICATION OVERVIEW=<br />
<br />
Synchronous replication offers the guarantee that all changes made by a<br />
transaction have been transferred to remote standby nodes. This is an<br />
extension to the standard level of durability offered by a transaction<br />
commit.<br />
<br />
When synchronous replication is requested the transaction will wait<br />
after it commits until it receives confirmation that the transfer has<br />
been successful. Waiting for confirmation increases the user's certainty<br />
that the transfer has taken place but it also necessarily increases the<br />
response time for the requesting transaction. Synchronous replication<br />
usually requires carefully planned and placed standby servers to ensure<br />
applications perform acceptably. Waiting doesn't utilise system<br />
resources, but transaction locks continue to be held until the transfer<br />
is confirmed. As a result, incautious use of synchronous replication<br />
will lead to reduced performance for database applications.<br />
<br />
It may seem that there is a simple choice between durability and<br />
performance. However, there is often a close relationship between the<br />
importance of data and how busy the database needs to be, so this is<br />
seldom a simple choice. With this patch, PostgreSQL now provides a range<br />
of features designed to allow application architects to design a system<br />
that has both good overall performance and yet good durability of the<br />
most important data assets.<br />
<br />
PostgreSQL allows the application designer to specify the durability<br />
level required via replication. This can be specified for the system<br />
overall, though it can also be specified for individual transactions.<br />
This makes it possible to selectively provide the highest levels of protection for<br />
critical data. <br />
<br />
For example, an application might consist of two types of work:<br />
* 10% of changes are changes to important customer details<br />
* 90% of changes are less important data that the business can more easily survive if it is lost, such as chat messages between users.<br />
<br />
With sync replication options specified at the application level (on the<br />
master) we can offer sync rep for the most important changes, without<br />
slowing down the bulk of the total workload. Application level options<br />
are an important and practical tool for allowing the benefits of<br />
synchronous replication for high performance applications.<br />
<br />
Without sync rep options specified at the application level, we would<br />
have to choose between slowing down 90% of the workload because 10% of<br />
it is important, giving up our durability goals for the sake of<br />
performance, or splitting those two functions onto separate database<br />
servers so that we can set options differently on each. None of those<br />
three options is truly attractive.<br />
<br />
PostgreSQL also allows the system administrator the ability to specify<br />
the service levels offered by standby servers. This allows multiple<br />
standby servers to work together in various roles within a server farm.<br />
<br />
''Note: the information about the parameters used here reflects an earlier version of this feature, and needs to be updated to reflect the form in which it was committed to 9.1''<br />
<br />
Control of this feature relies on just 3 parameters:<br />
On the master we can set<br />
<br />
* synchronous_replication<br />
* synchronous_replication_timeout<br />
<br />
On the standby we can set<br />
<br />
* synchronous_replication_service<br />
<br />
These are explained in more detail in the following sections.<br />
<br />
=USER'S OVERVIEW=<br />
<br />
Two new USERSET parameters on the master control this <br />
* synchronous_replication = async (default) | recv | fsync | apply<br />
* synchronous_replication_timeout = 0+ (0 means never timeout)<br />
(default timeout 10sec)<br />
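Put together, the master-side settings from this proposal might look like the fragment below. Note these parameter names come from this 9/2010 patch, '''not''' from the interface eventually committed in 9.1 (which uses synchronous_standby_names and synchronous_commit):<br />

```
# postgresql.conf on the master -- parameter names from the 9/2010 patch,
# not the committed 9.1 interface
synchronous_replication = recv           # async | recv | fsync | apply
synchronous_replication_timeout = 10     # seconds; 0 means never time out
```

Because both parameters are USERSET, an individual session or transaction can also raise or lower its own durability level with a plain SET before committing.<br />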
<br />
synchronous_replication = async is the default and means that no<br />
synchronisation is requested and so the commit will not wait. This is the<br />
fastest setting. The word async is short for "asynchronous" and you may<br />
see the term asynchronous replication discussed.<br />
<br />
Other settings refer to progressively higher levels of durability. The<br />
higher the level of durability requested, the longer the wait for that<br />
level of durability to be achieved.<br />
<br />
The precise meaning of the synchronous_replication settings is<br />
* async - commit does not wait for a standby before replying to user<br />
* recv - commit waits until standby has received WAL<br />
* fsync - commit waits until standby has received and fsynced WAL<br />
* apply - commit waits until standby has received, fsynced and applied<br />
This provides a simple, easily understood mechanism - and one that in<br />
its default form is very similar to other RDBMS (e.g. Oracle).<br />
<br />
Note that in apply mode it is possible that the changes could be<br />
accessible on the standby before the transaction that made the change<br />
has been notified that the change is complete. Minor issue.<br />
<br />
Network delays may occur and the standby may also crash. If no reply is<br />
received within the timeout we raise a NOTICE and then return successful<br />
commit (no other action is possible). Note that it is possible to<br />
request that we never timeout, so if no standby is available we wait for<br />
one to appear.<br />
<br />
When a user commits, if the master does not have a currently connected<br />
standby offering the required level of replication it will pick the next<br />
best available level of replication. It is up to the sysadmin to provide<br />
sufficient range of standby nodes to ensure at least one is available to<br />
meet the requested service levels.<br />
<br />
If multiple standbys exist, the first standby to reply that the desired<br />
level of durability has been achieved will release the waiting commit on<br />
the master. Other options are available also via a plugin.<br />
<br />
==ADMINISTRATOR'S OVERVIEW==<br />
<br />
On the standby we specify the highest type of replication service<br />
offered by this standby server. This information is passed to the master<br />
server when the standby connects for replication.<br />
<br />
This allows sysadmins to designate preferred standbys. It also allows<br />
sysadmins to completely refuse to offer a synchronous replication<br />
service, allowing a master to explicitly avoid synchronisation across<br />
low bandwidth or high latency links.<br />
<br />
An additional parameter can be set in recovery.conf on the standby<br />
<br />
* synchronous_replication_service = async (def) | recv | fsync | apply<br />
<br />
<br />
= IMPLEMENTATION =<br />
<br />
Some aspects can be changed without significantly altering basic<br />
proposal, for example master-specified standby registration wouldn't<br />
really alter this very much.<br />
<br />
== STANDBY ==<br />
<br />
Master-controlled sync rep means that all user wait logic is centred on<br />
the master. The details of sync rep requests on the master are not sent<br />
to the standby, so there is no additional master to standby traffic nor<br />
standby-side bookkeeping overheads. It also reduces complexity of<br />
standby code.<br />
<br />
On the standby side the WAL Writer now operates during recovery. This<br />
frees the WALReceiver to spend more time sending and receiving messages,<br />
thereby minimising latency for users choosing the "recv" option. We now<br />
have 3 processes handling WAL in an asynchronous pipeline: WAL Receiver<br />
reads WAL data from the libpq connection then writes it to the WAL file,<br />
the WAL Writer then fsyncs the WAL file and then the Startup process<br />
replays the WAL. These processes act independently, so WAL pointers<br />
(LSNs) are defined as WALReceiverLSN >= WALWriterLSN >= StartupLSN<br />
<br />
For each new message WALReceiver gets from master we issue a reply. Each<br />
reply sends the current state of the 3 LSNs, so the reply message size<br />
is only 28 bytes. Replies are sent half-duplex, i.e. we don't reply<br />
while a new message is arriving.<br />
<br />
Note that there is absolutely not one reply per transaction on the<br />
master. The standby knows nothing about what has been requested on the<br />
master - replies always refer to the latest standby state and<br />
effectively batch the responses.<br />
<br />
We act according to the requested synchronous_replication_service<br />
* async - no replies are sent<br />
* recv - replies are sent upon receipt only<br />
* fsync - replies are sent upon receipt and following fsync only<br />
* apply - replies are sent following receipt, fsync and apply.<br />
<br />
Replies are sent at the next available opportunity.<br />
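The cumulative reply policy above can be summarized as a small lookup; this is a sketch of the stated rules, not actual server code:<br />

```python
# Which pipeline events trigger a reply, per synchronous_replication_service
# level -- a sketch of the policy described above, not the patch's C code.
REPLY_EVENTS = {
    'async': frozenset(),
    'recv':  frozenset({'recv'}),
    'fsync': frozenset({'recv', 'fsync'}),
    'apply': frozenset({'recv', 'fsync', 'apply'}),
}

def should_reply(service_level, event):
    """True if a standby at this service level replies after this event."""
    return event in REPLY_EVENTS[service_level]
```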
<br />
In apply mode, when the WALReceiver is completely quiet this means we<br />
send 3 reply messages - one at recv, one at fsync and one at apply. When<br />
WALreceiver is busy the volume of messages does *not* increase since the<br />
reply can't be sent until the current incoming message has been<br />
received, after which we were going to reply anyway so it is not an<br />
additional message. This means we piggyback an "apply" response onto a<br />
later "recv" reply. As a result we get minimum response times in *all*<br />
modes and maximum throughput is not impaired at all.<br />
<br />
When each new message arrives from the master, the WALreceiver will write<br />
the new data to the WAL file, wake the WALwriter and then reply. Each<br />
new message from master receives a reply. If no further WAL data has<br />
been received the WALreceiver waits on the latch. If the WALReceiver is<br />
woken by WALWriter or Startup then it will reply to master with a<br />
message, even if no new WAL has been received.<br />
<br />
So in the recv, fsync and apply cases alike, a reply is sent to the<br />
master as soon as possible, so in all cases the wait time is minimised.<br />
<br />
When WALwriter is woken it sees if there is outstanding WAL data and if<br />
so fsyncs it and wakes both WALreceiver and Startup. When no WAL remains<br />
it waits on the latch.<br />
<br />
Startup process will wake WALreceiver when it has got to the end of the<br />
latest chunk of WAL. If no further WAL is available then it waits on its<br />
latch.<br />
<br />
== MASTER ==<br />
<br />
When user backends request sync rep they wait in a queue ordered by<br />
requested LSN. A separate queue exists for each request mode.<br />
<br />
WALSender receives the 3 LSNs from the standby. It then wakes backends<br />
in sequence from each queue.<br />
<br />
We provide a single wakeup rule: first WALSender to reply with the<br />
requested XLogRecPtr will wake the backend. This guarantees that the WAL<br />
data for the commit is transferred as requested to at least one standby.<br />
That is sufficient for the use cases we have discussed.<br />
<br />
More complex wakeup rules would be possible via a plugin.<br />
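The queue-and-wakeup rule can be modelled as a toy sketch (purely illustrative; the patch itself is C inside the backend, and the class and method names here are invented):<br />

```python
import heapq

class SyncRepQueue:
    """Toy model of one per-mode wait queue on the master.

    Backends park here with the LSN their commit needs replicated; the first
    standby reply at or past that LSN releases them, in LSN order.
    """

    def __init__(self):
        self._waiters = []  # min-heap of (commit_lsn, backend)

    def wait(self, commit_lsn, backend):
        """A committing backend joins the queue, ordered by requested LSN."""
        heapq.heappush(self._waiters, (commit_lsn, backend))

    def standby_reply(self, acked_lsn):
        """Wake every backend whose commit LSN the standby has confirmed."""
        woken = []
        while self._waiters and self._waiters[0][0] <= acked_lsn:
            woken.append(heapq.heappop(self._waiters)[1])
        return woken
```

Because replies carry the standby's latest state rather than per-transaction acknowledgements, one reply can release many queued backends at once, which is how the batched-response design keeps throughput high.<br />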
<br />
Wait timeout would be set by individual backends with a timer, just as<br />
we do for statement_timeout.<br />
<br />
= CODE =<br />
<br />
The total code to implement this is low. It breaks down into 5 areas:<br />
* Zoltan's libpq changes, included almost verbatim; fairly modular, so easy to replace with something we like better<br />
* A new module syncrep.c and syncrep.h handle the backend wait/wakeup<br />
* Light changes to allow streaming rep to make appropriate calls<br />
* Small amount of code to allow WALWriter to be active in recovery<br />
* Parameter code<br />
No docs yet.<br />
<br />
The patch works on top of latches, though does not rely upon them for<br />
its bulk performance characteristics. Latches only improve response time<br />
for very low transaction rates; latches provide no additional throughput<br />
for medium to high transaction rates.<br />
<br />
= PERFORMANCE ANALYSIS =<br />
<br />
Since we reply to each new chunk sent from master, "recv" mode has<br />
absolutely minimal latency, especially since WALreceiver no longer<br />
performs majority of fsyncs, as in 9.0 code. WALreceiver does not wait<br />
for fsync or apply actions to complete before we reply, so fsync and<br />
apply modes will always wait at least 2 standby->master messages which<br />
is appropriate because those actions will typically occur much later. <br />
<br />
This response mechanism offers highest responsive performance achievable<br />
in "recv" mode and very good throughput under load. Note that the<br />
different modes do not interfere with each other and can co-exist<br />
happily while providing highest performance.<br />
<br />
Starting WALWriter is helpful, no matter what<br />
synchronous_replication_service is specified.<br />
<br />
Can we optimise the sending of reply messages so that only chunks that<br />
contain a commit deserve a reply? We could, but then we'd need to do<br />
extra work on the master to do bookkeeping of that. It would need to be<br />
demonstrated that there is a performance issue big enough to be worth<br />
the overhead on master and extra code.<br />
<br />
Is there an optimisation from reducing the number of options the standby<br />
provides? The architecture on the standby side doesn't rely heavily on<br />
the service level specified, nor does it rely in any way on the actual<br />
sync rep mode specified on master. No further simplification is<br />
possible.<br />
<br />
<br />
= NOT YET IMPLEMENTED =<br />
<br />
* Timeout code & NOTICE<br />
* Code and test plugin <br />
* Loops in walsender, walwriter and receiver treat shutdown incorrectly<br />
<br />
I haven't yet looked at Fujii's code for this, not even sure where it<br />
is, though hope to do so in the future. Zoltan's libpq code is the only<br />
part of that patch used.<br />
<br />
So far I have spent 3.5 days on this and expect to complete tomorrow. I<br />
think that throws out the argument that this proposal is too complex to<br />
develop in this release. <br />
<br />
= OTHER ISSUES =<br />
<br />
* How should master behave when we shut it down?<br />
* How should standby behave when we shut it down?<br />
<br />
[[Category:Replication]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=VACUUM_FULL&diff=18695VACUUM FULL2012-12-11T18:25:55Z<p>Schmiddy: fix link to CREATE INDEX</p>
<hr />
<div>= VACUUM vs VACUUM FULL =<br />
<br />
The <code>[http://www.postgresql.org/docs/current/static/sql-vacuum.html VACUUM]</code> command and associated autovacuum process are PostgreSQL's way of controlling MVCC bloat. The <code>VACUUM</code> command has two main forms of interest - ordinary <code>VACUUM</code>, and <code>VACUUM FULL</code>. These two commands are actually quite different and should not be confused.<br />
<br />
<code>VACUUM</code> scans a table, marking tuples that are no longer needed as free space so that they can be overwritten by newly inserted or updated data. See [[Introduction to VACUUM, ANALYZE, EXPLAIN, and COUNT]] and the PostgreSQL documentation on [http://www.postgresql.org/docs/current/static/mvcc.html MVCC] for a detailed explanation of this. Note that you should rarely need to use the <code>VACUUM</code> command directly on a modern PostgreSQL database, as [http://www.postgresql.org/docs/current/static/routine-vacuuming.html#AUTOVACUUM autovacuum] should take care of it for you if properly set up.<br />
<br />
<code>VACUUM FULL</code>, unlike <code>VACUUM</code>, touches data that has not been deleted. On pre-9.0 versions of PostgreSQL, it moves data into spaces earlier in the file that have been freed. Once it has created free space at the end of the file, it truncates the file so that the OS knows that space is free and may be reused for other things. Moving in-use data around this way can have adverse side-effects, including heavyweight locking, increased I/O, and index bloat. On older systems, there are better ways to free space if you need to, and better ways to optimize tables (see below), so you should essentially never use <code>VACUUM FULL</code> on a pre-9.x system. Even on 9.x and above, the system is designed with the goal that you should never need to run <code>VACUUM FULL</code> regularly, and doing so can have costs like huge WAL archive output and high loads on any streaming replication servers. <br />
<br />
For clarity, <b>9.0 changes VACUUM FULL</b>. As covered in the [http://developer.postgresql.org/pgdocs/postgres/sql-vacuum.html documentation], the VACUUM FULL implementation has been changed to one that's similar to using CLUSTER in older versions. This gives a slightly different set of trade-offs from the older VACUUM FULL described here. While the potential to make the database slower via index bloating has been removed by this change, it's still something you may want to avoid doing, due to the locking and general performance overhead of a VACUUM FULL.<br />
<br />
== When to use <code>VACUUM FULL</code> and when not to ==<br />
<br />
Many people, either based on misguided advice on the 'net or on the assumption that it must be "better", periodically run <code>VACUUM FULL</code> on their tables. This is generally not recommended and in some cases can make your database slower, not faster.<br />
<br />
<code>VACUUM FULL</code> is only needed when you have a table that is mostly dead rows - i.e., the vast majority of its contents have been deleted. It should not be used for table optimization or periodic maintenance, as it's generally counterproductive. In most cases the freed space will be promptly re-allocated, possibly increasing file-system-level fragmentation and requiring file system space allocations that are slower than just re-using existing free space within a table.<br />
<br />
When you run <code>VACUUM FULL</code> on a table, that table is locked for the duration of the operation, so nothing else can work with the table. <code>VACUUM FULL</code> is <i>much</i> slower than a normal <code>VACUUM</code>, so the table may be unavailable for a while. <br />
<br />
More importantly, on pre-9.0 systems, while <code>VACUUM FULL</code> compacts the table, it does not compact the indexes - and in fact may increase their size, thus slowing them down, causing more disk I/O when the indexes are used, and increasing the amount of memory they require. A <code>REINDEX</code> may be required after <code>VACUUM FULL</code> on PostgreSQL versions older than 9.0. See the main documentation's [http://www.postgresql.org/docs/current/static/routine-vacuuming.html#VACUUM-BASICS notes on VACUUM vs VACUUM FULL].<br />
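<br />
For example, indexes bloated this way on a pre-9.0 system can be rebuilt in one step (the table name is illustrative; note that <code>REINDEX</code> takes an exclusive lock on the table while it runs):<br />
<br />
<pre><br />
REINDEX TABLE your_table;<br />
</pre><br />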
<br />
== What to use instead ==<br />
<br />
If you shouldn't regularly use <code>VACUUM FULL</code> (or use it at all on versions older than 9.0) ... what should you be using?<br />
<br />
=== [http://www.postgresql.org/docs/current/static/routine-vacuuming.html#AUTOVACUUM Autovacuum] ===<br />
<br />
If [http://www.postgresql.org/docs/current/static/routine-vacuuming.html#AUTOVACUUM autovacuum] is running frequently enough and aggressively enough, your tables should never grow ("bloat") due to unreclaimed dead rows, so you should never need to return "dead" space to the OS.<br />
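<br />
To check whether autovacuum is actually reaching your tables, you can query the statistics collector; the following sketch uses the <code>last_autovacuum</code> and <code>last_autoanalyze</code> columns of <code>pg_stat_user_tables</code>, which are available on 8.2 and later:<br />
<br />
<pre><br />
SELECT relname, last_autovacuum, last_autoanalyze<br />
FROM pg_stat_user_tables<br />
ORDER BY last_autovacuum;<br />
</pre><br />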
<br />
Autovacuum continues to improve dramatically with every PostgreSQL version, and is a very good reason to make sure you are running the latest version. For example, as of 8.4 the free space map is managed automatically, removing a no-longer-necessary tuning parameter and eliminating a major source of table bloat.<br />
<br />
If autovacuum isn't doing enough to keep your tables and indexes bloat-free, tune it, don't supplement it with manual vacuuming and reindexing. You may need to increase your free space map settings (pre-8.4), tune autovacuum to run more frequently, and/or tell autovacuum to vacuum certain frequently-updated tables more aggressively than others.<br />
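<br />
On 8.4 and later, per-table autovacuum behaviour can be tuned with storage parameters. For example, to have autovacuum consider a frequently-updated table for vacuuming after 5% of it has changed, rather than the default 20% (the table name is illustrative):<br />
<br />
<pre><br />
ALTER TABLE busy_table SET (autovacuum_vacuum_scale_factor = 0.05);<br />
</pre><br />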
<br />
=== <code>[http://www.postgresql.org/docs/current/static/sql-vacuum.html VACUUM]</code> ===<br />
<br />
Unless you need to return space to the OS so that other tables or other parts of the system can use that space, or you are trying to repair a table that has bloated out of control due to insufficient autovacuum, you should use <code>VACUUM</code> instead of <code>VACUUM FULL</code>.<br />
<br />
If you need to manually <code>VACUUM</code> your tables at any time other than when running major admin or update tasks that rewrite large parts of your tables, you probably don't have autovacuum set up well enough.<br />
<br />
=== <code>[http://www.postgresql.org/docs/current/static/sql-cluster.html CLUSTER]</code> ===<br />
<br />
If you're trying to "optimise" your tables by packing them down and removing table bloat that's accumulated due to (say) insufficiently aggressive autovacuuming, or you're trying to return dead space in a table to the operating system, it's fine to use VACUUM FULL in PostgreSQL 9.0 and above. <br />
<br />
Consider setting a FILLFACTOR of less than the default 100, so the rewritten table has some free space pre-allocated within it for updates and new inserts; otherwise you'll just get file system allocations as soon as you do anything to the table.<br />
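<br />
A sketch of that approach on 9.0 or later, leaving 10% free space in each rewritten page (the table name is illustrative):<br />
<br />
<pre><br />
ALTER TABLE your_table SET (fillfactor = 90);<br />
VACUUM FULL your_table;<br />
</pre><br />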
<br />
In older versions, it's preferable to use <code>[http://www.postgresql.org/docs/current/static/sql-cluster.html CLUSTER]</code>. It runs a <i>lot</i> faster than pre-9.0 <code>VACUUM FULL</code> and will compact and optimise the indexes as well as the table itself. However, you will need enough free space to hold all the in-use data from the table while <code>CLUSTER</code> runs. As with post-9.0 VACUUM FULL, a non-default FILLFACTOR may be wise.<br />
<br />
=== <code>[http://www.postgresql.org/docs/current/static/sql-truncate.html TRUNCATE TABLE]</code> ===<br />
<br />
If you've been using <code>VACUUM FULL</code> to free space from a table that's periodically completely emptied using <code>DELETE FROM tablename;</code> (without a <code>WHERE</code> clause), you can use <code>[http://www.postgresql.org/docs/current/static/sql-truncate.html TRUNCATE TABLE]</code> to replace those two steps with one much, <i>much</i> faster one.<br />
<br />
Instead of:<br />
<br />
<pre><br />
DELETE FROM tablename;<br />
VACUUM FULL tablename;<br />
</pre><br />
<br />
write:<br />
<br />
<pre><br />
TRUNCATE TABLE tablename;<br />
</pre><br />
<br />
Please make sure to read the caveats in the notes on the <code>[http://www.postgresql.org/docs/current/static/sql-truncate.html TRUNCATE TABLE]</code> documentation. If <code>TRUNCATE TABLE</code> isn't suitable for your needs, you can use <code>DELETE</code> followed by <code>CLUSTER</code> instead.<br />
<br />
If using <code>DELETE</code> on a table that is the target of foreign key references, consider adding an index to the referencing columns. That will allow checks for foreign key enforcement to avoid a sequential scan on the referencing table, making <code>DELETE</code> from the referenced table vastly faster.<br />
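<br />
For example, if a column <code>child_table.parent_id</code> references <code>parent_table</code> (all names here are illustrative), an index like the following lets the foreign-key checks fired by a <code>DELETE</code> on <code>parent_table</code> use an index scan instead of repeated sequential scans:<br />
<br />
<pre><br />
CREATE INDEX child_table_parent_id_idx ON child_table (parent_id);<br />
</pre><br />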
<br />
=== <code>[http://www.postgresql.org/docs/current/static/sql-altertable.html ALTER TABLE .. SET DATA TYPE]</code> (relevant for 8.4 and below only)===<br />
<br />
''This section is obsolete for PostgreSQL 9.0 and above. Skip it unless you use a very old version.''<br />
<br />
The problem with <code>CLUSTER</code> is that it reorders the table following an index. If the table is not already approximately in that index's order, this will take a long time because it will have to do a scattered read of the table pages over and over as it looks for each tuple. A faster alternative is to request a full table rewrite without requiring a particular order. PostgreSQL versions prior to 9.0 do not offer any direct way to invoke this operation; however, you can use the following workaround. Choose any table column, and use <code>[http://www.postgresql.org/docs/current/static/sql-altertable.html ALTER TABLE]</code> to change its type <i>to the same type</i>. This is obviously going to cause no logical change to the table, but the server will have to rewrite the table, getting rid of dead tuples while at it.<br />
<br />
For example, assuming the column <code>an_integer_column</code> is of type <code>INTEGER</code>:<br />
<br />
<pre><br />
ALTER TABLE your_table ALTER an_integer_column SET DATA TYPE integer;<br />
</pre><br />
<br />
This trick will not work in PostgreSQL versions 9.1 or later, as the server detects that the change in data type is degenerate and no rewrite is necessary.<br />
<br />
=== <code>[http://www.postgresql.org/docs/current/static/sql-selectinto.html SELECT ... INTO]</code> (relevant for 8.4 and below only) ===<br />
<br />
''This section is obsolete for PostgreSQL 9.0 and above. Skip it unless you use a very old version.''<br />
<br />
Sometimes it can be faster to use a <code>[http://www.postgresql.org/docs/current/static/sql-selectinto.html SELECT ... INTO]</code> command to copy data from a bloated table into a new table, then re-create the indexes and finally rename the tables to replace the old one with the new one. It's rarely worth doing this instead of using <code>CLUSTER</code>, though, as <code>CLUSTER</code> does almost the same thing automatically and can rebuild indexes in parallel. The main reason you may want to use <code>SELECT ... INTO</code> instead of <code>CLUSTER</code> is if you don't want to sort the table.<br />
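<br />
A sketch of that approach, with illustrative table names (remember to re-create indexes, constraints, triggers and permissions on the new table before dropping and renaming):<br />
<br />
<pre><br />
SELECT * INTO clean_table FROM bloated_table;<br />
-- re-create indexes, constraints, etc. on clean_table here<br />
DROP TABLE bloated_table;<br />
ALTER TABLE clean_table RENAME TO bloated_table;<br />
</pre><br />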
<br />
<br />
== Recovering from index bloat caused by <code>VACUUM FULL</code> (relevant for 8.4 and below only) ==<br />
<br />
''This section is obsolete for PostgreSQL 9.0 and above. Skip it unless you use a very old version.''<br />
<br />
If you have indexes badly bloated by regular use of <code>VACUUM FULL</code>, your best bet is usually going to be to use <code>CLUSTER</code> to rewrite the table and rebuild the indexes.<br />
<br />
If you can't afford to have the table locked for that long, you can rebuild each index individually while queries continue to run on the table. PostgreSQL unfortunately doesn't have a <code>REINDEX CONCURRENTLY</code> command, but it can be simulated with appropriate use of [http://www.postgresql.org/docs/current/static/sql-createindex.html CREATE INDEX ... CONCURRENTLY], [http://www.postgresql.org/docs/current/static/sql-alterindex.html ALTER INDEX ... RENAME] and [http://www.postgresql.org/docs/current/static/sql-dropindex.html DROP INDEX] to create new indexes, swap the old and new ones by renaming, and drop the old indexes. Note that since you can't drop some indexes, such as primary keys, this may not be a possible cleanup technique for all of them.<br />
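<br />
For example, one bloated index can be rebuilt without locking out writers like this (index, table and column names are illustrative; <code>CREATE INDEX ... CONCURRENTLY</code> requires 8.2 or later):<br />
<br />
<pre><br />
CREATE INDEX CONCURRENTLY my_index_new ON my_table (my_column);<br />
DROP INDEX my_index;<br />
ALTER INDEX my_index_new RENAME TO my_index;<br />
</pre><br />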
<br />
Originally by --[[User:Ringerc|Ringerc]] 03:48, 26 November 2009 (UTC)<br />
<br />
[[Category:FAQ]][[Category:Vacuuming]][[Category:Administration]][[Category:Performance]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Streaming_Replication&diff=18542Streaming Replication2012-11-09T21:06:16Z<p>Schmiddy: Make clear that synchronous replication is supported now</p>
<hr />
<div>'''Streaming Replication''' (SR) provides the capability to continuously ship and<br />
apply the [http://www.postgresql.org/docs/current/static/wal.html WAL XLOG] records to some number of standby servers in order to keep them current.<br />
<br />
This feature was added to PostgreSQL 9.0. The discussion below is a developer-oriented one that contains some out-of-date information. Users of this feature should use the documentation for the feature or a setup tutorial instead:<br />
<br />
* [[Binary Replication Tutorial]] provides an introduction to using this replication feature.<br />
* [http://www.postgresql.org/docs/9.1/static/warm-standby.html 9.1 Replication Documentation]<br />
* [http://www.postgresql.org/docs/9.0/static/warm-standby.html 9.0 Replication Documentation]<br />
<br />
= Project =<br />
SR was developed for inclusion in PostgreSQL 9.0 by NTT OSS Center. The lead developer is [mailto:masao.fujii@gmail.com Masao Fujii]. [http://www.pgcon.org/2008/schedule/events/76.en.html Synchronous Log Shipping Replication Presentation] introduces the early design of the feature.<br />
<br />
= Usage =<br />
== Users Overview ==<br />
* '''Log-shipping'''<br />
** XLOG records generated in the primary are periodically shipped to the standby via the network.<br />
** In the existing warm standby, only records in a filled file are shipped, which is referred to as file-based log-shipping. In SR, XLOG records in a partially-filled XLOG file are shipped too, implementing record-based log-shipping. This means the window for data loss in SR is usually smaller than in warm standby, unless the warm standby was also configured for record-based shipping (which is complicated to set up).<br />
** The content of the XLOG files written to the standby is exactly the same as on the primary. XLOG files shipped can be used for a normal recovery and PITR.<br />
* '''Multiple standbys'''<br />
** More than one standby can establish a connection to the primary for SR. XLOG records are concurrently shipped to all these standbys. The delay/death of a standby does not harm log-shipping to other standbys.<br />
** The maximum number of standbys can be specified as a GUC variable.<br />
* '''Continuous recovery'''<br />
** The standby continuously replays XLOG records shipped without using pg_standby.<br />
** XLOG records shipped are replayed as soon as possible without waiting until XLOG file has been filled. The combination of [[Hot Standby]] and SR would make the latest data inserted into the primary visible in the standby almost immediately.<br />
** The standby periodically removes old XLOG files which are no longer needed for recovery, to prevent excessive disk usage.<br />
* '''Setup'''<br />
** The start of log-shipping does not interfere with any query processing on the primary.<br />
** The standby can be started in various conditions.<br />
*** If there are XLOG files in the archive directory and restore_command is supplied, those files are replayed first. Then the standby requests from the primary the XLOG records following the last applied one. This prevents XLOG files already present on the standby from being shipped again. Similarly, XLOG files in pg_xlog are also replayed before starting log-shipping.<br />
*** If there are no XLOG files on the standby, the standby requests the XLOG records following the starting XLOG location of recovery (the redo starting location).<br />
* '''Connection settings and authentication'''<br />
** A user can configure the same settings as a normal connection to a connection for SR (e.g., keepalive, pg_hba.conf).<br />
* '''Activation'''<br />
** The standby can keep waiting for activation as long as a user likes. This prevents the standby from being automatically brought up by failure of recovery or network outage.<br />
* '''Progress report'''<br />
** The primary and standby report the progress of log-shipping in PS display.<br />
* '''Graceful shutdown'''<br />
** When smart/fast shutdown is requested, the primary waits to exit until XLOG records have been sent to the standby, up to the shutdown checkpoint record.<br />
<br />
== Restrictions ==<br />
* '''Synchronous log-shipping'''<br />
** By default, SR operates in an asynchronous manner, so the commit command might return a "success" to a client before the corresponding XLOG records are shipped to the standby. To enable synchronous replication, see [http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION Synchronous Replication]<br />
* '''Replication beyond timeline'''<br />
** A user has to get a fresh backup whenever making the old standby catch up.<br />
* '''Clustering'''<br />
** Postgres doesn't provide any clustering feature.<br />
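<br />
The synchronous log-shipping mentioned above can be enabled on 9.1 and later by naming the synchronous standby on the primary; a minimal sketch (the standby name is illustrative, and must match the <code>application_name</code> the standby uses in its connection string):<br />
<br />
<pre><br />
# postgresql.conf on the primary (9.1 or later)<br />
synchronous_standby_names = 'standby1'<br />
</pre><br />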
<br />
== How to Use ==<br />
* '''1.''' Install postgres in the primary and standby server as usual. This requires only ''configure'', ''make'' and ''make install''.<br />
* '''2.''' Create the initial database cluster in the primary server as usual, using ''initdb''.<br />
* '''3.''' Set up connections and authentication so that the standby server can successfully connect to the ''replication'' pseudo-database on the primary.<br />
$ $EDITOR postgresql.conf<br />
<br />
listen_addresses = '192.168.0.10'<br />
<br />
$ $EDITOR pg_hba.conf<br />
<br />
# The standby server must have superuser access privileges.<br />
host replication postgres 192.168.0.20/22 trust<br />
* '''4.''' Set up the streaming replication related parameters on the primary server.<br />
$ $EDITOR postgresql.conf<br />
<br />
# To enable read-only queries on a standby server, wal_level must be set to<br />
# "hot_standby". But you can choose "archive" if you never connect to the<br />
# server in standby mode.<br />
wal_level = hot_standby<br />
<br />
# Set the maximum number of concurrent connections from the standby servers.<br />
max_wal_senders = 5<br />
<br />
# To prevent the primary server from removing the WAL segments required for<br />
# the standby server before shipping them, set the minimum number of segments<br />
# retained in the pg_xlog directory. At least wal_keep_segments should be<br />
# larger than the number of segments generated between the beginning of<br />
# online-backup and the startup of streaming replication. If you enable WAL<br />
# archiving to an archive directory accessible from the standby, this may<br />
# not be necessary.<br />
wal_keep_segments = 32<br />
<br />
# Enable WAL archiving on the primary to an archive directory accessible from<br />
# the standby. If wal_keep_segments is a high enough number to retain the WAL<br />
# segments required for the standby server, this is not necessary.<br />
archive_mode = on<br />
archive_command = 'cp %p /path_to/archive/%f'<br />
* '''5.''' Start postgres on the primary server.<br />
* '''6.''' Make a base backup by copying the primary server's data directory to the standby server.<br />
$ psql -c "SELECT pg_start_backup('label', true)"<br />
$ rsync -a ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid<br />
$ psql -c "SELECT pg_stop_backup()"<br />
* '''7.''' Set up replication-related parameters, connections and authentication in the standby server like the primary, so that the standby might work as a primary after failover.<br />
* '''8.''' Enable read-only queries on the standby server. But if wal_level is ''archive'' on the primary, leave hot_standby unchanged (i.e., off).<br />
$ $EDITOR postgresql.conf<br />
<br />
hot_standby = on<br />
* '''9.''' Create a recovery command file in the standby server; the following parameters are required for streaming replication.<br />
$ $EDITOR recovery.conf<br />
# Note that recovery.conf must be in $PGDATA directory.<br />
<br />
# Specifies whether to start the server as a standby. In streaming replication,<br />
# this parameter must be set to on.<br />
standby_mode = 'on'<br />
<br />
# Specifies a connection string which is used for the standby server to connect<br />
# with the primary.<br />
primary_conninfo = 'host=192.168.0.10 port=5432 user=postgres'<br />
<br />
# Specifies a trigger file whose presence should cause streaming replication to<br />
# end (i.e., failover).<br />
trigger_file = '/path_to/trigger'<br />
<br />
# Specifies a command to load archive segments from the WAL archive. If<br />
# wal_keep_segments is a high enough number to retain the WAL segments<br />
# required for the standby server, this may not be necessary. But<br />
# a large workload can cause segments to be recycled before the standby<br />
# is fully synchronized, requiring you to start again from a new base backup.<br />
restore_command = 'cp /path_to/archive/%f "%p"'<br />
* '''10.''' Start postgres in the standby server. It will start streaming replication.<br />
* '''11.''' You can calculate the replication lag by comparing the current WAL write location on the primary with the last WAL location received/replayed by the standby. They can be retrieved using ''pg_current_xlog_location'' on the primary and the ''pg_last_xlog_receive_location''/''pg_last_xlog_replay_location'' on the standby, respectively.<br />
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)<br />
pg_current_xlog_location <br />
--------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_receive_location <br />
-------------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_replay_location <br />
------------------------------<br />
0/2000000<br />
(1 row)<br />
* '''12.''' You can also check the progress of streaming replication by using ''ps'' command.<br />
# The displayed LSNs indicate the byte position that the standby server has<br />
# written up to in the xlogs.<br />
[primary] $ ps -ef | grep sender<br />
postgres 6879 6831 0 10:31 ? 00:00:00 postgres: wal sender process postgres 127.0.0.1(44663) streaming 0/2000000<br />
<br />
[standby] $ ps -ef | grep receiver<br />
postgres 6878 6872 1 10:31 ? 00:00:01 postgres: wal receiver process streaming 0/2000000<br />
* How to do failover<br />
** Create the trigger file in the standby after the primary fails.<br />
* How to stop the primary or the standby server<br />
** Shut down it as usual (''pg_ctl stop'').<br />
* How to restart streaming replication after failover<br />
** Repeat the operations from step '''6''': make a fresh backup, adjust the configuration, and start the original primary as the standby. The primary server doesn't need to be stopped during these operations.<br />
* How to restart streaming replication after the standby fails<br />
** Restart postgres in the standby server after eliminating the cause of failure.<br />
* How to disconnect the standby from the primary<br />
** Create the trigger file in the standby while the primary is running. Then the standby would be brought up.<br />
* How to re-synchronize the stand-alone standby after isolation<br />
** Shut down the standby as usual, then repeat the operations from step '''6'''.<br />
* If you have more than one slave, promoting one will break the other(s). Update their recovery.conf settings to point to the new master, set recovery_target_timeline to 'latest', scp/rsync the pg_xlog directory, and restart the slave.<br />
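<br />
A sketch of the updated recovery.conf on a remaining slave after one standby has been promoted (the host address is illustrative, pointing at the newly promoted master):<br />
<br />
<pre><br />
standby_mode = 'on'<br />
primary_conninfo = 'host=192.168.0.20 port=5432 user=postgres'<br />
recovery_target_timeline = 'latest'<br />
</pre><br />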
<br />
= Todo =<br />
== v9.0 ==<br />
<br />
Moved to [[PostgreSQL_9.0_Open_Items]]<br />
<br />
=== Committed ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01455.php Retrying from archive and some refactoring around Read/FetchRecord().] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00395.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg02601.php SR wrongly treats the WAL-boundary.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00396.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01715.php Adjust SR for some later changes about wal-skipping.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00399.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00024.php VACUUM FULL unexpectedly writes an XLOG UNLOGGED record.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00038.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01754.php Add a message type header.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00037.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php Documentation: Add a new "Replication" chapter.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00115.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00350.php Failed assertion during recovery of partial WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00124.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00712.php A PANIC error might occur in the standby because of a partially-filled archived WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00137.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00330.php Improve the standby messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00140.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php pq_getbyte_if_available() is not working because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00198.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01488.php Walsender might emit unfit messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00239.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01236.php Streaming replication on win32, still broken.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00270.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00992.php Create new section for recovery.conf.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00295.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01824.php Assertion failure in walreceiver.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00356.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01717.php Forbid a startup of walsender during recovery, and emit a suitable message? Or allow walsender to be started also during recovery?] - [http://archives.postgresql.org/message-id/20100316090955.9A5107541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php How do we clean down the archive without using pg_standby?] - [http://archives.postgresql.org/message-id/20100318091718.BC14D7541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01510.php File-based log shipping without pg_standby doesn't replay the WAL files in pg_xlog.] - [http://archives.postgresql.org/pgsql-committers/2010-03/msg00356.php commit]<br />
<br />
== v9.1 ==<br />
=== Synchronization capability ===<br />
* Introduce the replication mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'' and ''fsync''.<br />
** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
** ''recv'' or ''fsync'' makes transaction commit wait for XLOG to be received or fsynced by the standby, respectively.<br />
** (''apply'' makes transaction commit wait for XLOG to be replayed by the standby. This mode will be supported in v9.2 or later)<br />
** The replication mode is specified in recovery.conf of the standby as well as other parameters for replication.<br />
*** The startup process reads the replication mode from recovery.conf and shares it to walreceiver via new shared-memory variable.<br />
*** Walreceiver also shares it to walsender by using the replication handshake message (existing protocol needs to be extended).<br />
** Based on the replication mode, walreceiver sends the reply meaning that replication is done up to the specified location to the primary.<br />
*** In async, walreceiver doesn't need to send any reply other than end-of-replication message.<br />
*** In recv or fsync, walreceiver sends the reply just after receiving or flushing XLOG, respectively.<br />
*** New message type for the reply needs to be defined. The reply is sent as CopyData message.<br />
** Walreceiver writes all the outstanding XLOG to disk before shutting down.<br />
** Walsender receives the reply from the standby, updates the location of the last record replicated, and announces completion of replication.<br />
*** New shared-memory variable to keep that location is required.<br />
** When processing the commit command, backend waits for XLOG to be replicated to only the standbys which are in the recv or fsync replication mode.<br />
*** Also smart shutdown waits for XLOG of shutdown checkpoint to be replicated.<br />
* Required optimization<br />
** Walsender should send outstanding XLOG without waiting wal_sender_delay.<br />
*** When processing the commit command, backend signals walsender to send outstanding XLOG immediately.<br />
** Backend should exit the wait loop as soon as the reply arrives at the primary.<br />
*** When receiving the reply, walsender signals backends to get up from the sleep and determine whether to exit the wait loop by checking the location of the last XLOG replicated.<br />
*** Only backends waiting for XLOG to be replicated up to the location contained in the reply are sent the signal.<br />
** Walsender waits for the signal from backends and the reply from the standby at the same time, by using select/poll.<br />
** Walsender reads XLOG from not only disk but also shared memory (wal buffers).<br />
** Walreceiver should flush XLOG file only when XLOG file is switched or the related page is flushed.<br />
*** When startup process or bgwriter flushes the buffer page, it checks whether the related XLOG has already been flushed via shared memory (location of the last XLOG flushed).<br />
*** It flushes the buffer page, if XLOG file has already been flushed.<br />
*** It signals walreceiver to flush XLOG file immediately and waits for the flush to complete, if XLOG file has not been flushed yet.<br />
** While the standby is catching up with the primary, those servers should ignore the replication mode and perform asynchronous replication.<br />
*** After those servers have almost gotten into synchronization, they perform replication based on the specified replication mode.<br />
*** New replication states like 'catching-up', 'sync', etc need to be defined, and the state machine for them is required on both servers.<br />
*** Current replication state can be monitored on both servers via SQL.<br />
* Required timeout<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG is replicated to the standby.<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
<br />
== Future release ==<br />
* '''Synchronization capability'''<br />
** Introduce the synchronization mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'', ''fsync'' and ''apply''.<br />
*** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
*** ''recv'', ''fsync'' and ''apply'' make transaction commit wait for XLOG records to be received, fsynced and applied on the standby, respectively.<br />
** Change walsender to be able to read XLOG from not only the disk but also shared memory.<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG records are replicated to the standby.<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
* '''Monitoring'''<br />
** Provide the capability to check the progress and gap of streaming replication via one query. A collaboration of HS and SR is necessary to provide that capability on the standby side.<br />
** Provide the capability to check via a query whether the specified replication is in progress. More detailed status information might also be necessary, e.g., whether the standby is still catching up or has already gotten into sync.<br />
** Change the stats collector to gather statistics about replication, e.g., the average replication delay.<br />
** Develop a tool to calculate the latest XLOG position from XLOG files. This is necessary to check the replication gap after the server fails.<br />
** Also develop a tool to extract user-readable contents from XLOG files. This is necessary to inspect the contents of the gap and restore them manually.<br />
* '''Easy to Use'''<br />
** Introduce parameters like:<br />
*** replication_halt_timeout - replication will halt if no data has been sent for this much time.<br />
*** replication_halt_segments - replication will halt if the number of WAL files in pg_xlog exceeds this threshold.<br />
*** These parameters allow us to avoid disk overflow.<br />
** Add a new feature which also transfers the base backup over a direct connection between the primary and the standby.<br />
** Add new hooks like walsender_hook and walreceiver_hook to cooperate with add-on compression programs such as pglesslog.<br />
** Provide a graceful termination of replication via a query on the primary. On the standby, a trigger file mechanism already provides that capability.<br />
** Support replication across timelines. The timeline history files need to be shipped from the primary to the standby.<br />
* '''Robustness'''<br />
** Support keepalive in libpq. This is useful for a client and the standby to detect a failure of the primary immediately.<br />
** [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php New privilege for replication.]<br />
*** Currently, superuser privilege is required when the standby connects to the primary. However, there have been complaints that a new privilege for replication should be added and used instead of superuser, because the current approach is bad for security.<br />
* '''Miscellaneous'''<br />
** A standalone walreceiver tool, which connects to the primary and continuously receives and writes XLOG records, independently of the postgres server.<br />
** Cascading streaming replication: allow walsender to send XLOG to another standby during recovery.<br />
** WAL archiving during recovery.<br />
<br />
[[Category:Replication]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=User:Schmiddy&diff=18541User:Schmiddy2012-11-08T22:50:19Z<p>Schmiddy: /* TODO Items / Things to Investigate */</p>
<hr />
<div>== PostgreSQL Notes and Misc ==<br />
<br />
; Contact Info<br />
: [mailto:josh**at**kupershmidt.org Email]<br />
: [http://kupershmidt.org web]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=IRC2RWNames&diff=18540IRC2RWNames2012-11-08T22:46:10Z<p>Schmiddy: /* List of IRC nicks with their respective real world names */</p>
<hr />
<div>=== List of IRC nicks with their respective real world names ===<br />
<br />
You can find many PostgreSQL users and developers chatting in [irc://irc.freenode.net/postgresql #postgresql on freenode]. Here's more information about some of the regulars there:<br />
<br />
{| border="1"<br />
|-<br />
!Nickname || Real Name<br />
|-<br />
|ads || Andreas Scherbaum<br />
|-<br />
|agliodbs, aglio2 (freenode), jberkus (oftc) || Josh Berkus<br />
|-<br />
|ahammond || Andrew Hammond<br />
|-<br />
|alvherre || Alvaro Herrera<br />
|-<br />
|andres || Andres Freund<br />
|-<br />
|Assid || Satish Alwani<br />
|-<br />
|aurynn || Aurynn Shaw<br />
|-<br />
|BlueAidan/BlueAidan_work || [[user:davidblewett | David Blewett]]<br />
|-<br />
|bmomjian || Bruce Momjian<br />
|-<br />
|cbbrowne || Christopher Browne<br />
|-<br />
|cce || Clark C. Evans<br />
|-<br />
|chicagoben || Benjamin Johnson<br />
|-<br />
|crab || Abhijit Menon-Sen<br />
|-<br />
|Crad || Gavin M. Roy<br />
|- <br />
|daamien || Damien Clochard<br />
|-<br />
|DarcyB || Darcy Buskermolen<br />
|-<br />
|darkixion || Thom Brown<br />
|-<br />
|davidfetter || David Fetter<br />
|-<br />
|dbb || Brian Hamlin / darkblue_b<br />
|-<br />
|dcolish || [http://www.unencrypted.org Dan Colish]<br />
|-<br />
|dcramer || Dave Cramer<br />
|-<br />
|DeciBull, TheCougar || Jim C. Nasby<br />
|-<br />
|dennisb || Dennis Bj&ouml;rklund<br />
|-<br />
|depesz || Hubert Lubaczewski<br />
|-<br />
|devrimgunduz || Devrim G&uuml;nd&uuml;z<br />
|-<br />
|digicon || [http://digicondev.blogspot.com Zach Conrad]<br />
|-<br />
|dim || Dimitri Fontaine<br />
|-<br />
|direvus || Brendan Jurd<br />
|-<br />
|drbair || Ryan Bair<br />
|-<br />
|DrLou || Lou Picciano<br />
|-<br />
|duck_tape || Adi Alurkar<br />
|-<br />
|dvl || [http://langille.org/ Dan Langille]<br />
|-<br />
|eggyknap || Joshua Tolley<br />
|-<br />
|endpoint_david || David Christensen<br />
|-<br />
|eulerto || Euler Taveira<br />
|-<br />
|f3ew/devdas || Devdas Vasu Bhagat<br />
|-<br />
|feivel || Michael Meskes<br />
|-<br />
|elein || Elein Mustain<br />
|-<br />
|gleu || Guillaume Lelarge<br />
|-<br />
|gorthx || [[User:Gabrielle|Gabrielle Roth]]<br />
|-<br />
|grzm || Michael Glaesemann<br />
|-<br />
|gsmet || Guillaume Smet<br />
|-<br />
|gregs1104 || Greg Smith<br />
|-<br />
|G_SabinoMullane || Greg Sabino Mullane<br />
|-<br />
|HarrisonF || Harrison Fisk<br />
|-<br />
|ioguix || Jehan-Guillaume de Rorthais<br />
|-<br />
|indigo || Phil Frost<br />
|-<br />
|intgr || Marti Raudsepp<br />
|-<br />
|JanniCash || Jan Wieck<br />
|-<br />
|jconway || Joe Conway<br />
|-<br />
|jdavis, jdavis_ || Jeff Davis<br />
|-<br />
|jkatz05 || Jonathan S. Katz<br />
|-<br />
|johto || Marko Tiikkaja<br />
|-<br />
|jurka || Kris Jurka<br />
|-<br />
|justatheory || David Wheeler<br />
|-<br />
|jpa || Jean-Paul Argudo<br />
|-<br />
|jwp || James Pye<br />
|-<br />
|j_williams || Josh Williams<br />
|-<br />
|kgrittn || Kevin Grittner<br />
|-<br />
|klando || Cédric Villemain<br />
|-<br />
|larryrtx || Larry Rosenman<br />
|-<br />
|linuxpoet, postgresman || Joshua D. Drake<br />
|-<br />
|lluad || Steve Atkins<br />
|-<br />
|lsmith || Lukas Smith<br />
|-<br />
|magnush || Magnus Hagander<br />
|-<br />
|marco44 || Marc Cousin<br />
|-<br />
|markwkm || Mark Wong<br />
|-<br />
|mastermind || [[user:mastermind | Stefan Kaltenbrunner]]<br />
|-<br />
|mbalmer || [[user:mbalmer | Marc Balmer]]<br />
|-<br />
|merlin83 || Chua Khee Chin<br />
|-<br />
|merlinm || Merlin Moncure<br />
|-<br />
|metatrontech || Chris Travers<br />
|-<br />
|miracee || Susanne Ebrecht<br />
|-<br />
|Moosbert || Peter Eisentraut<br />
|-<br />
|neilc || Neil Conway<br />
|-<br />
|oicu || Andrew Dunstan<br />
|-<br />
|okbobcz || Pavel Stehule<br />
|-<br />
|pg_docbot || [[IRCBotSyntax]]<br />
|-<br />
|pgSnake || Dave Page<br />
|-<br />
|PJMODOS || Petr Jel&iacute;nek<br />
|-<br />
|Possible || Robert Ivens<br />
|-<br />
|postwait || Theo Schlossnagle<br />
|-<br />
|prothid || R Brenton Strickler<br />
|-<br />
|psoo || Bernd Helmle<br />
|-<br />
|pyarra || Philip Yarra<br />
|-<br />
|raptelan || [[user:Cshobe|Casey Allen Shobe]]<br />
|-<br />
|rhaas || Robert Haas<br />
|-<br />
|RhodiumToad (formerly AndrewSN) || Andrew Gierth<br />
|-<br />
|Robe || [[user:Robe | Michael Renner]]<br />
|-<br />
|rotellaro || Federico Campoli<br />
|-<br />
|rz || Kirill Simonov<br />
|-<br />
|SAS || Stéphane Schildknecht<br />
|-<br />
|schmiddy || Josh Kupershmidt<br />
|-<br />
|scrappy || Marc G. Fournier<br />
|-<br />
|selenamarie || Selena Deckelmann<br />
|-<br />
|SkippyDigits || Sherri Kalm<br />
|-<br />
|Snow-Man || Stephen Frost<br />
|-<br />
|Spritz || Matteo Beccati<br />
|-<br />
|sternocera || Peter Geoghegan<br />
|-<br />
|StuckMojo, MojoWork || Jon Erdman<br />
|-<br />
|swm || Gavin Sherry<br />
|-<br />
|vy || Volkan YAZICI<br />
|-<br />
|wulczer || Jan Urbański<br />
|-<br />
|xaprb || Baron Schwartz<br />
|-<br />
|xzilla, xzi11a || [[User:Xzilla|Robert Treat]]<br />
|}<br />
<br />
[[Category:Community]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=FAQ&diff=18423FAQ2012-10-23T05:34:09Z<p>Schmiddy: fix a few doc links to point to "current" instead of 8.x versions</p>
<hr />
<div>{{Languages}}<br />
[[:Category:FAQ|Additional FAQ Entries on this Wiki]]<br />
<br />
== Translations of this Document ==<br />
<br />
* [[Häufig gestellte Fragen|German]]<br />
* [[Perguntas Frequentes|Portuguese]]<br />
* [[Preguntas Frecuentes|Spanish]]<br />
* [[Часто Задаваемые Вопросы|Русский]]<br />
<br />
== Platform-specific questions ==<br />
<br />
Windows users should also read the [[Running & Installing PostgreSQL On Native Windows|platform FAQ for Windows]]. There are [[Frequently Asked Questions#Platform FAQs|FAQs for other platforms]] too.<br />
<br />
== General Questions ==<br />
<br />
=== What is PostgreSQL? How is it pronounced? What is Postgres? ===<br />
<br />
PostgreSQL is pronounced Post-Gres-Q-L. (For those curious about how<br />
to say "PostgreSQL", an [http://www.postgresql.org/files/postgresql.mp3 audio file] is available.)<br />
<br />
PostgreSQL is an object-relational database system that has the<br />
features of traditional proprietary database systems with enhancements<br />
to be found in next-generation DBMS systems. PostgreSQL is free and<br />
the complete source code is available.<br />
<br />
PostgreSQL development is performed by a team of mostly volunteer<br />
developers spread throughout the world and communicating via the<br />
Internet. It is a community project and is not controlled by any<br />
company. To get involved, see the [[Developer_FAQ | Developer FAQ]].<br />
<br />
Postgres is a widely-used nickname for PostgreSQL. It was the original<br />
name of the project at Berkeley and is strongly preferred over other<br />
nicknames. If you find 'PostgreSQL' hard to pronounce, call it<br />
'Postgres' instead.<br />
<br />
=== Who controls PostgreSQL? ===<br />
<br />
If you are looking for a PostgreSQL gatekeeper, central committee, or<br />
controlling company, give up --- there isn't one. We do have a core<br />
committee and CVS committers, but these groups are more for<br />
administrative purposes than control. The project is directed by the<br />
community of developers and users, which anyone can join. All you need<br />
to do is subscribe to the mailing lists and participate in the<br />
discussions. (See the [[Developer FAQ|Developer's FAQ]] for information on how to get<br />
involved in PostgreSQL development.)<br />
<br />
=== Who is the PostgreSQL Global Development Group? ===<br />
<br />
The "PGDG" is an international, unincorporated association of<br />
individuals and companies who have contributed to the PostgreSQL<br />
project. The PostgreSQL Core Team generally act as spokespeople<br />
for the PGDG.<br />
<br />
=== Who is the PostgreSQL Core Team? ===<br />
<br />
A committee of five to seven (currently six) senior contributors to<br />
PostgreSQL who do the following for the project: (a) set release dates,<br />
(b) handle confidential matters for the project, (c) act as spokespeople<br />
for the PGDG when required, and (d) arbitrate community decisions which<br />
are not settled by consensus. The current Core Team is listed on top of<br />
[http://www.postgresql.org/community/contributors/ the contributors page]<br />
<br />
=== What about the various PostgreSQL foundations? ===<br />
<br />
While the PostgreSQL project utilizes non-profit corporations in the<br />
USA, Europe, Brazil and Japan for fundraising and project coordination,<br />
these entities do not own the PostgreSQL code.<br />
<br />
=== What is the license of PostgreSQL? ===<br />
<br />
PostgreSQL is distributed under a license similar to BSD and MIT. Basically, it<br />
allows users to do anything they want with the code, including<br />
reselling binaries without the source code. The only restriction is<br />
that you not hold us legally liable for problems with the software.<br />
There is also the requirement that this copyright appear in all copies<br />
of the software. Here is the license we use:<br />
<br />
PostgreSQL Database Management System<br />
(formerly known as Postgres, then as Postgres95)<br />
<br />
Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group<br />
<br />
Portions Copyright (c) 1994, The Regents of the University of California<br />
<br />
Permission to use, copy, modify, and distribute this software and its<br />
documentation for any purpose, without fee, and without a written agreement<br />
is hereby granted, provided that the above copyright notice and this<br />
paragraph and the following two paragraphs appear in all copies.<br />
<br />
IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR<br />
DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING<br />
LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS<br />
DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE<br />
POSSIBILITY OF SUCH DAMAGE.<br />
<br />
THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES,<br />
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY<br />
AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS<br />
ON AN "AS IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO<br />
PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.<br />
<br />
=== What platforms does PostgreSQL support? ===<br />
<br />
In general, any modern Unix-compatible platform should be able to run<br />
PostgreSQL. The platforms that have received recent explicit testing<br />
can be seen in the [http://buildfarm.postgresql.org/ Build farm].<br />
The documentation contains more details about supported platforms at http://www.postgresql.org/docs/current/static/supported-platforms.html.<br />
<br />
PostgreSQL also runs natively on Microsoft Windows NT-based operating<br />
systems like Win2000 SP4, WinXP, and Win2003. A prepackaged installer<br />
is available at http://www.postgresql.org/download/windows.<br />
MSDOS-based versions of Windows (Win95, Win98, WinMe) can run<br />
PostgreSQL using Cygwin.<br />
<br />
=== Where can I get PostgreSQL? ===<br />
<br />
There are binary distributions for various operating systems and platforms; see [http://www.postgresql.org/download our download area].<br />
<br />
The source code can be obtained [http://www.postgresql.org/ftp/ via web browser] or [ftp://ftp.postgresql.org/pub/ through ftp].<br />
<br />
=== What is the most recent release? ===<br />
<br />
The latest release of PostgreSQL is shown on the front page of [http://www.postgresql.org/ our website].<br />
<br />
We typically have a major release every year, with minor releases every few months.<br />
Minor releases are usually made at the same time for all supported major-release branches.<br />
For more about major versus minor releases, see<br />
http://www.postgresql.org/support/versioning.<br />
<br />
=== Where can I get support? ===<br />
<br />
The PostgreSQL community provides assistance to many of its users via<br />
email. The main web site to subscribe to the email lists is<br />
http://www.postgresql.org/community/lists/. The general or bugs lists<br />
are a good place to start. For best results, consider reading the <br />
[[guide to reporting problems]] before you post to make sure you<br />
include enough information for people to help you.<br />
<br />
The major IRC channel is #postgresql on Freenode (irc.freenode.net).<br />
Spanish (#postgresql-es), French (#postgresqlfr), and Brazilian<br />
(#postgresql-br) channels also exist on the same network. There is also<br />
a PostgreSQL channel on EFNet.<br />
<br />
A list of support companies is available at<br />
http://www.postgresql.org/support/professional_support.<br />
<br />
=== How do I submit a bug report? ===<br />
<br />
Visit the PostgreSQL bug form at<br />
http://www.postgresql.org/support/submitbug to submit your bug<br />
report to the pgsql-bugs mailing list. Also check out our ftp<br />
site ftp://ftp.postgresql.org/pub/ to see if there is a more recent<br />
PostgreSQL version.<br />
<br />
For a prompt and helpful response, it is important for you to read the <br />
[[guide to reporting problems]] to make sure that you include the <br />
information required to fully understand and act on your report.<br />
<br />
Bugs submitted using the bug form or posted to any PostgreSQL mailing<br />
list typically generate one of the following replies:<br />
* It is not a bug, and why<br />
* It is a known bug and is already on the TODO list<br />
* The bug has been fixed in the current release<br />
* The bug has been fixed but is not packaged yet in an official release<br />
* A request is made for more detailed information:<br />
** Operating system<br />
** PostgreSQL version<br />
** Reproducible test case<br />
** Debugging information<br />
** [[Generating_a_stack_trace_of_a_PostgreSQL_backend|Debugger backtrace output]]<br />
* The bug is new. The following might happen:<br />
** A patch is created and will be included in the next major or minor release<br />
** The bug cannot be fixed immediately and is added to the TODO list<br />
<br />
=== How do I find out about known bugs or missing features? ===<br />
<br />
PostgreSQL supports an extended subset of SQL:2008. See our [[Todo|TODO list]]<br />
for known bugs, missing features, and future plans.<br />
<br />
A feature request usually results in one of the following replies:<br />
* The feature is already on the TODO list<br />
* The feature is not desired because:<br />
** It duplicates existing functionality that already follows the SQL standard<br />
** The feature would increase code complexity but add little benefit<br />
** The feature would be insecure or unreliable<br />
* The new feature is added to the TODO list<br />
<br />
PostgreSQL does not use a bug tracking system because we find it more<br />
efficient to respond directly to email and keep the TODO list<br />
up-to-date. In practice, bugs don't last very long in the software,<br />
and bugs that affect a large number of users are fixed rapidly. The<br />
only place to find all changes, improvements, and fixes in a<br />
PostgreSQL release is to read the CVS log messages. Even the release<br />
notes do not list every change made to the software.<br />
<br />
=== A bug I'm encountering is fixed in a newer minor release of PostgreSQL, but I don't want to upgrade. Can I get a patch for just this issue? ===<br />
<br />
No. Nobody will make a custom patch for you so you can (say) extract a fix from 8.4.3 and apply it to 8.4.1.<br />
That's because there should never be any need to do that.<br />
<br />
PostgreSQL has a strict policy that only bug fixes are back-patched into point releases, as per the [http://www.postgresql.org/support/versioning version policy]. It is safe to upgrade from 8.4.1 to 8.4.3,<br />
for example. Binary compatibility will be maintained, no dump and reload is required, nothing will break, but bugs that might <br />
cause problems have been fixed. Even if you are not yet encountering a particular bug, you might later, and it is wise to upgrade promptly.<br />
You just have to install the update and re-start the database server, nothing more.<br />
<br />
Upgrading from 8.3 to 8.4, or 8.4 to 9.0, is a major upgrade that does not come with the same guarantees. However, if a bug<br />
is discovered in 9.0 then it will generally be fixed in all maintained older versions like 8.4 and 8.3 if it is safe and<br />
practical to do so.<br />
<br />
This means that if you're running 8.1.0, upgrading to 8.1.21 is <b>strongly</b> recommended and very safe. On the other hand,<br />
upgrading to the next major release, 8.2.x, may require changes to your app, and will certainly require a dump and reload.<br />
<br />
If you want to be careful about all upgrades, you should read the [http://www.postgresql.org/docs/current/static/release.html release notes] <br />
for each point release between your current one and the latest minor version of the same major release carefully. If you're<br />
exceptionally paranoid about upgrades, you can fetch the source code to each set of point release changes from [http://git.postgresql.org/ PostgreSQL's git repository] and examine it.<br />
<br />
It is strongly recommended that you <b>always</b> upgrade to the latest minor release. Avoid trying to extract and apply individual fixes<br />
from point releases; by doing so you're bypassing all the QA done by the PostgreSQL team when they prepare a release, and are creating your<br />
own custom version that <i>nobody else has ever used</i>. It's a lot safer to just update to the latest tested, safe release. <i>Patching your own custom, non-standard build will also take more time/effort, and will require the same amount of downtime as a normal upgrade.</i><br />
<br />
=== I have a program that says it wants PostgreSQL x.y.1. Can I use PostgreSQL x.y.2 instead? ===<br />
<br />
Any program that works with a particular version, like 8.4.1, should work with any other minor version in the same major version. That means that if a program says it wants (eg) 8.4.1, you can and should install the latest in the 8.4 series instead.<br />
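As a toy illustration of this rule (the helper name is made up; it follows the three-part numbering scheme used here, where the first two components form the major version):<br />

```python
def same_major(a: str, b: str) -> bool:
    """True if two 'x.y.z'-style PostgreSQL versions share a major version.

    In this scheme, 'x.y' is the major version and 'z' is the minor release.
    """
    return a.split(".")[:2] == b.split(".")[:2]

print(same_major("8.4.1", "8.4.3"))  # True: minor update, binary-compatible
print(same_major("8.4.3", "9.0.0"))  # False: major upgrade, dump/reload may be needed
```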
<br />
See the previous question for more details.<br />
<br />
=== What documentation is available? ===<br />
<br />
PostgreSQL includes extensive documentation, including a large manual,<br />
manual pages, and some test examples. See the /doc directory. You can<br />
also browse the manuals online at http://www.postgresql.org/docs.<br />
<br />
There are a number of PostgreSQL<br />
books available for purchase; two of them are also available online. A list of books can be found at<br />
http://www.postgresql.org/docs/books/. One of the most popular ones is the one by Korry & Susan<br />
Douglas.<br />
<br />
There is also a collection of<br />
PostgreSQL technical articles on the <br />
[[Community_Generated_Articles%2C_Guides%2C_and_Documentation | wiki]].<br />
<br />
The command line client program psql has some \d commands to show<br />
information about types, operators, functions, aggregates, etc. - use<br />
\? to display the available commands.<br />
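For example, in a psql session (the table name is made up):<br />

```
\?            list the available backslash commands
\d mytable    describe the table "mytable"
\df           list functions
\du           list roles
```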
<br />
=== How can I learn SQL? ===<br />
<br />
First, consider the PostgreSQL-specific books mentioned above. Many of<br />
our users also like The Practical SQL Handbook, Bowman, Judith S., et<br />
al., Addison-Wesley. Others like The Complete Reference SQL, Groff et<br />
al., McGraw-Hill.<br />
<br />
Many people consider the PostgreSQL documentation to be an excellent guide<br />
for learning SQL itself, as well as for PostgreSQL's implementation of it.<br />
For best results use PostgreSQL alongside another full-featured SQL database as<br />
you learn, so you get used to SQL without becoming reliant on PostgreSQL-specific<br />
features. The PostgreSQL documentation generally mentions when features are PostgreSQL<br />
extensions of the standard.<br />
<br />
There are also many nice tutorials available online:<br />
* http://www.intermedia.net/support/sql/sqltut.shtm<br />
* http://sqlcourse.com<br />
* http://www.w3schools.com/sql/default.asp<br />
* http://mysite.verizon.net/Graeme_Birchall/id1.html<br />
* http://sqlzoo.net<br />
<br />
=== How do I submit a patch or join the development team? ===<br />
<br />
See the [[Developer FAQ|Developer's FAQ]].<br />
<br />
=== How does PostgreSQL compare to other DBMSs? ===<br />
<br />
There are several ways of measuring software: features, performance,<br />
reliability, support, and price.<br />
<br />
==== Features ====<br />
<br />
PostgreSQL has most features present in large proprietary DBMSs,<br />
like transactions, subselects, triggers, views, foreign key<br />
referential integrity, and sophisticated locking. We have some<br />
features they do not have, like user-defined types,<br />
inheritance, rules, and multi-version concurrency control to<br />
reduce lock contention.<br />
<br />
==== Performance ====<br />
<br />
PostgreSQL's performance is comparable to other proprietary and<br />
open source databases. It is faster for some things, slower for<br />
others. Our performance is usually +/-10% compared to other databases.<br />
<br />
==== Reliability ====<br />
<br />
We realize that a DBMS must be reliable, or it is worthless. We<br />
strive to release well-tested, stable code that has a minimum<br />
of bugs. Each release has at least one month of beta testing,<br />
and our release history shows that we can provide stable, solid<br />
releases that are ready for production use. We believe we<br />
compare favorably to other database software in this area.<br />
<br />
==== Support ====<br />
<br />
Our mailing lists provide contact with a large group of<br />
developers and users to help resolve any problems encountered.<br />
While we cannot guarantee a fix, proprietary DBMSs do not always<br />
supply a fix either. Direct access to developers, the user<br />
community, manuals, and the source code often make PostgreSQL<br />
support superior to other DBMSs. There is commercial<br />
per-incident support available for those who need it. (See [[#Where_can_I_get_support.3F | section 1.7]]).<br />
<br />
==== Price ====<br />
<br />
We are free for all use, both proprietary and open source.<br />
You can add our code to your product with no limitations,<br />
except those outlined in our BSD-style license stated above. <br />
<br />
=== Can PostgreSQL be embedded? ===<br />
<br />
PostgreSQL is designed as a client/server architecture, which requires<br />
separate processes for each client and server, and various helper<br />
processes. Many embedded architectures can support such requirements.<br />
However, if your embedded architecture requires the database server to<br />
run inside the application process, you cannot use Postgres and should<br />
select a lighter-weight database solution.<br />
<br />
Popular embeddable options include [http://sqlite.org SQLite] and [http://firebirdsql.org Firebird SQL].<br />
<br />
=== How do I unsubscribe from the PostgreSQL email lists? How do I avoid receiving duplicate emails? ===<br />
<br />
The PostgreSQL Majordomo page allows subscribing or unsubscribing from<br />
any of the PostgreSQL email lists. (You might need to have your<br />
Majordomo password emailed to you to log in.)<br />
<br />
All PostgreSQL email lists are configured so a group reply goes to the<br />
email list and the original email author. This is done so users<br />
receive the quickest possible email replies. If you would prefer not<br />
to receive duplicate email from the list in cases where you already<br />
receive an email directly, check eliminatecc from the Majordomo Change<br />
Settings page. You can also prevent yourself from receiving copies of<br />
emails you post to the lists by unchecking selfcopy.<br />
<br />
== User Client Questions ==<br />
<br />
=== What interfaces are available for PostgreSQL? ===<br />
<br />
The PostgreSQL install includes only the C and embedded C interfaces.<br />
All other interfaces are independent projects that are downloaded<br />
separately; being separate allows them to have their own release<br />
schedule and development teams.<br />
<br />
Some programming languages like PHP include an interface to<br />
PostgreSQL. Interfaces for languages like Perl, TCL, Python, and many<br />
others are available at http://pgfoundry.org.<br />
<br />
=== What tools are available for using PostgreSQL with Web pages? ===<br />
<br />
A nice introduction to Database-backed Web pages can be seen at:<br />
http://www.webreview.com<br />
<br />
For Web integration, PHP (http://www.php.net) is an excellent<br />
interface.<br />
<br />
For complex cases, many use the Perl and DBD::Pg with CGI.pm or<br />
mod_perl.<br />
<br />
=== Does PostgreSQL have a graphical user interface? ===<br />
<br />
There are a large number of GUI Tools that are available for<br />
PostgreSQL from both proprietary and open source developers. A detailed<br />
list can be found in the [[Community Guide to PostgreSQL GUI Tools]].<br />
<br />
== Administrative Questions ==<br />
<br />
=== How do I install PostgreSQL somewhere other than /usr/local/pgsql? ===<br />
<br />
Specify the --prefix option when running configure.<br />
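For example, building from source with an illustrative install path:<br />

```shell
./configure --prefix=/opt/pgsql
make
make install
```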
<br />
=== I'm installing PostgreSQL and don't know the password for the postgres user ===<br />
<br />
Dave Page wrote a [http://pgsnake.blogspot.com/2010/07/postgresql-passwords-and-installers.html blog post] explaining what the different passwords are used for, and how to overcome common problems such as resetting them.<br />
<br />
=== How do I control connections from other hosts? ===<br />
<br />
By default, PostgreSQL only allows connections from the local machine<br />
using Unix domain sockets or TCP/IP connections. Other machines will<br />
not be able to connect unless you modify listen_addresses in the<br />
postgresql.conf file, enable host-based authentication by modifying<br />
the $PGDATA/pg_hba.conf file, and restart the database server.<br />
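For example (the address range and authentication method below are illustrative; pick ones appropriate for your network):<br />

```
# postgresql.conf: listen on all network interfaces
listen_addresses = '*'

# pg_hba.conf: allow md5-authenticated TCP connections from one subnet
host    all    all    192.168.1.0/24    md5
```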
<br />
=== How do I tune the database engine for better performance? ===<br />
<br />
There are three major areas for potential performance improvement:<br />
<br />
==== Query Changes ====<br />
This involves modifying queries to obtain better performance:<br />
<br />
* Creation of indexes, including expression and partial indexes<br />
* Use of COPY instead of multiple INSERTs<br />
* Grouping of multiple statements into a single transaction to reduce commit overhead<br />
* Use of CLUSTER when retrieving many rows from an index<br />
* Use of LIMIT for returning a subset of a query's output<br />
* Use of Prepared queries<br />
* Use of ANALYZE to maintain accurate optimizer statistics<br />
* Regular use of VACUUM or pg_autovacuum<br />
* Dropping of indexes during large data changes<br />
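For instance, a bulk load combining several of these tips (the table and file names are made up):<br />

```sql
BEGIN;
-- one COPY instead of thousands of single-row INSERTs,
-- all inside a single transaction
COPY sales FROM '/tmp/sales.csv' WITH CSV;
COMMIT;
-- refresh optimizer statistics after the large data change
ANALYZE sales;
```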
<br />
==== Server Configuration ====<br />
A number of postgresql.conf settings affect performance. For<br />
more details, see Administration Guide/Server Run-time<br />
Environment/Run-time Configuration.<br />
<br />
==== Hardware Selection ====<br />
The effect of hardware on performance is detailed in<br />
http://www.powerpostgresql.com/PerfList/ and<br />
http://momjian.us/main/writings/pgsql/hw_performance/index.html.<br />
<br />
=== What debugging features are available? ===<br />
<br />
There are many log_* server configuration variables at http://www.postgresql.org/docs/current/interactive/runtime-config-logging.html that enable printing of query and process statistics which can be very useful for debugging and performance measurements.<br />
<br />
=== Why do I get "Sorry, too many clients" when trying to connect? ===<br />
<br />
You have reached the default limit of 100 database sessions. You need<br />
to increase the server's limit on how many concurrent backend<br />
processes it can start by changing the max_connections value in<br />
postgresql.conf and restarting the server.<br />
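For example, to double the default limit (each connection consumes server memory, so raise this only as far as your workload requires):<br />

```
# postgresql.conf -- takes effect only after a server restart
max_connections = 200
```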
<br />
=== What is the upgrade process for PostgreSQL? ===<br />
<br />
See http://www.postgresql.org/support/versioning for a general<br />
discussion about upgrading, and<br />
http://www.postgresql.org/docs/current/static/install-upgrading.html<br />
for specific instructions.<br />
<br />
=== Will PostgreSQL handle recent daylight saving time changes in various countries? ===<br />
<br />
PostgreSQL releases 8.0 and up depend on the widely-used tzdata database<br />
(also called the zoneinfo database or the [http://www.twinsun.com/tz/tz-link.htm Olson timezone database]) for<br />
daylight savings information. To deal with a DST law change that affects you,<br />
install a new tzdata file set and restart the server.<br />
<br />
All PostgreSQL update releases include the latest available tzdata files,<br />
so keeping up-to-date on minor releases for your major version is usually<br />
sufficient for this.<br />
<br />
On platforms that receive regular software updates including new tzdata files,<br />
it may be more convenient to rely on the system's copy of the tzdata files.<br />
This is possible as a compile-time option. Most Linux distributions choose<br />
this approach for their pre-built versions of PostgreSQL.<br />
<br />
PostgreSQL releases before 8.0 always rely on the operating system's timezone<br />
information.<br />
<br />
=== What computer hardware should I use? ===<br />
<br />
Because PC hardware is mostly compatible, people tend to believe that<br />
all PC hardware is of equal quality. It is not. ECC RAM, SCSI, and<br />
quality motherboards are more reliable and have better performance<br />
than less expensive hardware. PostgreSQL will run on almost any<br />
hardware, but if reliability and performance are important it is wise<br />
to research your hardware options thoroughly. <br />
<br />
Database servers, unlike many other applications, are usually I/O and memory constrained, so it is wise to focus on the I/O subsystem first, then memory capacity, and lastly consider CPU issues. For example, a disk controller with a<br />
battery-backed cache is often the cheapest and easiest way to improve database performance. Our email lists can be used to discuss hardware options and tradeoffs.<br />
<br />
=== How does PostgreSQL use CPU resources? ===<br />
<br />
The PostgreSQL server is process-based (not threaded), and uses one operating system process per database session. A single database session (connection) cannot utilize more than one CPU. Of course, multiple sessions are automatically spread across all available CPUs by your operating system. Client applications can easily use threads and create multiple database connections from each thread.<br />
<br />
A single complex and CPU-intensive query is unable to use more than one CPU to do the processing for the query. The OS may still be able to use others for disk I/O etc, but you won't see much benefit from more than one spare core.<br />
<br />
=== Why does PostgreSQL have so many processes, even when idle? ===<br />
<br />
As noted in [[FAQ#How does PostgreSQL use CPU resources?|the answer above]], PostgreSQL is process based, so it starts one <code>postgres</code> (or <code>postgres.exe</code> on Windows) instance per connection. The postmaster (which accepts connections and starts new postgres instances for them) is always running. In addition, PostgreSQL generally has one or more "helper" processes like the stats collector, background writer, autovacuum daemon, walsender, etc, all of which show up as "postgres" instances in most system monitoring tools.<br />
<br />
Despite the number of processes, they actually use very little in the way of real resources. See [[FAQ#Why does PostgreSQL show up as using so much memory in my system monitoring tool?|the next answer]].<br />
<br />
=== Why does PostgreSQL use so much memory? ===<br />
<br />
Despite appearances, this is absolutely normal, and there's actually nowhere near as much memory being used as tools like <code>top</code> or the Windows process monitor say PostgreSQL is using.<br />
<br />
Tools like <code>top</code> and the Windows process monitor may show many <code>postgres</code> instances (see above), each of which appears to use a huge amount of memory. Often, when added up, the amount the postgres instances use is many times the amount of memory actually installed in the computer!<br />
<br />
This is a consequence of how these tools report memory use. They generally don't understand shared memory very well, and show it as if it was memory used individually and exclusively by each postgres instance. PostgreSQL uses a big chunk of shared memory to communicate between its backends and cache data. Because these tools count that shared memory block once per <code>postgres</code> instance instead of counting it once for <i>all</i> <code>postgres</code> instances, they massively over-estimate how much memory PostgreSQL is using.<br />
<br />
Furthermore, many versions of these tools don't report the entire shared memory block as being used by an individual instance immediately when it starts, but rather count the number of shared pages it has touched since starting. Over the lifetime of an instance, it will inevitably touch more and more of the shared memory until it has touched every page, so that its reported usage will gradually rise to include the entire shared memory block. This is frequently misinterpreted to be a memory leak; but it is no such thing, only a reporting artifact.<br />
<br />
== Operational Questions ==<br />
<br />
=== How do I SELECT only the first few rows of a query? A random row? ===<br />
<br />
To retrieve only a few rows, if you know the number of rows needed<br />
at the time of the SELECT, use LIMIT. If an index matches the ORDER BY,<br />
it is possible that the entire query does not have to be executed. If you<br />
don't know the number of rows at SELECT time, use a cursor and FETCH.<br />
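<br />
As a sketch, with hypothetical table and column names:<br />
 SELECT col FROM tab ORDER BY col LIMIT 10;<br />
<br />
 BEGIN;<br />
 DECLARE cur CURSOR FOR SELECT col FROM tab ORDER BY col;<br />
 FETCH 10 FROM cur;<br />
 CLOSE cur;<br />
 COMMIT;<br />
Note that a cursor must be used inside a transaction block.<br />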
<br />
To SELECT a random row, use:<br />
SELECT col<br />
FROM tab<br />
ORDER BY random()<br />
LIMIT 1;<br />
<br />
See also this [http://blog.rhodiumtoad.org.uk/2009/03/08/selecting-random-rows-from-a-table/ blog entry by Andrew Gierth]<br />
that has more information on this topic.<br />
<br />
=== How do I find out what tables, indexes, databases, and users are defined? How do I see the queries used by psql to display them? ===<br />
<br />
Use the \dt command to see tables in psql. For a complete list of<br />
commands inside psql you can use \?. Alternatively, you can read the<br />
source code for psql in the file pgsql/src/bin/psql/describe.c, which<br />
contains the SQL commands that generate the output for psql's backslash<br />
commands. You can also start psql with the -E option so that it prints<br />
out the queries it uses to execute the commands you give it. PostgreSQL<br />
also provides an SQL-compliant INFORMATION_SCHEMA interface you can<br />
query to get information about the database.<br />
<br />
There are also system catalogs, with names beginning with pg_, that<br />
describe these.<br />
<br />
Running psql -l will list all databases.<br />
<br />
Also try the file pgsql/src/tutorial/syscat.source. It illustrates<br />
many of the SELECTs needed to get information from the database system<br />
tables.<br />
<br />
=== How do you change a column's data type? ===<br />
<br />
Changing the data type of a column can be done easily in 8.0 and later<br />
with ALTER TABLE ALTER COLUMN TYPE.<br />
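<br />
For example, assuming a hypothetical table tab with a text column col holding numeric strings:<br />
 ALTER TABLE tab ALTER COLUMN col TYPE INTEGER USING col::INTEGER;<br />
The USING clause is only needed when there is no implicit cast from the old type to the new one.<br />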
<br />
In earlier releases, do this:<br />
BEGIN;<br />
ALTER TABLE tab ADD COLUMN new_col new_data_type;<br />
UPDATE tab SET new_col = CAST(old_col AS new_data_type);<br />
ALTER TABLE tab DROP COLUMN old_col;<br />
COMMIT;<br />
<br />
You might then want to do VACUUM FULL tab to reclaim the disk space<br />
used by the expired rows.<br />
<br />
=== What is the maximum size for a row, a table, and a database? ===<br />
<br />
These are the limits:<br />
<br />
Maximum size for a database? unlimited (32 TB databases exist)<br />
Maximum size for a table? 32 TB<br />
Maximum size for a row? 400 GB<br />
Maximum size for a field? 1 GB<br />
Maximum number of rows in a table? unlimited<br />
Maximum number of columns in a table? 250-1600 depending on column types<br />
Maximum number of indexes on a table? unlimited<br />
<br />
Of course, these are not actually unlimited, but limited to available<br />
disk space and memory/swap space. Performance may suffer when these<br />
values get unusually large.<br />
<br />
The maximum table size of 32 TB does not require large file support<br />
from the operating system. Large tables are stored as multiple 1 GB<br />
files so file system size limits are not important.<br />
<br />
The maximum table size, row size, and maximum number of columns can be<br />
quadrupled by increasing the default block size to 32k. The maximum<br />
table size can also be increased using table partitioning.<br />
<br />
One limitation is that indexes cannot be created on columns longer<br />
than about 2,000 characters. Fortunately, such indexes are rarely<br />
needed. Uniqueness is best guaranteed by a functional index on an MD5<br />
hash of the long column, and full text indexing allows for searching<br />
of words within the column.<br />
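<br />
A sketch of such a setup, with hypothetical names:<br />
 CREATE UNIQUE INDEX tab_longcol_md5 ON tab (md5(longcol));<br />
Equality searches can then use the same expression so the index is applicable:<br />
 SELECT * FROM tab WHERE md5(longcol) = md5('some long value');<br />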
<br />
=== How much database disk space is required to store data from a typical text file? ===<br />
<br />
A PostgreSQL database may require up to five times the disk space to<br />
store data from a text file.<br />
<br />
As an example, consider a file of 100,000 lines with an integer and<br />
text description on each line. Suppose the text string averages<br />
twenty bytes in length. The flat file would be 2.8 MB. The size of the<br />
PostgreSQL database file containing this data can be estimated as 5.2<br />
MB:<br />
24 bytes: each row header (approximate)<br />
24 bytes: one int field and one text field<br />
+ 4 bytes: pointer on page to tuple<br />
----------------------------------------<br />
52 bytes per row<br />
<br />
The data page size in PostgreSQL is 8192 bytes (8 KB), so:<br />
<br />
 8192 bytes per page<br />
 ------------------- = 157 rows per database page (rounded down)<br />
  52 bytes per row<br />
<br />
 100000 data rows<br />
 ------------------ = 637 database pages (rounded up)<br />
 157 rows per page<br />
<br />
 637 database pages * 8192 bytes per page = 5,218,304 bytes (5.2 MB)<br />
<br />
Indexes do not require as much overhead, but do contain the data that<br />
is being indexed, so they can be large also.<br />
<br />
NULLs are stored as bitmaps, so they use very little space.<br />
<br />
Note that long values may be compressed transparently.<br />
<br />
See also this presentation on the topic: [[Image:How_Long_Is_a_String.pdf]].<br />
<br />
=== Why are my queries slow? Why don't they use my indexes? ===<br />
<br />
Indexes are not used by every query. Indexes are used only if the<br />
table is larger than a minimum size, and the query selects only a<br />
small percentage of the rows in the table. This is because the random<br />
disk access caused by an index scan can be slower than a straight read<br />
through the table, or sequential scan.<br />
<br />
To determine if an index should be used, PostgreSQL must have<br />
statistics about the table. These statistics are collected using<br />
VACUUM ANALYZE, or simply ANALYZE. Using statistics, the optimizer<br />
knows how many rows are in the table, and can better determine if<br />
indexes should be used. Statistics are also valuable in determining<br />
optimal join order and join methods. Statistics collection should be<br />
performed periodically as the contents of the table change.<br />
<br />
Indexes are normally not used for ORDER BY or to perform joins. A<br />
sequential scan followed by an explicit sort is usually faster than an<br />
index scan of a large table. However, LIMIT combined with ORDER BY<br />
often will use an index because only a small portion of the table is<br />
returned.<br />
<br />
If you believe the optimizer is incorrect in choosing a sequential<br />
scan, use SET enable_seqscan TO 'off' and run the query again to see if an<br />
index scan is indeed faster.<br />
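<br />
For example, in a psql session (table and column names are hypothetical):<br />
 SET enable_seqscan TO 'off';<br />
 EXPLAIN ANALYZE SELECT * FROM tab WHERE col = 42;<br />
 SET enable_seqscan TO 'on';<br />
The setting only affects the current session.<br />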
<br />
When using wild-card operators such as LIKE or ~, indexes can only be<br />
used in certain circumstances:<br />
<br />
* The beginning of the search string must be anchored to the start of the string, i.e.<br />
** LIKE patterns must not start with % or _.<br />
** ~ (regular expression) patterns must start with ^.<br />
<br />
* The search string cannot start with a character class, e.g. [a-e].<br />
<br />
* Case-insensitive searches such as ILIKE and ~* do not utilize indexes. Instead, use expression indexes, which are described in [[#How_do_I_perform_regular_expression_searches_and_case-insensitive_regular_expression_searches.3F_How_do_I_use_an_index_for_case-insensitive_searches.3F | section 4.8]].<br />
<br />
* C locale must be used during initdb because sorting in a non-C locale often doesn't match the behavior of LIKE. You can create a special text_pattern_ops index that will work in such cases, but note it is only helpful for LIKE indexing.<br />
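<br />
For example, a text_pattern_ops index usable by anchored LIKE searches (names are hypothetical):<br />
 CREATE INDEX tab_col_pattern ON tab (col text_pattern_ops);<br />
 SELECT * FROM tab WHERE col LIKE 'abc%';<br />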
<br />
It is also possible to use full text indexing for word searches.<br />
<br />
The [[SlowQueryQuestions]] article contains some more tips and guidance.<br />
<br />
=== How do I see how the query optimizer is evaluating my query? ===<br />
<br />
This is done with the EXPLAIN command; see [[Using EXPLAIN]].<br />
<br />
=== How do I change the sort ordering of textual data? ===<br />
<br />
PostgreSQL sorts textual data according to the ordering that is defined by the current locale, which is selected during initdb.<br />
(In 8.4 and up it will be possible to select a different locale when creating a new database.)<br />
If you don't like the ordering then you need to use a different locale.<br />
In particular, most locales other than "C" sort according to dictionary order, which largely ignores punctuation and spacing.<br />
If that's not what you want then you need "C" locale.<br />
<br />
=== How do I perform regular expression searches and case-insensitive regular expression searches? How do I use an index for case-insensitive searches? ===<br />
<br />
The ~ operator does regular expression matching, and ~* does<br />
case-insensitive regular expression matching. The case-insensitive<br />
variant of LIKE is called ILIKE.<br />
<br />
Case-insensitive equality comparisons are normally expressed as:<br />
SELECT *<br />
FROM tab<br />
WHERE lower(col) = 'abc';<br />
<br />
This will not use a standard index on "col". However, if you create an<br />
expression index on "lower(col)", it will be used:<br />
CREATE INDEX tabindex ON tab (lower(col));<br />
<br />
If the above index is created as UNIQUE, then the column can store<br />
upper and lowercase characters, but it cannot contain identical values that<br />
differ only in case. To force a particular case to be stored in the<br />
column, use a CHECK constraint or a trigger.<br />
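<br />
For example, a CHECK constraint forcing lowercase storage (names are hypothetical):<br />
 ALTER TABLE tab ADD CONSTRAINT col_is_lowercase CHECK (col = lower(col));<br />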
<br />
In PostgreSQL 8.4 and later, you can also use the contributed [http://www.postgresql.org/docs/current/static/citext.html CITEXT data type], which internally implements the "lower()" calls, so that you can effectively treat it as a fully case-insensitive data type. CITEXT is also [https://svn.kineticode.com/citext/trunk/ available for 8.3], and an earlier version that treats only ASCII characters case-insensitively on 8.2 and earlier is available on [http://pgfoundry.org/projects/citext/ pgFoundry].<br />
<br />
=== In a query, how do I detect if a field is NULL? How do I concatenate possible NULLs? How can I sort on whether a field is NULL or not? ===<br />
<br />
You can test the value with IS NULL or IS NOT NULL, like this:<br />
SELECT *<br />
FROM tab<br />
WHERE col IS NULL;<br />
<br />
Concatenating a NULL with something else produces another NULL.<br />
If that's not what you want, you can replace the NULL(s) using<br />
COALESCE(), like this:<br />
<nowiki>SELECT COALESCE(col1, '') || COALESCE(col2, '')</nowiki><br />
FROM tab;<br />
<br />
To sort by the NULL status, use an IS NULL or IS NOT NULL test<br />
in your ORDER BY clause. Things that are true will sort higher than<br />
things that are false, so the following will put NULL entries at the<br />
front of the output:<br />
SELECT *<br />
FROM tab<br />
ORDER BY (col IS NOT NULL), col;<br />
<br />
In PostgreSQL 8.3 and up, you can also control sort ordering of NULLs<br />
using the recently-standardized NULLS FIRST/NULLS LAST modifiers,<br />
like this:<br />
SELECT *<br />
FROM tab<br />
ORDER BY col NULLS FIRST;<br />
<br />
=== What is the difference between the various character types? ===<br />
<br />
{| border="1"<br />
!Type<br />
!Internal Name<br />
!Notes<br />
|-<br />
|VARCHAR(n) <br />
|varchar<br />
|size specifies maximum length, no padding<br />
|- <br />
|CHAR(n)<br />
|bpchar<br />
|blank-padded to the specified fixed length<br />
|-<br />
|TEXT<br />
|text<br />
|no specific upper limit on length<br />
|-<br />
|BYTEA<br />
|bytea<br />
|variable-length byte array (null-byte safe)<br />
|-<br />
|"char" (with the quotes)<br />
|char<br />
|one byte<br />
|}<br />
<br />
You will see the internal name when examining system catalogs and in<br />
some error messages.<br />
<br />
The first four types above are "varlena" types (i.e., the field length<br />
is explicitly stored on disk, followed by the data). Thus the actual<br />
space used is slightly greater than the expected size. However, long<br />
values are also subject to compression, so the space on disk might<br />
also be less than expected.<br />
<br />
VARCHAR(n) is best when storing variable-length strings if a specific<br />
upper limit on the string length is required by the application.<br />
TEXT is for strings of "unlimited" length (though all fields in PostgreSQL<br />
are subject to a maximum value length of one gigabyte).<br />
<br />
CHAR(n) is for storing strings that are all the same length. CHAR(n)<br />
pads with blanks to the specified length, while VARCHAR(n) only stores<br />
the characters supplied. BYTEA is for storing binary data,<br />
particularly values that include zero bytes. All these types have similar<br />
performance characteristics, except that the blank-padding involved<br />
in CHAR(n) requires additional storage and some extra runtime.<br />
<br />
The "char" type (the quotes are required to distinguish it from CHAR(n))<br />
is a specialized datatype that can store exactly one byte. It is found in<br />
the system catalogs but its use in user tables is generally discouraged.<br />
<br />
=== How do I create a serial/auto-incrementing field? ===<br />
<br />
PostgreSQL supports a SERIAL data type. Actually, this isn't quite<br />
a real type. It's a shorthand for creating an integer column that<br />
is fed from a sequence.<br />
<br />
For example, this:<br />
CREATE TABLE person (<br />
id SERIAL,<br />
name TEXT<br />
);<br />
is automatically translated into this:<br />
CREATE SEQUENCE person_id_seq;<br />
CREATE TABLE person (<br />
id INTEGER NOT NULL DEFAULT nextval('person_id_seq'),<br />
name TEXT<br />
);<br />
<br />
The automatically created sequence is named ''table''_''serialcolumn''_seq,<br />
where ''table'' and ''serialcolumn'' are the names of the table and SERIAL<br />
column, respectively. See the CREATE SEQUENCE manual page for more<br />
information about sequences.<br />
<br />
There is also BIGSERIAL, which is like SERIAL except that the resulting<br />
column is of type BIGINT instead of INTEGER. Use this type if you think<br />
that you might need more than 2 billion serial values over the lifespan<br />
of the table.<br />
<br />
Note that sequences may contain "holes" or "gaps" as a normal part of operation. It is entirely normal for generated keys to go 1, 4, 5, 6, 9, ... . See [[#Why are there gaps in the numbering of my sequence/SERIAL column? Why aren't my sequence numbers reused on transaction abort?|the FAQ entry on sequence gaps]].<br />
<br />
=== How do I get the value of a SERIAL insert? ===<br />
<br />
The simplest way is to retrieve the assigned SERIAL value with<br />
RETURNING. Using the example table in the previous question, it would look like this:<br />
INSERT INTO person (name) VALUES ('Blaise Pascal') RETURNING id;<br />
<br />
You can also call nextval() and use that value in the INSERT, or call<br />
currval() after the INSERT.<br />
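<br />
Using the person table from the previous question, the currval() variant looks like this:<br />
 INSERT INTO person (name) VALUES ('Blaise Pascal');<br />
 SELECT currval('person_id_seq');<br />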
<br />
=== Doesn't currval() lead to a race condition with other users? ===<br />
<br />
No. currval() returns the latest sequence value assigned by your session,<br />
independently of what is happening in other sessions.<br />
<br />
=== Why are there gaps in the numbering of my sequence/SERIAL column? Why aren't my sequence numbers reused on transaction abort? ===<br />
<br />
To improve concurrency, sequence values are given out to running<br />
transactions on-demand; the sequence object is not kept locked but is<br />
immediately available for another transaction to get another sequence<br />
value. This causes gaps in numbering from aborted transactions, as documented in the [http://www.postgresql.org/docs/current/static/functions-sequence.html NOTE section for the nextval() function].<br />
<br />
Additionally, an unclean server shutdown will cause sequences to increment on recovery, because PostgreSQL keeps a cache of sequence numbers to hand out and in an unclean shutdown it isn't sure which of those cached numbers has already been used. Since sequences are allowed to have gaps anyway it takes the safe option and increments the sequence.<br />
<br />
Another cause of gaps in sequences is the use of the CACHE clause in [http://www.postgresql.org/docs/current/static/sql-createsequence.html CREATE SEQUENCE].<br />
<br />
In general, you should not rely on SERIAL keys or SEQUENCEs being gapless, nor should you make assumptions about their order; [http://www.postgresql.org/docs/current/static/sql-createsequence.html#AEN69802 it is ''not'' guaranteed that id n+1 was inserted after id n except when both were generated within the same transaction]. Compare synthetic keys for equality and only for equality.<br />
<br />
Gap-less sequences are possible, but are very bad for performance. At most one transaction at a time can be inserting rows from a gapless sequence. There is no built-in SERIAL or SEQUENCE equivalent for gap-less sequences, but one is [http://stackoverflow.com/a/9985219/398670 trivial to implement]. Information on gapless sequence implementations can be found in the mailing list archives, on Stack Overflow, and in [http://www.varlena.com/GeneralBits/130.php this useful article]. Avoid using a gap-less sequence unless it is an absolute business requirement. Consider dynamically generating the gap-less numbering on demand for display, using the [http://www.postgresql.org/docs/current/static/tutorial-window.html row_number() window function], or adding it in a batch process that runs periodically.<br />
<br />
See also: [http://www.neilconway.org/docs/sequences/ FAQ: Using sequences in PostgreSQL].<br />
<br />
=== What is an OID? ===<br />
<br />
If a table is created WITH OIDS, each row includes an OID column that is automatically filled in during INSERT.<br />
OIDs are sequentially assigned 4-byte integers. Initially they are unique<br />
across the entire installation. However, the OID counter wraps around at 4 billion,<br />
and after that OIDs may be duplicated.<br />
<br />
It is possible to prevent duplication of OIDs within a single table by<br />
creating a unique index on the OID column (but note that the WITH OIDS<br />
clause doesn't by itself create such an index).<br />
The system checks the index to see if a newly<br />
generated OID is already present, and if so generates a new OID and<br />
repeats. This works well so long as no OID-containing table has<br />
more than a small fraction of 4 billion rows. <br />
<br />
PostgreSQL uses OIDs for object identifiers in the system catalogs,<br />
where the size limit is unlikely to be a problem.<br />
<br />
To uniquely number rows in user tables, it is best to use SERIAL<br />
rather than an OID column, or BIGSERIAL if the table is expected to<br />
have more than 2 billion entries over its lifespan.<br />
<br />
=== What is a CTID? ===<br />
<br />
CTIDs identify specific physical rows by their block and<br />
offset positions within a table.<br />
They are used by index entries to point to physical rows.<br />
A logical row's CTID changes when it is updated, so the CTID<br />
cannot be used as a long-term row identifier. But it is sometimes<br />
useful to identify a row within a transaction when no competing<br />
update is expected.<br />
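<br />
For example, to inspect CTIDs and fetch a row by one (table name hypothetical):<br />
 SELECT ctid, * FROM tab;<br />
 SELECT * FROM tab WHERE ctid = '(0,1)';<br />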
<br />
=== Why do I get the error "ERROR: Memory exhausted in AllocSetAlloc()"? ===<br />
<br />
You probably have run out of virtual memory on your system, or your<br />
kernel has a low limit for certain resources. Try this before starting<br />
the server:<br />
ulimit -d 262144<br />
limit datasize 256m<br />
<br />
Depending on your shell, only one of these may succeed, but it will<br />
set your process data segment limit much higher and perhaps allow the<br />
query to complete. This command applies to the current process, and<br />
all subprocesses created after the command is run. If you are having a<br />
problem with the SQL client because the backend is returning too much<br />
data, try it before starting the client.<br />
<br />
=== How do I tell what PostgreSQL version I am running? ===<br />
<br />
Run this query: SELECT version();<br />
<br />
=== Is there a way to leave an audit trail of database operations? ===<br />
<br />
There's nothing built-in, but it's not too difficult to build such<br />
facilities yourself.<br />
<br />
Simple example right in the official docs:<br />
http://www.postgresql.org/docs/current/static/plpgsql-trigger.html#PLPGSQL-TRIGGER-AUDIT-EXAMPLE<br />
<br />
Project targeting this feature: http://pgfoundry.org/projects/tablelog/<br />
<br />
Background information and other sample implementations: <br />
http://it.toolbox.com/blogs/database-soup/simple-data-auditing-19014<br />
http://www.go4expert.com/forums/showthread.php?t=7252<br />
http://www.alberton.info/postgresql_table_audit.html<br />
<br />
=== How do I create a column that will default to the current time? ===<br />
<br />
Use CURRENT_TIMESTAMP:<br />
CREATE TABLE test (x int, modtime TIMESTAMP DEFAULT CURRENT_TIMESTAMP );<br />
<br />
=== How do I perform an outer join? ===<br />
<br />
PostgreSQL supports outer joins using the SQL standard syntax. Here<br />
are two examples:<br />
SELECT *<br />
FROM t1 LEFT OUTER JOIN t2 ON (t1.col = t2.col);<br />
<br />
or<br />
SELECT *<br />
FROM t1 LEFT OUTER JOIN t2 USING (col);<br />
<br />
These identical queries join t1.col to t2.col, and also return any<br />
unjoined rows in t1 (those with no match in t2). A RIGHT join would<br />
add unjoined rows of t2. A FULL join would return the matched rows<br />
plus all unjoined rows from t1 and t2. The word OUTER is optional and<br />
is assumed in LEFT, RIGHT, and FULL joins. Ordinary joins are called<br />
INNER joins.<br />
<br />
=== How do I perform queries using multiple databases? ===<br />
<br />
There is no way to query a database other than the current one.<br />
Because PostgreSQL loads database-specific system catalogs, it is<br />
uncertain how a cross-database query should even behave.<br />
<br />
contrib/dblink allows cross-database queries using function calls. Of<br />
course, a client can also make simultaneous connections to different<br />
databases and merge the results on the client side.<br />
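<br />
A sketch of a dblink call (the connection string, remote table, and column list are hypothetical):<br />
 SELECT * FROM dblink('dbname=otherdb', 'SELECT id, name FROM remote_tab')<br />
  AS t(id integer, name text);<br />
Because dblink returns a set of records, the AS clause must describe the result columns.<br />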
<br />
=== How do I return multiple rows or columns from a function? ===<br />
<br />
It is easy using set-returning functions,<br />
[[Return more than one row of data from PL/pgSQL functions]].<br />
<br />
=== Why do I get "relation with OID ##### does not exist" errors when accessing temporary tables in PL/PgSQL functions? ===<br />
<br />
In PostgreSQL versions < 8.3, PL/PgSQL caches function scripts, and an<br />
unfortunate side effect is that if a PL/PgSQL function accesses a<br />
temporary table, and that table is later dropped and recreated, and<br />
the function called again, the function will fail because the cached<br />
function contents still point to the old temporary table. The solution<br />
is to use EXECUTE for temporary table access in PL/PgSQL. This will<br />
cause the query to be reparsed every time.<br />
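<br />
Inside a pre-8.3 PL/pgSQL function, that might look like this (the temporary table name is hypothetical):<br />
 FOR r IN EXECUTE 'SELECT * FROM temp_tab' LOOP<br />
  -- process r here<br />
 END LOOP;<br />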
<br />
This problem does not occur in PostgreSQL 8.3 and later.<br />
<br />
=== What replication solutions are available? ===<br />
<br />
Though "replication" is a single term, there are several technologies<br />
for doing replication, with advantages and disadvantages for each.<br />
Our documentation contains a good introduction to this topic at<br />
http://www.postgresql.org/docs/current/static/high-availability.html and a<br />
grid listing replication software and features is at<br />
[[Replication, Clustering, and Connection Pooling]].<br />
<br />
Master/slave replication allows a single master to receive read/write<br />
queries, while slaves can only accept read/SELECT queries. The most<br />
popular freely available master-slave PostgreSQL replication solution<br />
is Slony-I.<br />
<br />
Multi-master replication allows read/write queries to be sent to<br />
multiple replicated computers. This capability also has a severe<br />
impact on performance due to the need to synchronize changes between<br />
servers. PGCluster is the most popular such solution freely available<br />
for PostgreSQL.<br />
<br />
There are also proprietary and hardware-based replication solutions<br />
available supporting a variety of replication models.<br />
<br />
=== Is it possible to create a shared-storage PostgreSQL server cluster? ===<br />
<br />
PostgreSQL does not support clustering using [[Shared_Storage|shared storage]] on a SAN, SCSI backplane,<br />
iSCSI volume, or other shared media. Such "RAC-style" clustering isn't supported.<br />
Only replication-based clustering is currently supported.<br />
<br />
See [[Replication, Clustering, and Connection Pooling]] for details.<br />
<br />
[[Shared_Storage|Shared-storage]] 'failover' is possible, but it is not safe to have more than one<br />
postmaster running and accessing the data store at the same time. Heartbeat and<br />
[http://en.wikipedia.org/wiki/STONITH STONITH] or some other hard-disconnect option are recommended.<br />
<br />
=== Why are my table and column names not recognized in my query? Why is capitalization not preserved? ===<br />
<br />
The most common cause of unrecognized names is the use of<br />
double-quotes around table or column names during table creation. When<br />
double-quotes are used, table and column names (called identifiers)<br />
are stored case-sensitive, meaning you must use double-quotes when<br />
referencing the names in a query. Some interfaces, like pgAdmin,<br />
automatically double-quote identifiers during table creation. So, for<br />
identifiers to be recognized, you must either:<br />
<br />
* Avoid double-quoting identifiers when creating tables<br />
* Use only lowercase characters in identifiers<br />
* Double-quote identifiers when referencing them in queries<br />
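<br />
For example:<br />
 CREATE TABLE "Foo" ("Bar" INTEGER);<br />
 SELECT bar FROM foo;     -- fails; the identifiers were stored case-sensitively<br />
 SELECT "Bar" FROM "Foo"; -- works<br />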
<br />
=== I lost the database password. What can I do to recover it? ===<br />
<br />
You can't. However, you can reset it to something else. To do this:<br />
<br />
* Edit pg_hba.conf to allow ''trust'' authorization temporarily<br />
* Reload the config file (pg_ctl reload)<br />
* Connect and issue ALTER ROLE / PASSWORD to set the new password<br />
* Edit pg_hba.conf again and restore the previous settings<br />
* Reload the config file again<br />
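<br />
The password-change step itself looks like this (the role name and new password are placeholders):<br />
 ALTER ROLE postgres PASSWORD 'new_password';<br />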
<br />
=== Does PostgreSQL have stored procedures? ===<br />
<br />
PostgreSQL doesn't. However, PostgreSQL has very powerful user-defined function capabilities that can do most things that stored routines (procedures and functions) in other RDBMSes can do, and in many cases more.<br />
<br />
These functions can be of different types and can be implemented in several programming languages.<br />
(Refer to the documentation for more details: [http://www.postgresql.org/docs/current/static/xfunc.html User-Defined Functions].)<br />
<br />
PostgreSQL functions can be invoked in many ways. If you want to invoke a function as you would call a stored procedure in another RDBMS (typically a function with side effects whose result you don't care about, for example because it returns void), one option is to use the [http://www.postgresql.org/docs/current/static/plpgsql.html PL/pgSQL Language] for your procedure and the [http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-NORESULT PERFORM] command. Example:<br />
PERFORM theNameOfTheFunction(arg1, arg2);<br />
<br />
Note that invoking instead:<br />
SELECT theNameOfTheFunction(arg1, arg2);<br />
would produce a result even if the function returns void (this result would be one row containing a void value).<br />
<br />
[http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-NORESULT PERFORM] can thus be used to discard this unwanted result.<br />
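<br />
As a minimal sketch, assuming a table access_log exists, a procedure-like function and the two invocation styles look like this:<br />
<br />
 CREATE FUNCTION log_access() RETURNS void AS $$<br />
 BEGIN<br />
     INSERT INTO access_log (ts) VALUES (now());<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />
 <br />
 -- inside another PL/pgSQL function: no result row is produced<br />
 PERFORM log_access();<br />
 -- from a client: returns one row containing a void value<br />
 SELECT log_access();<br />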
<br />
The main limitations on Pg's stored functions - as compared to true stored procedures - are:<br />
<br />
* inability to return multiple result sets<br />
* no support for autonomous transactions (<code>BEGIN</code>, <code>COMMIT</code> and <code>ROLLBACK</code> within a function)<br />
* no support for the SQL-standard <code>CALL</code> syntax, though the ODBC and JDBC drivers will translate calls for you.<br />
<br />
=== Why don't BEGIN, ROLLBACK and COMMIT work in stored procedures/functions? ===<br />
<br />
PostgreSQL doesn't support autonomous transactions in its stored functions. Like all PostgreSQL queries, stored functions always run in a transaction and cannot operate outside a transaction.<br />
<br />
If you need a stored procedure to manage transactions, you can look into the dblink interface or do the work from a client-side script instead. In some cases you can do what you need to using [http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING exception blocks in PL/PgSQL], because each BEGIN/EXCEPTION/END block creates a subtransaction.<br />
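<br />
For example, a sketch of using an exception block to undo only part of a function's work (the table and error condition are illustrative):<br />
<br />
 CREATE FUNCTION try_insert(v integer) RETURNS void AS $$<br />
 BEGIN<br />
     BEGIN  -- this block runs in a subtransaction<br />
         INSERT INTO t VALUES (v);<br />
     EXCEPTION WHEN unique_violation THEN<br />
         -- only the subtransaction is rolled back; the function continues<br />
         RAISE NOTICE 'duplicate value %, skipped', v;<br />
     END;<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />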
<br />
=== Why is "SELECT count(*) FROM bigtable;" slow? ===<br />
<br />
It can't be answered directly from an index: PostgreSQL has to check the visibility of each row, which<br />
forces a sequential scan of the entire table. If you want, you can keep track of the number of rows yourself with triggers, but beware that this will slow down write access to the table.<br />
<br />
You can get an estimate instead. The reltuples column in [http://www.postgresql.org/docs/current/static/catalog-pg-class.html pg_class] contains the information from the latest [http://www.postgresql.org/docs/current/static/sql-analyze.html ANALYZE] of the table. On a large table this is often accurate to within a few thousandths of a percent, which is accurate enough for many purposes.<br />
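<br />
For example, to read the estimate for a table named bigtable:<br />
<br />
 SELECT reltuples::bigint AS estimate<br />
 FROM pg_class<br />
 WHERE relname = 'bigtable';<br />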
<br />
An "exact" count is often not exact for very long, anyway; due to [http://www.postgresql.org/docs/current/static/mvcc-intro.html MVCC] concurrency, the count will be accurate as of the moment the SELECT count(*) query (or, for stricter [http://www.postgresql.org/docs/current/static/transaction-iso.html transaction isolation] levels, its transaction) ''started'', and may well be out-of-date by the time the query completes. In a transaction mix where the table is being modified, two count(*) executions which return at the same moment might have different values, if a modifying transaction committed between their start times.<br />
<br />
For more information, see [[Slow Counting]].<br />
<br />
=== Why is my query much slower when run as a prepared query? ===<br />
<br />
When PostgreSQL has the full query with all parameters known by planning time, it can use statistics on the table to find out whether the values used in the query are very common or very uncommon in a column. This lets it change the way it fetches the data to be more efficient, as it knows to expect lots or very few results from a certain part of the query. For example, it might choose a sequential scan instead of an index scan if you search for 'active=y' and it knows that 99% of the records in the table have 'active=y', because in this case a sequential scan will be much faster.<br />
<br />
In a prepared query, PostgreSQL doesn't have the value of all parameters when it's creating the plan. It has to try to pick a "safe" plan that should work fairly well no matter what value you supply as the parameter when you execute the prepared query. Unfortunately, this plan might not be very appropriate if the value you supply is vastly more common, or vastly less common, than is average for some randomly selected values in the table.<br />
<br />
If you suspect this issue is affecting you, start by using the [http://www.postgresql.org/docs/current/static/sql-explain.html EXPLAIN] command to compare the slow and fast queries. Look at the output of <code>EXPLAIN SELECT query...</code> and compare it to the result of <code>PREPARE query... ; EXPLAIN EXECUTE query...</code> to see if the plans are notably different. <code>EXPLAIN ANALYZE</code> may give you more information, such as estimated versus actual row counts.<br />
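<br />
For example, to compare the plan for a known value against the generic plan used by a prepared query (table and column names are illustrative):<br />
<br />
 EXPLAIN SELECT * FROM users WHERE active = 'y';<br />
 <br />
 PREPARE q(text) AS SELECT * FROM users WHERE active = $1;<br />
 EXPLAIN EXECUTE q('y');<br />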
<br />
Usually, people having this problem are using prepared queries as a security measure to prevent SQL injection, rather than as a performance-tuning option for expensive-to-plan queries frequently executed with a variety of different parameters. Those people should consider using client-side prepared statements if their client interface (e.g. PgJDBC) supports them.<br />
<br />
At present, PostgreSQL does not offer a way to request re-planning of a prepared statement using a particular set of parameter values; doing so somewhat defeats the purpose of server-side prepared statements. Running a statistics check to see if a particular parameter value is notably outside the norm and automatically re-planning in that case has been discussed, but not agreed upon or implemented as yet.<br />
<br />
See [[Using_EXPLAIN]]. If you're going to ask for help on the mailing lists, please read the [[Guide to reporting problems]].<br />
<br />
=== Why is my query much slower when run in a function than standalone? ===<br />
<br />
See [[FAQ#Why is my query much slower when run as a prepared query?]]. Queries in PL/PgSQL functions are prepared and cached, so they execute in much the same way as if you'd <code>PREPARE</code>d then <code>EXECUTE</code>d the query yourself.<br />
<br />
If this problem is severe and neither improving the table statistics nor adjusting your query helps, you can work around it by forcing PL/PgSQL to re-prepare your query at every execution. To do this, use the <code>EXECUTE ... USING</code> statement in PL/PgSQL to supply your query as a textual string. Alternately, the [http://www.postgresql.org/docs/current/static/functions-string.html quote_literal or quote_nullable] functions may be used to escape parameters substituted into the query text.<br />
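<br />
For example, inside a PL/pgSQL function (the variable names are illustrative):<br />
<br />
 -- planned afresh on every execution, using the actual parameter value<br />
 EXECUTE 'SELECT count(*) FROM users WHERE active = $1'<br />
    INTO result_count<br />
    USING param_active;<br />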
<br />
=== Why do my strings sort incorrectly? ===<br />
<br />
First, make sure you are using the locale you want to be using. Use <code>SHOW lc_collate</code> to show the database-wide locale in effect. If you are using per-column collations, check those. If everything is how you want it, then read on.<br />
<br />
PostgreSQL uses the C library's locale facilities for sorting strings. So if the sort order of the strings is not what you expect, the issue is likely in the C library. You can verify the C library's idea of sorting using the <code>sort</code> utility on a text file, e.g.,<br />
<br />
LC_COLLATE=xx_YY.utf8 sort testfile.txt<br />
<br />
If this results in the same order that PostgreSQL gives you, then the problem is outside of PostgreSQL.<br />
<br />
PostgreSQL deviates from the libc behavior insofar as it breaks ties by sorting strings in byte order. This should rarely make a<br />
difference in practice, and is usually not the source of the problem when users complain about the sort order, but it could affect cases where, for example, combining and precomposed Unicode characters are mixed.<br />
<br />
If the problem is in the C library, you will have to take it up with your operating system maintainers. Note, however, that while actual bugs in locale definitions of C libraries have been known to exist, it is more likely that the C library is correct, where "correct" means it follows some recognized international or national standard. Possibly, you are expecting one of multiple equally valid interpretations of a language's sorting rules.<br />
<br />
Common complaint patterns include:<br />
<br />
* Spaces and special characters: The sorting algorithm normally works in multiple passes. First, all the letters are compared, ignoring spaces and punctuation. Then, spaces and punctuation are compared to break ties. (This is a simplification of what actually happens.) It's not possible to change this without changing the locale definitions themselves (and even then it's difficult). You might want to restructure your data slightly to avoid this problem. For example, if you are sorting a name field, you could split the field into first and last name fields, avoiding the space in between.<br />
<br />
* Upper/lower case: Locales other than the C locale generally sort upper and lower case letters together. So the order will be something like a A b B c C ... instead of the A B C ... a b c ... that a sort based on ASCII byte values will give. That is correct.<br />
<br />
* German locale: sort order of ä as a or ae. Both of these are valid (see http://de.wikipedia.org/wiki/Alphabetische_Sortierung), but most C libraries only provide the first one. Fixing this would require creating a custom locale. This is possible, but will take some work.<br />
<br />
* Byte/ASCII order: the sort is not in byte order, and it's not supposed to be. ASCII is an encoding, not a sort order. If you want byte order, you can use the C locale, but then you lose the ability to sort non-ASCII characters correctly.<br />
<br />
That said, if you are on Mac OS X or a BSD-family operating system, and you are using UTF-8, then give up. The locale definitions on<br />
those operating systems are broken.<br />
<br />
[[Category:FAQ]]</div>
<hr />
<div>{{Languages}}<br />
[[:Category:FAQ|Additional FAQ Entries on this Wiki]]<br />
<br />
== Translations of this Document ==<br />
<br />
* [[Häufig gestellte Fragen|German]]<br />
* [[Perguntas Frequentes|Portuguese]]<br />
* [[Preguntas Frecuentes|Spanish]]<br />
* [[Часто Задаваемые Вопросы|Русский]]<br />
<br />
== Platform-specific questions ==<br />
<br />
Windows users should also read the [[Running & Installing PostgreSQL On Native Windows|platform FAQ for Windows]]. There are [[Frequently Asked Questions#Platform FAQs|FAQs for other platforms]] too.<br />
<br />
== General Questions ==<br />
<br />
=== What is PostgreSQL? How is it pronounced? What is Postgres? ===<br />
<br />
PostgreSQL is pronounced Post-Gres-Q-L. (For those curious about how<br />
to say "PostgreSQL", an [http://www.postgresql.org/files/postgresql.mp3 audio file] is available.)<br />
<br />
PostgreSQL is an object-relational database system that has the<br />
features of traditional proprietary database systems with enhancements<br />
to be found in next-generation DBMS systems. PostgreSQL is free and<br />
the complete source code is available.<br />
<br />
PostgreSQL development is performed by a team of mostly volunteer<br />
developers spread throughout the world and communicating via the<br />
Internet. It is a community project and is not controlled by any<br />
company. To get involved, see the [[Developer_FAQ | Developer FAQ]].<br />
<br />
Postgres is a widely-used nickname for PostgreSQL. It was the original<br />
name of the project at Berkeley and is strongly preferred over other<br />
nicknames. If you find 'PostgreSQL' hard to pronounce, call it<br />
'Postgres' instead.<br />
<br />
=== Who controls PostgreSQL? ===<br />
<br />
If you are looking for a PostgreSQL gatekeeper, central committee, or<br />
controlling company, give up --- there isn't one. We do have a core<br />
committee and CVS committers, but these groups are more for<br />
administrative purposes than control. The project is directed by the<br />
community of developers and users, which anyone can join. All you need<br />
to do is subscribe to the mailing lists and participate in the<br />
discussions. (See the [[Developer FAQ|Developer's FAQ]] for information on how to get<br />
involved in PostgreSQL development.)<br />
<br />
=== Who is the PostgreSQL Global Development Group? ===<br />
<br />
The "PGDG" is an international, unincorporated association of<br />
individuals and companies who have contributed to the PostgreSQL<br />
project. The PostgreSQL Core Team generally act as spokespeople<br />
for the PGDG.<br />
<br />
=== Who is the PostgreSQL Core Team? ===<br />
<br />
A committee of five to seven (currently six) senior contributors to<br />
PostgreSQL who do the following for the project: (a) set release dates,<br />
(b) handle confidential matters for the project, (c) act as spokespeople<br />
for the PGDG when required, and (d) arbitrate community decisions which<br />
are not settled by consensus. The current Core Team is listed on top of<br />
[http://www.postgresql.org/community/contributors/ the contributors page]<br />
<br />
=== What about the various PostgreSQL foundations? ===<br />
<br />
While the PostgreSQL project utilizes non-profit corporations in the<br />
USA, Europe, Brazil and Japan for fundraising and project coordination,<br />
these entities do not own the PostgreSQL code.<br />
<br />
=== What is the license of PostgreSQL? ===<br />
<br />
PostgreSQL is distributed under a license similar to BSD and MIT. Basically, it<br />
allows users to do anything they want with the code, including<br />
reselling binaries without the source code. The only restriction is<br />
that you not hold us legally liable for problems with the software.<br />
There is also the requirement that this copyright appear in all copies<br />
of the software. Here is the license we use:<br />
<br />
PostgreSQL Database Management System<br />
(formerly known as Postgres, then as Postgres95)<br />
<br />
Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group<br />
<br />
Portions Copyright (c) 1994, The Regents of the University of California<br />
<br />
Permission to use, copy, modify, and distribute this software and its<br />
documentation for any purpose, without fee, and without a written agreement<br />
is hereby granted, provided that the above copyright notice and this<br />
paragraph and the following two paragraphs appear in all copies.<br />
<br />
IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR<br />
DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING<br />
LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS<br />
DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE<br />
POSSIBILITY OF SUCH DAMAGE.<br />
<br />
THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES,<br />
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY<br />
AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS<br />
ON AN "AS IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO<br />
PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.<br />
<br />
=== What platforms does PostgreSQL support? ===<br />
<br />
In general, any modern Unix-compatible platform should be able to run<br />
PostgreSQL. The platforms that have received recent explicit testing<br />
can be seen in the [http://buildfarm.postgresql.org/ Build farm].<br />
The documentation contains more details about supported platforms at http://www.postgresql.org/docs/current/static/supported-platforms.html.<br />
<br />
PostgreSQL also runs natively on Microsoft Windows NT-based operating<br />
systems like Win2000 SP4, WinXP, and Win2003. A prepackaged installer<br />
is available at http://www.postgresql.org/download/windows.<br />
MSDOS-based versions of Windows (Win95, Win98, WinMe) can run<br />
PostgreSQL using Cygwin.<br />
<br />
=== Where can I get PostgreSQL? ===<br />
<br />
There are binary distributions for various operating systems and platforms; see [http://www.postgresql.org/download our download area].<br />
<br />
The source code can be obtained [http://www.postgresql.org/ftp/ via web browser] or [ftp://ftp.postgresql.org/pub/ through ftp].<br />
<br />
=== What is the most recent release? ===<br />
<br />
The latest release of PostgreSQL is shown on the front page of [http://www.postgresql.org/ our website].<br />
<br />
We typically have a major release every year, with minor releases every few months.<br />
Minor releases are usually made at the same time for all supported major-release branches.<br />
For more about major versus minor releases, see<br />
http://www.postgresql.org/support/versioning.<br />
<br />
=== Where can I get support? ===<br />
<br />
The PostgreSQL community provides assistance to many of its users via<br />
email. The main web site to subscribe to the email lists is<br />
http://www.postgresql.org/community/lists/. The general or bugs lists<br />
are a good place to start. For best results, consider reading the <br />
[[guide to reporting problems]] before you post to make sure you<br />
include enough information for people to help you.<br />
<br />
The major IRC channel is #postgresql on Freenode (irc.freenode.net).<br />
Spanish (#postgresql-es), French (#postgresqlfr), and Brazilian<br />
(#postgresql-br) channels also exist on the same network. There is also<br />
a PostgreSQL channel on EFNet.<br />
<br />
A list of support companies is available at<br />
http://www.postgresql.org/support/professional_support.<br />
<br />
=== How do I submit a bug report? ===<br />
<br />
Visit the PostgreSQL bug form at<br />
http://www.postgresql.org/support/submitbug to submit your bug<br />
report to the pgsql-bugs mailing list. Also check out our ftp<br />
site ftp://ftp.postgresql.org/pub/ to see if there is a more recent<br />
PostgreSQL version.<br />
<br />
For a prompt and helpful response, it is important for you to read the <br />
[[guide to reporting problems]] to make sure that you include the <br />
information required to fully understand and act on your report.<br />
<br />
Bugs submitted using the bug form or posted to any PostgreSQL mailing<br />
list typically generate one of the following replies:<br />
* It is not a bug, and why<br />
* It is a known bug and is already on the TODO list<br />
* The bug has been fixed in the current release<br />
* The bug has been fixed but is not packaged yet in an official release<br />
* A request is made for more detailed information:<br />
** Operating system<br />
** PostgreSQL version<br />
** Reproducible test case<br />
** Debugging information<br />
** [[Generating_a_stack_trace_of_a_PostgreSQL_backend|Debugger backtrace output]]<br />
* The bug is new. The following might happen:<br />
** A patch is created and will be included in the next major or minor release<br />
** The bug cannot be fixed immediately and is added to the TODO list<br />
<br />
=== How do I find out about known bugs or missing features? ===<br />
<br />
PostgreSQL supports an extended subset of SQL:2008. See our [[Todo|TODO list]]<br />
for known bugs, missing features, and future plans.<br />
<br />
A feature request usually results in one of the following replies:<br />
* The feature is already on the TODO list<br />
* The feature is not desired because:<br />
** It duplicates existing functionality that already follows the SQL standard<br />
** The feature would increase code complexity but add little benefit<br />
** The feature would be insecure or unreliable<br />
* The new feature is added to the TODO list<br />
<br />
PostgreSQL does not use a bug tracking system because we find it more<br />
efficient to respond directly to email and keep the TODO list<br />
up-to-date. In practice, bugs don't last very long in the software,<br />
and bugs that affect a large number of users are fixed rapidly. The<br />
only place to find all changes, improvements, and fixes in a<br />
PostgreSQL release is to read the CVS log messages. Even the release<br />
notes do not list every change made to the software.<br />
<br />
=== A bug I'm encountering is fixed in a newer minor release of PostgreSQL, but I don't want to upgrade. Can I get a patch for just this issue? ===<br />
<br />
No. Nobody will make a custom patch for you so you can (say) extract a fix from 8.4.3 and apply it to 8.4.1.<br />
That's because there should never be any need to do that.<br />
<br />
PostgreSQL has a strict policy that only bug fixes are back-patched into point releases, as per the [http://www.postgresql.org/support/versioning version policy]. It is safe to upgrade from 8.4.1 to 8.4.3,<br />
for example. Binary compatibility will be maintained, no dump and reload is required, nothing will break, but bugs that might <br />
cause problems have been fixed. Even if you are not yet encountering a particular bug, you might later, and it is wise to upgrade promptly.<br />
You just have to install the update and re-start the database server, nothing more.<br />
<br />
Upgrading from 8.3 to 8.4, or 8.4 to 9.0, is a major upgrade that does not come with the same guarantees. However, if a bug<br />
is discovered in 9.0 then it will generally be fixed in all maintained older versions like 8.4 and 8.3 if it is safe and<br />
practical to do so.<br />
<br />
This means that if you're running 8.1.0, upgrading to 8.1.21 is <b>strongly</b> recommended and very safe. On the other hand,<br />
upgrading to the next major release, 8.2.x, may require changes to your app, and will certainly require a dump and reload.<br />
<br />
If you want to be careful about all upgrades, you should read the [http://www.postgresql.org/docs/current/static/release.html release notes] <br />
for each point release between your current one and the latest minor version of the same major release carefully. If you're<br />
exceptionally paranoid about upgrades, you can fetch the source code to each set of point release changes from [http://git.postgresql.org/ PostgreSQL's git repository] and examine it.<br />
<br />
It is strongly recommended that you <b>always</b> upgrade to the latest minor release. Avoid trying to extract and apply individual fixes<br />
from point releases; by doing so you're bypassing all the QA done by the PostgreSQL team when they prepare a release, and are creating your<br />
own custom version that <i>nobody else has ever used</i>. It's a lot safer to just update to the latest tested, safe release. <i>Patching your own custom, non-standard build will also take more time/effort, and will require the same amount of downtime as a normal upgrade.</i><br />
<br />
=== I have a program that says it wants PostgreSQL x.y.1. Can I use PostgreSQL x.y.2 instead? ===<br />
<br />
Any program that works with a particular version, like 8.4.1, should work with any other minor version in the same major version. That means that if a program says it wants (eg) 8.4.1, you can and should install the latest in the 8.4 series instead.<br />
<br />
See the previous question for more details.<br />
<br />
=== What documentation is available? ===<br />
<br />
PostgreSQL includes extensive documentation, including a large manual,<br />
manual pages, and some test examples. See the /doc directory. You can<br />
also browse the manuals online at http://www.postgresql.org/docs.<br />
<br />
There are a number of PostgreSQL<br />
books available for purchase; two of them are also available online. A list of books can be found at<br />
http://www.postgresql.org/docs/books/. One of the most popular is the book by Korry & Susan<br />
Douglas.<br />
<br />
There is also a collection of<br />
PostgreSQL technical articles on the <br />
[[Community_Generated_Articles%2C_Guides%2C_and_Documentation | wiki]].<br />
<br />
The command line client program psql has some \d commands to show<br />
information about types, operators, functions, aggregates, etc. - use<br />
\? to display the available commands.<br />
<br />
=== How can I learn SQL? ===<br />
<br />
First, consider the PostgreSQL-specific books mentioned above. Many of<br />
our users also like The Practical SQL Handbook, Bowman, Judith S., et<br />
al., Addison-Wesley. Others like The Complete Reference SQL, Groff et<br />
al., McGraw-Hill.<br />
<br />
Many people consider the PostgreSQL documentation to be an excellent guide<br />
for learning SQL itself, as well as for PostgreSQL's implementation of it.<br />
For best results use PostgreSQL alongside another full-featured SQL database as<br />
you learn, so you get used to SQL without becoming reliant on PostgreSQL-specific<br />
features. The PostgreSQL documentation generally mentions when features are PostgreSQL<br />
extensions of the standard.<br />
<br />
There are also many nice tutorials available online:<br />
* http://www.intermedia.net/support/sql/sqltut.shtm<br />
* http://sqlcourse.com<br />
* http://www.w3schools.com/sql/default.asp<br />
* http://mysite.verizon.net/Graeme_Birchall/id1.html<br />
* http://sqlzoo.net<br />
<br />
=== How do I submit a patch or join the development team? ===<br />
<br />
See the [[Developer FAQ|Developer's FAQ]].<br />
<br />
=== How does PostgreSQL compare to other DBMSs? ===<br />
<br />
There are several ways of measuring software: features, performance,<br />
reliability, support, and price.<br />
<br />
==== Features ====<br />
<br />
PostgreSQL has most features present in large proprietary DBMSs,<br />
like transactions, subselects, triggers, views, foreign key<br />
referential integrity, and sophisticated locking. We have some<br />
features they do not have, like user-defined types,<br />
inheritance, rules, and multi-version concurrency control to<br />
reduce lock contention.<br />
<br />
==== Performance ====<br />
<br />
PostgreSQL's performance is comparable to other proprietary and<br />
open source databases. It is faster for some things, slower for<br />
others. Our performance is usually +/-10% compared to other databases.<br />
<br />
==== Reliability ====<br />
<br />
We realize that a DBMS must be reliable, or it is worthless. We<br />
strive to release well-tested, stable code that has a minimum<br />
of bugs. Each release has at least one month of beta testing,<br />
and our release history shows that we can provide stable, solid<br />
releases that are ready for production use. We believe we<br />
compare favorably to other database software in this area.<br />
<br />
==== Support ====<br />
<br />
Our mailing lists provide contact with a large group of<br />
developers and users to help resolve any problems encountered.<br />
While we cannot guarantee a fix, proprietary DBMSs do not always<br />
supply a fix either. Direct access to developers, the user<br />
community, manuals, and the source code often make PostgreSQL<br />
support superior to other DBMSs. There is commercial<br />
per-incident support available for those who need it. (See [[#Where_can_I_get_support.3F | section 1.7]]).<br />
<br />
==== Price ====<br />
<br />
We are free for all use, both proprietary and open source.<br />
You can add our code to your product with no limitations,<br />
except those outlined in our BSD-style license stated above. <br />
<br />
=== Can PostgreSQL be embedded? ===<br />
<br />
PostgreSQL uses a client/server architecture, which requires<br />
separate processes for each client and server, and various helper<br />
processes. Many embedded architectures can support such requirements.<br />
However, if your embedded architecture requires the database server to<br />
run inside the application process, you cannot use Postgres and should<br />
select a lighter-weight database solution.<br />
<br />
Popular embeddable options include [http://sqlite.org SQLite] and [http://firebirdsql.org Firebird SQL].<br />
<br />
=== How do I unsubscribe from the PostgreSQL email lists? How do I avoid receiving duplicate emails? ===<br />
<br />
The PostgreSQL Majordomo page allows subscribing or unsubscribing from<br />
any of the PostgreSQL email lists. (You might need to have your<br />
Majordomo password emailed to you to log in.)<br />
<br />
All PostgreSQL email lists are configured so a group reply goes to the<br />
email list and the original email author. This is done so users<br />
receive the quickest possible email replies. If you would prefer not<br />
to receive duplicate email from the list in cases where you already<br />
receive an email directly, check eliminatecc from the Majordomo Change<br />
Settings page. You can also prevent yourself from receiving copies of<br />
emails you post to the lists by unchecking selfcopy.<br />
<br />
== User Client Questions ==<br />
<br />
=== What interfaces are available for PostgreSQL? ===<br />
<br />
The PostgreSQL install includes only the C and embedded C interfaces.<br />
All other interfaces are independent projects that are downloaded<br />
separately; being separate allows them to have their own release<br />
schedule and development teams.<br />
<br />
Some programming languages like PHP include an interface to<br />
PostgreSQL. Interfaces for languages like Perl, TCL, Python, and many<br />
others are available at http://pgfoundry.org.<br />
<br />
=== What tools are available for using PostgreSQL with Web pages? ===<br />
<br />
A nice introduction to Database-backed Web pages can be seen at:<br />
http://www.webreview.com<br />
<br />
For Web integration, PHP (http://www.php.net) is an excellent<br />
interface.<br />
<br />
For complex cases, many use Perl and DBD::Pg with CGI.pm or<br />
mod_perl.<br />
<br />
=== Does PostgreSQL have a graphical user interface? ===<br />
<br />
There are a large number of GUI Tools that are available for<br />
PostgreSQL from both proprietary and open source developers. A detailed<br />
list can be found in the [[Community Guide to PostgreSQL GUI Tools]].<br />
<br />
== Administrative Questions ==<br />
<br />
=== How do I install PostgreSQL somewhere other than /usr/local/pgsql? ===<br />
<br />
Specify the --prefix option when running configure.<br />
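<br />
For example (the installation path is illustrative):<br />
<br />
 ./configure --prefix=/opt/postgresql<br />
 make<br />
 make install<br />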
<br />
=== I'm installing PostgreSQL and don't know the password for the postgres user ===<br />
<br />
Dave Page wrote a [http://pgsnake.blogspot.com/2010/07/postgresql-passwords-and-installers.html blog post] explaining what the different passwords are used for, and how to overcome common problems such as resetting them.<br />
<br />
=== How do I control connections from other hosts? ===<br />
<br />
By default, PostgreSQL only allows connections from the local machine<br />
using Unix domain sockets or TCP/IP connections. Other machines will<br />
not be able to connect unless you modify listen_addresses in the<br />
postgresql.conf file, enable host-based authentication by modifying<br />
the $PGDATA/pg_hba.conf file, and restart the database server.<br />
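<br />
For example, to accept password-authenticated connections from one subnet (the addresses are illustrative):<br />
<br />
 # postgresql.conf<br />
 listen_addresses = '*'    # or a comma-separated list of addresses<br />
 <br />
 # pg_hba.conf<br />
 host    all    all    192.168.1.0/24    md5<br />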
<br />
=== How do I tune the database engine for better performance? ===<br />
<br />
There are three major areas for potential performance improvement:<br />
<br />
==== Query Changes ====<br />
This involves modifying queries to obtain better performance:<br />
<br />
* Creation of indexes, including expression and partial indexes<br />
* Use of COPY instead of multiple INSERTs<br />
* Grouping of multiple statements into a single transaction to reduce commit overhead<br />
* Use of CLUSTER when retrieving many rows from an index<br />
* Use of LIMIT for returning a subset of a query's output<br />
* Use of Prepared queries<br />
* Use of ANALYZE to maintain accurate optimizer statistics<br />
* Regular use of VACUUM or pg_autovacuum<br />
* Dropping of indexes during large data changes<br />
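<br />
For example, two of the suggestions above, sketched with an illustrative table and file path:<br />
<br />
 -- COPY is much faster than many separate INSERTs:<br />
 COPY mytable FROM '/path/to/data.csv' WITH (FORMAT csv);<br />
 <br />
 -- grouping statements into one transaction reduces commit overhead:<br />
 BEGIN;<br />
 INSERT INTO mytable VALUES (1, 'a');<br />
 INSERT INTO mytable VALUES (2, 'b');<br />
 COMMIT;<br />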
<br />
==== Server Configuration ====<br />
A number of postgresql.conf settings affect performance. For<br />
more details, see Administration Guide/Server Run-time<br />
Environment/Run-time Configuration.<br />
<br />
==== Hardware Selection ====<br />
The effect of hardware on performance is detailed in<br />
http://www.powerpostgresql.com/PerfList/ and<br />
http://momjian.us/main/writings/pgsql/hw_performance/index.html.<br />
<br />
=== What debugging features are available? ===<br />
<br />
There are many log_* server configuration variables, described at http://www.postgresql.org/docs/current/interactive/runtime-config-logging.html, that enable printing of query and process statistics, which can be very useful for debugging and performance measurement.<br />
<br />
=== Why do I get "Sorry, too many clients" when trying to connect? ===<br />
<br />
You have reached the default limit of 100 database sessions. You need<br />
to increase the server's limit on how many concurrent backend<br />
processes it can start by changing the max_connections value in<br />
postgresql.conf and restarting the server.<br />
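<br />
For example (the value shown is arbitrary):<br />

```
# postgresql.conf -- raise the session limit, then restart the server
max_connections = 200
```

Each allowed connection consumes server resources, so for very high connection counts a connection pooler is often a better solution than a large max_connections setting.<br />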
<br />
=== What is the upgrade process for PostgreSQL? ===<br />
<br />
See http://www.postgresql.org/support/versioning for a general<br />
discussion about upgrading, and<br />
http://www.postgresql.org/docs/current/static/install-upgrading.html<br />
for specific instructions.<br />
<br />
=== Will PostgreSQL handle recent daylight saving time changes in various countries? ===<br />
<br />
PostgreSQL releases 8.0 and up depend on the widely-used tzdata database<br />
(also called the zoneinfo database or the [http://www.twinsun.com/tz/tz-link.htm Olson timezone database]) for<br />
daylight saving time information. To deal with a DST law change that affects you,<br />
install a new tzdata file set and restart the server.<br />
<br />
All PostgreSQL update releases include the latest available tzdata files,<br />
so keeping up-to-date on minor releases for your major version is usually<br />
sufficient for this.<br />
<br />
On platforms that receive regular software updates including new tzdata files,<br />
it may be more convenient to rely on the system's copy of the tzdata files.<br />
This is possible as a compile-time option. Most Linux distributions choose<br />
this approach for their pre-built versions of PostgreSQL.<br />
<br />
PostgreSQL releases before 8.0 always rely on the operating system's timezone<br />
information.<br />
<br />
=== What computer hardware should I use? ===<br />
<br />
Because PC hardware is mostly compatible, people tend to believe that<br />
all PC hardware is of equal quality. It is not. ECC RAM, SCSI, and<br />
quality motherboards are more reliable and have better performance<br />
than less expensive hardware. PostgreSQL will run on almost any<br />
hardware, but if reliability and performance are important it is wise<br />
to research your hardware options thoroughly. <br />
<br />
Database servers, unlike many other applications, are usually I/O and memory constrained, so it is wise to focus on the I/O subsystem first, then memory capacity, and lastly consider CPU issues. For example, a disk controller with a<br />
battery-backed cache is often the cheapest and easiest way to improve database performance. Our email lists can be used to discuss hardware options and tradeoffs.<br />
<br />
=== How does PostgreSQL use CPU resources? ===<br />
<br />
The PostgreSQL server is process-based (not threaded), and uses one operating system process per database session. A single database session (connection) cannot utilize more than one CPU. Of course, multiple sessions are automatically spread across all available CPUs by your operating system. Client applications can easily use threads and create multiple database connections from each thread.<br />
<br />
A single complex and CPU-intensive query is unable to use more than one CPU to do the processing for the query. The OS may still be able to use other CPUs for disk I/O etc., but you won't see much benefit from more than one spare core.<br />
<br />
=== Why does PostgreSQL have so many processes, even when idle? ===<br />
<br />
As noted in [[FAQ#How does PostgreSQL use CPU resources?|the answer above]], PostgreSQL is process based, so it starts one <code>postgres</code> (or <code>postgres.exe</code> on Windows) instance per connection. The postmaster (which accepts connections and starts new postgres instances for them) is always running. In addition, PostgreSQL generally has one or more "helper" processes like the stats collector, background writer, autovacuum daemon, walsender, etc, all of which show up as "postgres" instances in most system monitoring tools.<br />
<br />
Despite the number of processes, they actually use very little in the way of real resources. See [[FAQ#Why does PostgreSQL show up as using so much memory in my system monitoring tool?|the next answer]].<br />
<br />
=== Why does PostgreSQL use so much memory? ===<br />
<br />
Despite appearances, this is absolutely normal, and there's actually nowhere near as much memory being used as tools like <code>top</code> or the Windows process monitor say PostgreSQL is using.<br />
<br />
Tools like <code>top</code> and the Windows process monitor may show many <code>postgres</code> instances (see above), each of which appears to use a huge amount of memory. Often, when added up, the amount the postgres instances use is many times the amount of memory actually installed in the computer!<br />
<br />
This is a consequence of how these tools report memory use. They generally don't understand shared memory very well, and show it as if it was memory used individually and exclusively by each postgres instance. PostgreSQL uses a big chunk of shared memory to communicate between its backends and cache data. Because these tools count that shared memory block once per <code>postgres</code> instance instead of counting it once for <i>all</i> <code>postgres</code> instances, they massively over-estimate how much memory PostgreSQL is using.<br />
<br />
Furthermore, many versions of these tools don't report the entire shared memory block as being used by an individual instance immediately when it starts, but rather count the number of shared pages it has touched since starting. Over the lifetime of an instance, it will inevitably touch more and more of the shared memory until it has touched every page, so its reported usage will gradually rise to include the entire shared memory block. This is frequently misinterpreted as a memory leak, but it is no such thing, only a reporting artifact.<br />
<br />
== Operational Questions ==<br />
<br />
=== How do I SELECT only the first few rows of a query? A random row? ===<br />
<br />
To retrieve only a few rows, if you know the number of rows needed<br />
at the time of the SELECT, use LIMIT. If an index matches the ORDER BY,<br />
it is possible that the entire query does not have to be executed. If you<br />
don't know the number of rows at SELECT time, use a cursor and FETCH.<br />
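<br />
The cursor approach can be sketched like this (cursors must be used inside a transaction block; names are illustrative):<br />

```sql
BEGIN;
DECLARE my_cursor CURSOR FOR SELECT col FROM tab ORDER BY col;
FETCH 10 FROM my_cursor;   -- first batch of rows
FETCH 10 FROM my_cursor;   -- next batch
CLOSE my_cursor;
COMMIT;
```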
<br />
To SELECT a random row, use:<br />
SELECT col<br />
FROM tab<br />
ORDER BY random()<br />
LIMIT 1;<br />
<br />
See also this [http://blog.rhodiumtoad.org.uk/2009/03/08/selecting-random-rows-from-a-table/ blog entry by Andrew Gierth]<br />
that has more information on this topic.<br />
<br />
=== How do I find out what tables, indexes, databases, and users are defined? How do I see the queries used by psql to display them? ===<br />
<br />
Use the \dt command to see tables in psql. For a complete list of<br />
commands inside psql you can use \?. Alternatively you can read the<br />
source code for psql in the file pgsql/src/bin/psql/describe.c, which<br />
contains the SQL commands that generate the output for psql's backslash<br />
commands. You can also start psql with the -E option so that it prints<br />
out the queries it uses to execute the commands you give. PostgreSQL<br />
also provides an SQL-compliant INFORMATION_SCHEMA interface you can<br />
query to get information about the database.<br />
<br />
There are also system catalogs, with names beginning with pg_, that<br />
describe these too.<br />
<br />
Running psql -l will list all databases.<br />
<br />
Also try the file pgsql/src/tutorial/syscat.source. It illustrates<br />
many of the SELECTs needed to get information from the database system<br />
tables.<br />
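<br />
For example, the INFORMATION_SCHEMA can be queried like this to list ordinary tables in the public schema:<br />

```sql
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
  AND table_type = 'BASE TABLE';
```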
<br />
=== How do you change a column's data type? ===<br />
<br />
Changing the data type of a column can be done easily in 8.0 and later<br />
with ALTER TABLE ALTER COLUMN TYPE.<br />
<br />
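In 8.0 and later the command looks like this (the names and the USING expression are only illustrative):<br />

```sql
ALTER TABLE tab
  ALTER COLUMN col TYPE numeric(10,2)
  USING col::numeric(10,2);
```

The optional USING clause specifies how to compute the new values when there is no implicit cast from the old type.<br />
<br />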
In earlier releases, do this:<br />
BEGIN;<br />
ALTER TABLE tab ADD COLUMN new_col new_data_type;<br />
UPDATE tab SET new_col = CAST(old_col AS new_data_type);<br />
ALTER TABLE tab DROP COLUMN old_col;<br />
COMMIT;<br />
<br />
You might then want to do VACUUM FULL tab to reclaim the disk space<br />
used by the expired rows.<br />
<br />
=== What is the maximum size for a row, a table, and a database? ===<br />
<br />
These are the limits:<br />
<br />
Maximum size for a database? unlimited (32 TB databases exist)<br />
Maximum size for a table? 32 TB<br />
Maximum size for a row? 400 GB<br />
Maximum size for a field? 1 GB<br />
Maximum number of rows in a table? unlimited<br />
Maximum number of columns in a table? 250-1600 depending on column types<br />
Maximum number of indexes on a table? unlimited<br />
<br />
Of course, these are not actually unlimited, but limited to available<br />
disk space and memory/swap space. Performance may suffer when these<br />
values get unusually large.<br />
<br />
The maximum table size of 32 TB does not require large file support<br />
from the operating system. Large tables are stored as multiple 1 GB<br />
files so file system size limits are not important.<br />
<br />
The maximum table size, row size, and maximum number of columns can be<br />
quadrupled by increasing the default block size to 32k. The maximum<br />
table size can also be increased using table partitioning.<br />
<br />
One limitation is that indexes can not be created on columns longer<br />
than about 2,000 characters. Fortunately, such indexes are rarely<br />
needed. Uniqueness is best guaranteed by an expression index on an MD5<br />
hash of the long column, and full text indexing allows for searching<br />
of words within the column.<br />
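<br />
Such an index might look like this (a sketch; table and column names are illustrative):<br />

```sql
-- Enforce uniqueness of a long text column via an MD5 hash of its value
CREATE UNIQUE INDEX tab_longcol_md5_idx ON tab (md5(longcol));
```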
<br />
=== How much database disk space is required to store data from a typical text file? ===<br />
<br />
A PostgreSQL database may require up to five times the disk space to<br />
store data from a text file.<br />
<br />
As an example, consider a file of 100,000 lines with an integer and<br />
text description on each line. Suppose the text string averages<br />
twenty bytes in length. The flat file would be 2.8 MB. The size of the<br />
PostgreSQL database file containing this data can be estimated as 5.2<br />
MB:<br />
24 bytes: each row header (approximate)<br />
24 bytes: one int field and one text field<br />
+ 4 bytes: pointer on page to tuple<br />
----------------------------------------<br />
52 bytes per row<br />
<br />
The data page size in PostgreSQL is 8192 bytes (8 KB), so:<br />
<br />
8192 bytes per page<br />
------------------- = 157 rows per database page (rounded down)<br />
52 bytes per row<br />
<br />
100000 data rows<br />
------------------ = 637 database pages (rounded up)<br />
157 rows per page<br />
<br />
637 database pages * 8192 bytes per page = 5,218,304 bytes (5.2 MB)<br />
<br />
Indexes do not require as much overhead, but do contain the data that<br />
is being indexed, so they can be large also.<br />
<br />
NULLs are stored as bitmaps, so they use very little space.<br />
<br />
Note that long values may be compressed transparently.<br />
<br />
See also this presentation on the topic: [[Image:How_Long_Is_a_String.pdf]].<br />
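<br />
Rather than estimating, you can also measure the actual on-disk size of an existing table directly (the table name is illustrative):<br />

```sql
-- Size of the table's main data files
SELECT pg_size_pretty(pg_relation_size('tab'));

-- Size including indexes and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('tab'));
```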
<br />
=== Why are my queries slow? Why don't they use my indexes? ===<br />
<br />
Indexes are not used by every query. Indexes are used only if the<br />
table is larger than a minimum size, and the query selects only a<br />
small percentage of the rows in the table. This is because the random<br />
disk access caused by an index scan can be slower than a straight read<br />
through the table, or sequential scan.<br />
<br />
To determine if an index should be used, PostgreSQL must have<br />
statistics about the table. These statistics are collected using<br />
VACUUM ANALYZE, or simply ANALYZE. Using statistics, the optimizer<br />
knows how many rows are in the table, and can better determine if<br />
indexes should be used. Statistics are also valuable in determining<br />
optimal join order and join methods. Statistics collection should be<br />
performed periodically as the contents of the table change.<br />
<br />
Indexes are normally not used for ORDER BY or to perform joins. A<br />
sequential scan followed by an explicit sort is usually faster than an<br />
index scan of a large table. However, LIMIT combined with ORDER BY<br />
often will use an index because only a small portion of the table is<br />
returned.<br />
<br />
If you believe the optimizer is incorrect in choosing a sequential<br />
scan, use SET enable_seqscan TO 'off' and run the query again to see if an<br />
index scan is indeed faster.<br />
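<br />
A typical testing session looks like this (set the parameter only in your own session, never in postgresql.conf; the query is illustrative):<br />

```sql
SET enable_seqscan TO off;   -- discourage sequential scans for this session
EXPLAIN ANALYZE SELECT * FROM tab WHERE col = 42;
RESET enable_seqscan;        -- restore the default
```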
<br />
When using wild-card operators such as LIKE or ~, indexes can only be<br />
used in certain circumstances:<br />
<br />
* The beginning of the search string must be anchored to the start of the string, i.e.<br />
** LIKE patterns must not start with % or _.<br />
** ~ (regular expression) patterns must start with ^.<br />
<br />
* The search string can not start with a character class, e.g. [a-e].<br />
<br />
* Case-insensitive searches such as ILIKE and ~* do not utilize indexes. Instead, use expression indexes, which are described in [[#How_do_I_perform_regular_expression_searches_and_case-insensitive_regular_expression_searches.3F_How_do_I_use_an_index_for_case-insensitive_searches.3F | section 4.8]].<br />
<br />
* C locale must be used during initdb because sorting in a non-C locale often doesn't match the behavior of LIKE. You can create a special text_pattern_ops index that will work in such cases, but note it is only helpful for LIKE indexing.<br />
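<br />
For example, a text_pattern_ops index together with an anchored pattern (names illustrative):<br />

```sql
-- In a non-C locale, this operator class makes the index usable for LIKE
CREATE INDEX tab_col_pattern_idx ON tab (col text_pattern_ops);

-- An anchored pattern such as this can then use the index
SELECT * FROM tab WHERE col LIKE 'abc%';
```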
<br />
It is also possible to use full text indexing for word searches.<br />
<br />
The [[SlowQueryQuestions]] article contains some more tips and guidance.<br />
<br />
=== How do I see how the query optimizer is evaluating my query? ===<br />
<br />
This is done with the EXPLAIN command; see [[Using EXPLAIN]].<br />
<br />
=== How do I change the sort ordering of textual data? ===<br />
<br />
PostgreSQL sorts textual data according to the ordering that is defined by the current locale, which is selected during initdb.<br />
(In 8.4 and up it will be possible to select a different locale when creating a new database.)<br />
If you don't like the ordering then you need to use a different locale.<br />
In particular, most locales other than "C" sort according to dictionary order, which largely ignores punctuation and spacing.<br />
If that's not what you want then you need "C" locale.<br />
<br />
=== How do I perform regular expression searches and case-insensitive regular expression searches? How do I use an index for case-insensitive searches? ===<br />
<br />
The ~ operator does regular expression matching, and ~* does<br />
case-insensitive regular expression matching. The case-insensitive<br />
variant of LIKE is called ILIKE.<br />
<br />
Case-insensitive equality comparisons are normally expressed as:<br />
SELECT *<br />
FROM tab<br />
WHERE lower(col) = 'abc';<br />
<br />
This will not use a standard index on "col". However, if you create an<br />
expression index on "lower(col)", it will be used:<br />
CREATE INDEX tabindex ON tab (lower(col));<br />
<br />
If the above index is created as UNIQUE, then the column can store<br />
upper and lowercase characters, but it cannot contain values that<br />
differ only by case. To force a particular case to be stored in the<br />
column, use a CHECK constraint or a trigger.<br />
<br />
In PostgreSQL 8.4 and later, you can also use the contributed [http://www.postgresql.org/docs/8.4/static/citext.html CITEXT data type], which internally implements the "lower()" calls, so that you can effectively treat it as a fully case-insensitive data type. CITEXT is also [https://svn.kineticode.com/citext/trunk/ available for 8.3], and an earlier version that treats only ASCII characters case-insensitively on 8.2 and earlier is available on [http://pgfoundry.org/projects/citext/ pgFoundry].<br />
<br />
=== In a query, how do I detect if a field is NULL? How do I concatenate possible NULLs? How can I sort on whether a field is NULL or not? ===<br />
<br />
You can test the value with IS NULL or IS NOT NULL, like this:<br />
SELECT *<br />
FROM tab<br />
WHERE col IS NULL;<br />
<br />
Concatenating a NULL with something else produces another NULL.<br />
If that's not what you want, you can replace the NULL(s) using<br />
COALESCE(), like this:<br />
<nowiki>SELECT COALESCE(col1, '') || COALESCE(col2, '')</nowiki><br />
FROM tab;<br />
<br />
To sort by the NULL status, use an IS NULL or IS NOT NULL test<br />
in your ORDER BY clause. Things that are true will sort higher than<br />
things that are false, so the following will put NULL entries at the<br />
front of the output:<br />
SELECT *<br />
FROM tab<br />
ORDER BY (col IS NOT NULL), col;<br />
<br />
In PostgreSQL 8.3 and up, you can also control sort ordering of NULLs<br />
using the recently-standardized NULLS FIRST/NULLS LAST modifiers,<br />
like this:<br />
SELECT *<br />
FROM tab<br />
ORDER BY col NULLS FIRST;<br />
<br />
=== What is the difference between the various character types? ===<br />
<br />
{| border="1"<br />
!Type<br />
!Internal Name<br />
!Notes<br />
|-<br />
|VARCHAR(n) <br />
|varchar<br />
|size specifies maximum length, no padding<br />
|- <br />
|CHAR(n)<br />
|bpchar<br />
|blank-padded to the specified fixed length<br />
|-<br />
|TEXT<br />
|text<br />
|no specific upper limit on length<br />
|-<br />
|BYTEA<br />
|bytea<br />
|variable-length byte array (null-byte safe)<br />
|-<br />
|"char" (with the quotes)<br />
|char<br />
|one byte<br />
|}<br />
<br />
You will see the internal name when examining system catalogs and in<br />
some error messages.<br />
<br />
The first four types above are "varlena" types (i.e., the field length<br />
is explicitly stored on disk, followed by the data). Thus the actual<br />
space used is slightly greater than the expected size. However, long<br />
values are also subject to compression, so the space on disk might<br />
also be less than expected.<br />
<br />
VARCHAR(n) is best when storing variable-length strings if a specific<br />
upper limit on the string length is required by the application.<br />
TEXT is for strings of "unlimited" length (though all fields in PostgreSQL<br />
are subject to a maximum value length of one gigabyte).<br />
<br />
CHAR(n) is for storing strings that are all the same length. CHAR(n)<br />
pads with blanks to the specified length, while VARCHAR(n) only stores<br />
the characters supplied. BYTEA is for storing binary data,<br />
particularly values that include zero bytes. All these types have similar<br />
performance characteristics, except that the blank-padding involved<br />
in CHAR(n) requires additional storage and some extra runtime.<br />
<br />
The "char" type (the quotes are required to distinguish it from CHAR(n))<br />
is a specialized datatype that can store exactly one byte. It is found in<br />
the system catalogs but its use in user tables is generally discouraged.<br />
<br />
=== How do I create a serial/auto-incrementing field? ===<br />
<br />
PostgreSQL supports a SERIAL data type. Actually, this isn't quite<br />
a real type. It's a shorthand for creating an integer column that<br />
is fed from a sequence.<br />
<br />
For example, this:<br />
CREATE TABLE person (<br />
id SERIAL,<br />
name TEXT<br />
);<br />
is automatically translated into this:<br />
CREATE SEQUENCE person_id_seq;<br />
CREATE TABLE person (<br />
id INTEGER NOT NULL DEFAULT nextval('person_id_seq'),<br />
name TEXT<br />
);<br />
<br />
The automatically created sequence is named ''table''_''serialcolumn''_seq,<br />
where ''table'' and ''serialcolumn'' are the names of the table and SERIAL<br />
column, respectively. See the CREATE SEQUENCE manual page for more<br />
information about sequences.<br />
<br />
There is also BIGSERIAL, which is like SERIAL except that the resulting<br />
column is of type BIGINT instead of INTEGER. Use this type if you think<br />
that you might need more than 2 billion serial values over the lifespan<br />
of the table.<br />
<br />
Note that sequences may contain "holes" or "gaps" as a normal part of operation. It is entirely normal for generated keys to go 1, 4, 5, 6, 9, ... . See [[#Why are there gaps in the numbering of my sequence/SERIAL column? Why aren't my sequence numbers reused on transaction abort?|the FAQ entry on sequence gaps]].<br />
<br />
=== How do I get the value of a SERIAL insert? ===<br />
<br />
The simplest way is to retrieve the assigned SERIAL value with<br />
RETURNING. Using the example table in the previous question, it would look like this:<br />
INSERT INTO person (name) VALUES ('Blaise Pascal') RETURNING id;<br />
<br />
You can also call nextval() and use that value in the INSERT, or call<br />
currval() after the INSERT.<br />
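<br />
Using the example table from the previous question, the nextval()/currval() alternatives look like this:<br />

```sql
-- Fetch the next value explicitly, then use it in the INSERT
INSERT INTO person (id, name) VALUES (nextval('person_id_seq'), 'Blaise Pascal');

-- Or insert normally, then ask for the value just assigned in this session
INSERT INTO person (name) VALUES ('Ada Lovelace');
SELECT currval('person_id_seq');
```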
<br />
=== Doesn't currval() lead to a race condition with other users? ===<br />
<br />
No. currval() returns the latest sequence value assigned by your session,<br />
independently of what is happening in other sessions.<br />
<br />
=== Why are there gaps in the numbering of my sequence/SERIAL column? Why aren't my sequence numbers reused on transaction abort? ===<br />
<br />
To improve concurrency, sequence values are given out to running<br />
transactions on-demand; the sequence object is not kept locked but is<br />
immediately available for another transaction to get another sequence<br />
value. This causes gaps in numbering from aborted transactions, as documented in the [http://www.postgresql.org/docs/current/static/functions-sequence.html NOTE section for the nextval() function].<br />
<br />
Additionally, an unclean server shutdown will cause sequences to increment on recovery, because PostgreSQL keeps a cache of sequence numbers to hand out and in an unclean shutdown it isn't sure which of those cached numbers has already been used. Since sequences are allowed to have gaps anyway it takes the safe option and increments the sequence.<br />
<br />
Another cause of gaps in sequences is the use of the CACHE clause in [http://www.postgresql.org/docs/current/static/sql-createsequence.html CREATE SEQUENCE].<br />
<br />
In general, you should not rely on SERIAL keys or SEQUENCEs being gapless, nor should you make assumptions about their order; [http://www.postgresql.org/docs/current/static/sql-createsequence.html#AEN69802 it is ''not'' guaranteed that id n+1 was inserted after id n except when both were generated within the same transaction]. Compare synthetic keys for equality and only for equality.<br />
<br />
Gap-less sequences are possible, but are very bad for performance. At most one transaction at a time can be inserting rows from a gapless sequence. There is no built-in SERIAL or SEQUENCE equivalent for gap-less sequences, but one is [http://stackoverflow.com/a/9985219/398670 trivial to implement]. Information on gapless sequence implementations can be found in the mailing list archives, on Stack Overflow, and in [http://www.varlena.com/GeneralBits/130.php this useful article]. Avoid using a gap-less sequence unless it is an absolute business requirement. Consider dynamically generating the gap-less numbering on demand for display, using the [http://www.postgresql.org/docs/current/static/tutorial-window.html row_number() window function], or adding it in a batch process that runs periodically.<br />
<br />
See also: [http://www.neilconway.org/docs/sequences/ FAQ: Using sequences in PostgreSQL].<br />
<br />
=== What is an OID? ===<br />
<br />
If a table is created WITH OIDS, each row includes an OID column that is automatically filled in during INSERT.<br />
OIDs are sequentially assigned 4-byte integers. Initially they are unique<br />
across the entire installation. However, the OID counter wraps around at 4 billion,<br />
and after that OIDs may be duplicated.<br />
<br />
It is possible to prevent duplication of OIDs within a single table by<br />
creating a unique index on the OID column (but note that the WITH OIDS<br />
clause doesn't by itself create such an index).<br />
The system checks the index to see if a newly<br />
generated OID is already present, and if so generates a new OID and<br />
repeats. This works well so long as no OID-containing table has<br />
more than a small fraction of 4 billion rows. <br />
<br />
PostgreSQL uses OIDs for object identifiers in the system catalogs,<br />
where the size limit is unlikely to be a problem.<br />
<br />
To uniquely number rows in user tables, it is best to use SERIAL<br />
rather than an OID column, or BIGSERIAL if the table is expected to<br />
have more than 2 billion entries over its lifespan.<br />
<br />
=== What is a CTID? ===<br />
<br />
CTIDs identify specific physical rows by their block and<br />
offset positions within a table.<br />
They are used by index entries to point to physical rows.<br />
A logical row's CTID changes when it is updated, so the CTID<br />
cannot be used as a long-term row identifier. But it is sometimes<br />
useful to identify a row within a transaction when no competing<br />
update is expected.<br />
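<br />
The ctid is available as a hidden system column, for example (names illustrative):<br />

```sql
-- Shows the (block, offset) position of each row
SELECT ctid, col FROM tab LIMIT 5;
```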
<br />
=== Why do I get the error "ERROR: Memory exhausted in AllocSetAlloc()"? ===<br />
<br />
You probably have run out of virtual memory on your system, or your<br />
kernel has a low limit for certain resources. Try this before starting<br />
the server:<br />
ulimit -d 262144<br />
limit datasize 256m<br />
<br />
Depending on your shell, only one of these may succeed, but it will<br />
set your process data segment limit much higher and perhaps allow the<br />
query to complete. This command applies to the current process, and<br />
all subprocesses created after the command is run. If you are having a<br />
problem with the SQL client because the backend is returning too much<br />
data, try it before starting the client.<br />
<br />
=== How do I tell what PostgreSQL version I am running? ===<br />
<br />
Run this query: SELECT version();<br />
<br />
=== Is there a way to leave an audit trail of database operations? ===<br />
<br />
There's nothing built-in, but it's not too difficult to build such<br />
facilities yourself.<br />
<br />
Simple example right in the official docs:<br />
http://www.postgresql.org/docs/8.3/static/plpgsql-trigger.html#PLPGSQL-TRIGGER-AUDIT-EXAMPLE<br />
<br />
Project targeting this feature: http://pgfoundry.org/projects/tablelog/<br />
<br />
Background information and other sample implementations: <br />
http://it.toolbox.com/blogs/database-soup/simple-data-auditing-19014<br />
http://www.go4expert.com/forums/showthread.php?t=7252<br />
http://www.alberton.info/postgresql_table_audit.html<br />
<br />
=== How do I create a column that will default to the current time? ===<br />
<br />
Use CURRENT_TIMESTAMP:<br />
CREATE TABLE test (x int, modtime TIMESTAMP DEFAULT CURRENT_TIMESTAMP );<br />
<br />
=== How do I perform an outer join? ===<br />
<br />
PostgreSQL supports outer joins using the SQL standard syntax. Here<br />
are two examples:<br />
SELECT *<br />
FROM t1 LEFT OUTER JOIN t2 ON (t1.col = t2.col);<br />
<br />
or<br />
SELECT *<br />
FROM t1 LEFT OUTER JOIN t2 USING (col);<br />
<br />
These equivalent queries join t1.col to t2.col, and also return any<br />
unjoined rows in t1 (those with no match in t2). A RIGHT join would<br />
add unjoined rows of t2. A FULL join would return the matched rows<br />
plus all unjoined rows from t1 and t2. The word OUTER is optional and<br />
is assumed in LEFT, RIGHT, and FULL joins. Ordinary joins are called<br />
INNER joins.<br />
<br />
=== How do I perform queries using multiple databases? ===<br />
<br />
There is no way to query a database other than the current one.<br />
Because PostgreSQL loads database-specific system catalogs, it is<br />
uncertain how a cross-database query should even behave.<br />
<br />
contrib/dblink allows cross-database queries using function calls. Of<br />
course, a client can also make simultaneous connections to different<br />
databases and merge the results on the client side.<br />
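<br />
A dblink query looks roughly like this (the connection string and column list are illustrative; you must declare the shape of the result rows):<br />

```sql
SELECT *
FROM dblink('dbname=otherdb', 'SELECT id, name FROM person')
     AS t(id integer, name text);
```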
<br />
=== How do I return multiple rows or columns from a function? ===<br />
<br />
It is easy using set-returning functions; see<br />
[[Return more than one row of data from PL/pgSQL functions]].<br />
<br />
=== Why do I get "relation with OID ##### does not exist" errors when accessing temporary tables in PL/PgSQL functions? ===<br />
<br />
In PostgreSQL versions < 8.3, PL/PgSQL caches function scripts, and an<br />
unfortunate side effect is that if a PL/PgSQL function accesses a<br />
temporary table, and that table is later dropped and recreated, and<br />
the function called again, the function will fail because the cached<br />
function contents still point to the old temporary table. The solution<br />
is to use EXECUTE for temporary table access in PL/PgSQL. This will<br />
cause the query to be reparsed every time.<br />
<br />
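On pre-8.3 servers, the workaround can be sketched like this (function and table names are illustrative):<br />

```sql
CREATE OR REPLACE FUNCTION count_temp() RETURNS bigint AS $$
DECLARE
    n bigint;
BEGIN
    -- EXECUTE forces the statement to be re-planned on every call,
    -- so a dropped and recreated temporary table does not break the function
    EXECUTE 'SELECT count(*) FROM my_temp_table' INTO n;
    RETURN n;
END;
$$ LANGUAGE plpgsql;
```
<br />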
This problem does not occur in PostgreSQL 8.3 and later.<br />
<br />
=== What replication solutions are available? ===<br />
<br />
Though "replication" is a single term, there are several technologies<br />
for doing replication, with advantages and disadvantages for each.<br />
Our documentation contains a good introduction to this topic at<br />
http://www.postgresql.org/docs/8.3/static/high-availability.html and a<br />
grid listing replication software and features is at<br />
[[Replication, Clustering, and Connection Pooling]].<br />
<br />
Master/slave replication allows a single master to receive read/write<br />
queries, while slaves can only accept read/SELECT queries. The most<br />
popular freely available master-slave PostgreSQL replication solution<br />
is Slony-I.<br />
<br />
Multi-master replication allows read/write queries to be sent to<br />
multiple replicated computers. This capability also has a severe<br />
impact on performance due to the need to synchronize changes between<br />
servers. PGCluster is the most popular such solution freely available<br />
for PostgreSQL.<br />
<br />
There are also proprietary and hardware-based replication solutions<br />
available supporting a variety of replication models.<br />
<br />
=== Is it possible to create a shared-storage PostgreSQL server cluster? ===<br />
<br />
PostgreSQL does not support clustering using [[Shared_Storage|shared storage]] on a SAN, SCSI backplane,<br />
iSCSI volume, or other shared media. Such "RAC-style" clustering isn't supported.<br />
Only replication-based clustering is currently supported.<br />
<br />
See [[Replication, Clustering, and Connection Pooling]] for details.<br />
<br />
[[Shared_Storage|Shared-storage]] 'failover' is possible, but it is not safe to have more than one<br />
postmaster running and accessing the data store at the same time. Heartbeat and<br />
[http://en.wikipedia.org/wiki/STONITH STONITH] or some other hard-disconnect option are recommended.<br />
<br />
=== Why are my table and column names not recognized in my query? Why is capitalization not preserved? ===<br />
<br />
The most common cause of unrecognized names is the use of<br />
double-quotes around table or column names during table creation. When<br />
double-quotes are used, table and column names (called identifiers)<br />
are stored case-sensitive, meaning you must use double-quotes when<br />
referencing the names in a query. Some interfaces, like pgAdmin,<br />
automatically double-quote identifiers during table creation. So, for<br />
identifiers to be recognized, you must either:<br />
<br />
* Avoid double-quoting identifiers when creating tables<br />
* Use only lowercase characters in identifiers<br />
* Double-quote identifiers when referencing them in queries<br />
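<br />
For example (the table name here is hypothetical), the first statement below creates a case-sensitive identifier, after which the unquoted reference fails because it is folded to lowercase:<br />
 CREATE TABLE "MyTable" (id int);<br />
 SELECT * FROM MyTable;   -- ERROR:  relation "mytable" does not exist<br />
 SELECT * FROM "MyTable"; -- works<br />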
<br />
=== I lost the database password. What can I do to recover it? ===<br />
<br />
You can't. However, you can reset it to something else. To do this:<br />
<br />
* Edit pg_hba.conf to allow ''trust'' authorization temporarily<br />
* Reload the config file (pg_ctl reload)<br />
* Connect and issue ALTER ROLE ... PASSWORD to set the new password<br />
* Edit pg_hba.conf again and restore the previous settings<br />
* Reload the config file again<br />
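<br />
As a sketch, the temporary pg_hba.conf line might look like this (the role name and password below are examples only):<br />
 local   all   all   trust<br />
and, after reloading, the reset itself from psql:<br />
 ALTER ROLE postgres PASSWORD 'newpassword';<br />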
<br />
=== Does PostgreSQL have stored procedures? ===<br />
<br />
PostgreSQL doesn't. However, PostgreSQL has very powerful user-defined function capabilities that can do most things that stored routines (procedures and functions) in other RDBMSes can do, and in many cases more.<br />
<br />
These functions can be of different types and can be implemented in several programming languages.<br />
(Refer to documentation for more details. [http://www.postgresql.org/docs/current/static/xfunc.html User-Defined Functions])<br />
<br />
PostgreSQL functions can be invoked in many ways. If you want to invoke a function as you would call a stored procedure in another RDBMS (typically a function with side-effects whose result you don't care about, for example because it returns void), one option would be to use the [http://www.postgresql.org/docs/current/static/plpgsql.html PL/pgSQL Language] for your procedure and the [http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-NORESULT PERFORM] command. Example:<br />
PERFORM theNameOfTheFunction(arg1, arg2);<br />
<br />
Note that invoking instead:<br />
SELECT theNameOfTheFunction(arg1, arg2);<br />
would produce a result even if the function returns void (this result would be one row containing a void value).<br />
<br />
[http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-NORESULT PERFORM] can thus be used to discard this unneeded result.<br />
<br />
The main limitations on Pg's stored functions - as compared to true stored procedures - are:<br />
<br />
* inability to return multiple result sets<br />
* no support for autonomous transactions (<code>BEGIN</code>, <code>COMMIT</code> and <code>ROLLBACK</code> within a function)<br />
* no support for the SQL-standard <code>CALL</code> syntax, though the ODBC and JDBC drivers will translate calls for you.<br />
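<br />
The first limitation can be partially worked around with cursors: a PL/pgSQL function can return several refcursors, which the caller then fetches from within the same transaction. A sketch (the function and table names are hypothetical):<br />
 CREATE FUNCTION two_results() RETURNS SETOF refcursor AS $$<br />
 DECLARE<br />
   c1 refcursor := 'cur1';<br />
   c2 refcursor := 'cur2';<br />
 BEGIN<br />
   OPEN c1 FOR SELECT * FROM table_a;<br />
   OPEN c2 FOR SELECT * FROM table_b;<br />
   RETURN NEXT c1;<br />
   RETURN NEXT c2;<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />
 <br />
 -- the cursors are only valid within the calling transaction:<br />
 BEGIN;<br />
 SELECT two_results();<br />
 FETCH ALL FROM cur1;<br />
 FETCH ALL FROM cur2;<br />
 COMMIT;<br />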
<br />
=== Why don't BEGIN, ROLLBACK and COMMIT work in stored procedures/functions? ===<br />
<br />
PostgreSQL doesn't support autonomous transactions in its stored functions. Like all PostgreSQL queries, stored functions always run in a transaction and cannot operate outside a transaction.<br />
<br />
If you need a stored procedure to manage transactions, you can look into the dblink interface or do the work from a client-side script instead. In some cases you can do what you need to using [http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING exception blocks in PL/PgSQL], because each BEGIN/EXCEPTION/END block creates a subtransaction.<br />
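<br />
For example, a sketch of an exception block rolling back just part of a function's work (table and function names are hypothetical):<br />
 CREATE FUNCTION try_insert(v int) RETURNS void AS $$<br />
 BEGIN<br />
   BEGIN  -- starts a subtransaction<br />
     INSERT INTO target_table VALUES (v);<br />
   EXCEPTION WHEN unique_violation THEN<br />
     -- only the work inside this inner block is rolled back<br />
     RAISE NOTICE 'value % already present, skipping', v;<br />
   END;<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />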
<br />
=== Why is "SELECT count(*) FROM bigtable;" slow? ===<br />
<br />
It can't be answered directly from an index. PostgreSQL has to check the visibility of each record, so it<br />
forces a sequential scan of the entire table. If you want, you can keep track of the number of rows yourself with triggers, but beware that this will slow down write access to the table.<br />
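<br />
A minimal sketch of trigger-based row counting (table and trigger names are hypothetical, and a serious implementation must also deal with contention on the counter row):<br />
 CREATE TABLE rowcount (tbl text PRIMARY KEY, cnt bigint);<br />
 INSERT INTO rowcount VALUES ('bigtable', 0);  -- seed with the current count<br />
 <br />
 CREATE FUNCTION count_rows() RETURNS trigger AS $$<br />
 BEGIN<br />
   IF TG_OP = 'INSERT' THEN<br />
     UPDATE rowcount SET cnt = cnt + 1 WHERE tbl = TG_TABLE_NAME;<br />
   ELSIF TG_OP = 'DELETE' THEN<br />
     UPDATE rowcount SET cnt = cnt - 1 WHERE tbl = TG_TABLE_NAME;<br />
   END IF;<br />
   RETURN NULL;<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />
 <br />
 CREATE TRIGGER bigtable_count AFTER INSERT OR DELETE ON bigtable<br />
   FOR EACH ROW EXECUTE PROCEDURE count_rows();<br />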
<br />
You can get an estimate. The reltuples column in [http://www.postgresql.org/docs/current/static/catalog-pg-class.html pg_class] contains the information from the latest [http://www.postgresql.org/docs/current/static/sql-analyze.html ANALYZE] of the table. On a large table this is often accurate to within a few thousandths of a percent, which is accurate enough for many purposes.<br />
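<br />
For example, the estimate can be read with (substitute your own table name):<br />
 SELECT reltuples::bigint AS estimate FROM pg_class WHERE relname = 'bigtable';<br />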
<br />
An "exact" count is often not exact for very long, anyway; due to [http://www.postgresql.org/docs/current/static/mvcc-intro.html MVCC] concurrency, the count will be accurate as of the moment the SELECT count(*) query (or, for stricter [http://www.postgresql.org/docs/current/static/transaction-iso.html transaction isolation] levels, its transaction) ''started'', and may well be out-of-date by the time the query completes. In a transaction mix where the table is being modified, two count(*) executions which return at the same moment might have different values, if a modifying transaction committed between their start times.<br />
<br />
For more information, see [[Slow Counting]].<br />
<br />
=== Why is my query much slower when run as a prepared query? ===<br />
<br />
When PostgreSQL has the full query with all parameters known by planning time, it can use statistics on the table to find out whether the values used in the query are very common or very uncommon in a column. This lets it change the way it fetches the data to be more efficient, as it knows whether to expect many or very few results from a certain part of the query. For example, it might choose a sequential scan instead of an index scan if you search for 'active=y' and it knows that 99% of the records in the table have 'active=y', because in this case a sequential scan will be much faster.<br />
<br />
In a prepared query, PostgreSQL doesn't have the value of all parameters when it's creating the plan. It has to try to pick a "safe" plan that should work fairly well no matter what value you supply as the parameter when you execute the prepared query. Unfortunately, this plan might not be very appropriate if the value you supply is vastly more common, or vastly less common, than is average for some randomly selected values in the table.<br />
<br />
If you suspect this issue is affecting you, start by using the [http://www.postgresql.org/docs/current/static/sql-explain.html EXPLAIN] command to compare the slow and fast queries. Look at the output of <code>EXPLAIN SELECT query...</code> and compare it to the result of <code>PREPARE query... ; EXPLAIN EXECUTE query...</code> to see if the plans are notably different. <code>EXPLAIN ANALYZE</code> may give you more information, such as estimated versus actual row counts.<br />
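<br />
A concrete comparison might look like this (the table, column, and value are hypothetical):<br />
 EXPLAIN ANALYZE SELECT * FROM users WHERE active = 'y';<br />
 PREPARE q(text) AS SELECT * FROM users WHERE active = $1;<br />
 EXPLAIN ANALYZE EXECUTE q('y');<br />
 DEALLOCATE q;<br />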
<br />
Usually, people having this problem are trying to use prepared queries as a security measure to prevent SQL injection, rather than as a performance tuning option for expensive-to-plan queries frequently executed with a variety of different parameters. Those people should consider using client-side prepared statements if their client interface (e.g. PgJDBC) supports them.<br />
<br />
At present, PostgreSQL does not offer a way to request re-planning of a prepared statement using a particular set of parameter values; doing so somewhat defeats the purpose of server-side prepared statements. Running a statistics check to see if a particular parameter value is notably outside the norm and automatically re-planning in that case has been discussed, but not agreed upon or implemented as yet.<br />
<br />
See [[Using_EXPLAIN]]. If you're going to ask for help on the mailing lists, please read the [[Guide to reporting problems]].<br />
<br />
=== Why is my query much slower when run in a function than standalone? ===<br />
<br />
See [[FAQ#Why is my query much slower when run as a prepared query?]]. Queries in PL/PgSQL functions are prepared and cached, so they execute in much the same way as if you'd <code>PREPARE</code>d then <code>EXECUTE</code>d the query yourself.<br />
<br />
If you're having really severe issues with this that improving the table statistics or adjusting your query don't help with, you can work around it by forcing PL/PgSQL to re-prepare your query at every execution. To do this, use the <code>EXECUTE ... USING</code> statement in PL/PgSQL to supply your query as a textual string. Alternately, the [http://www.postgresql.org/docs/current/static/functions-string.html quote_literal or quote_nullable] functions may be used to escape parameters substituted into query text.<br />
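<br />
As a sketch, replacing a cached query inside a PL/pgSQL function body with a dynamically re-planned one (table, column, and variable names are hypothetical):<br />
 -- planned once and cached for the session:<br />
 SELECT count(*) INTO n FROM orders WHERE status = in_status;<br />
 -- re-planned on every execution, using the actual parameter value:<br />
 EXECUTE 'SELECT count(*) FROM orders WHERE status = $1' INTO n USING in_status;<br />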
<br />
=== Why do my strings sort incorrectly? ===<br />
<br />
First, make sure you are using the locale you want to be using. Use <code>SHOW lc_collate</code> to show the database-wide locale in effect. If you are using per-column collations, check those. If everything is how you want it, then read on.<br />
<br />
PostgreSQL uses the C library's locale facilities for sorting strings. So if the sort order of the strings is not what you expect, the issue is likely in the C library. You can verify the C library's idea of sorting using the <code>sort</code> utility on a text file, e.g.,<br />
<br />
LC_COLLATE=xx_YY.utf8 sort testfile.txt<br />
<br />
If this results in the same order that PostgreSQL gives you, then the problem is outside of PostgreSQL.<br />
<br />
PostgreSQL deviates from the libc behavior in so far as it breaks ties by sorting strings in byte order. This should rarely make a<br />
difference in practice, and is usually not the source of the problem when users complain about the sort order, but it could affect cases where, for example, combining and precombined Unicode characters are mixed.<br />
<br />
If the problem is in the C library, you will have to take it up with your operating system maintainers. Note, however, that while actual bugs in locale definitions of C libraries have been known to exist, it is more likely that the C library is correct, where "correct" means it follows some recognized international or national standard. Possibly, you are expecting one of multiple equally valid interpretations of a language's sorting rules.<br />
<br />
Common complaint patterns include:<br />
<br />
* Spaces and special characters: The sorting algorithm normally works in multiple passes. First, all the letters are compared, ignoring spaces and punctuation. Then, spaces and punctuation are compared to break ties. (This is a simplification of what actually happens.) It's not possible to change this without changing the locale definitions themselves (and even then it's difficult). You might want to restructure your data slightly to avoid this problem. For example, if you are sorting a name field, you could split the field into first and last name fields, avoiding the space in between.<br />
<br />
* Upper/lower case: Locales other than the C locale generally sort upper and lower case letters together. So the order will be something like a A b B c C ... instead of the A B C ... a b c ... that a sort based on ASCII byte values will give. That is correct.<br />
<br />
* German locale: sort order of ä as a or ae. Both of these are valid (see http://de.wikipedia.org/wiki/Alphabetische_Sortierung), but most C libraries only provide the first one. Fixing this would require creating a custom locale. This is possible, but will take some work.<br />
<br />
* It is not in ASCII/byte order. No, it's not, and it's not supposed to be. ASCII is an encoding, not a sort order. If you want this, you can use the C locale, but then you lose the ability to sort non-ASCII characters sensibly.<br />
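<br />
The case and byte-order points can be seen directly with the COLLATE clause (available since PostgreSQL 9.1; exact collation names vary by platform, and the locale-aware order shown is typical of glibc):<br />
 SELECT c FROM (VALUES ('a'),('B'),('A'),('b')) AS v(c) ORDER BY c COLLATE "en_US";<br />
 -- typically: a, A, b, B (locale-aware)<br />
 SELECT c FROM (VALUES ('a'),('B'),('A'),('b')) AS v(c) ORDER BY c COLLATE "C";<br />
 -- A, B, a, b (byte order)<br />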
<br />
That said, if you are on Mac OS X or a BSD-family operating system, and you are using UTF-8, then give up. The locale definitions on<br />
those operating systems are broken.<br />
<br />
[[Category:FAQ]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Pg_reorg&diff=18402Pg reorg2012-10-16T03:33:34Z<p>Schmiddy: trim out features, bugs and wishlist section of this page: we are using the github issue tracker and wiki now, and this page should just be a pointer</p>
<hr />
<div>{{DISPLAYTITLE:pg_reorg}}<br />
<br />
== Project Organization ==<br />
<br />
The pg_reorg developers are slowly moving the project from its old home on [http://pgfoundry.org/projects/reorg/ pgfoundry] to [https://github.com/reorg/pg_reorg github]. The current stable release of pg_reorg is [http://pgfoundry.org/frs/?group_id=1000411&release_id=1873 1.1.7], and the bleeding edge is [https://github.com/reorg/pg_reorg git master]. The project mailing list is still hosted on pgfoundry at [http://lists.pgfoundry.org/mailman/listinfo/reorg-general reorg-general]. Please report bugs on the [https://github.com/reorg/pg_reorg/issues github issue tracker]. <br />
<br />
Official documentation page: [http://reorg.projects.postgresql.org/pg_reorg.html pg_reorg], though documentation is being moved to the [https://github.com/reorg/pg_reorg/wiki github wiki]<br />
<br />
== Why use pg_reorg? ==<br />
pg_reorg is handy when you have a large table which has become bloated (see [[Show database bloat]] for a useful bloat-detection query). If you are able to hold an AccessExclusive lock on the table for an extended period, you have it easy: just use [http://www.postgresql.org/docs/current/static/sql-cluster.html CLUSTER] or [[VACUUM FULL]]. However, if your table is busy being accessed by queries which can't wait hours while a CLUSTER/VACUUM FULL completes, you need a solution which will de-bloat your table and indexes while allowing concurrent reads and writes of the table. pg_reorg allows you to do precisely this. See also [http://www.depesz.com/2011/07/06/bloat-happens/ depesz's summary].</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Pg_reorg&diff=18397Pg reorg2012-10-15T02:18:10Z<p>Schmiddy: </p>
<hr />
<div>{{DISPLAYTITLE:pg_reorg}}<br />
<br />
== Project Organization ==<br />
<br />
The pg_reorg developers are slowly moving the project from its old home on [http://pgfoundry.org/projects/reorg/ pgfoundry] to [https://github.com/reorg/pg_reorg github]. The current stable release of pg_reorg is [http://pgfoundry.org/frs/?group_id=1000411&release_id=1873 1.1.7]. The project mailing list is still hosted on pgfoundry; please report bugs, issues and feature requests on the [http://lists.pgfoundry.org/mailman/listinfo/reorg-general reorg-general list].<br />
<br />
Official documentation page: [http://reorg.projects.postgresql.org/pg_reorg.html pg_reorg], though this wiki page may serve as an unofficial documentation point.<br />
<br />
== Why use pg_reorg? ==<br />
pg_reorg is handy when you have a large table which has become bloated (see [[Show database bloat]] for a useful bloat-detection query). If you are able to hold an AccessExclusive lock on the table for an extended period, you have it easy: just use [http://www.postgresql.org/docs/current/static/sql-cluster.html CLUSTER] or [[VACUUM FULL]]. However, if your table is busy being accessed by queries which can't wait hours while a CLUSTER/VACUUM FULL completes, you need a solution which will de-bloat your table and indexes while allowing concurrent reads and writes of the table. pg_reorg allows you to do precisely this. See also [http://www.depesz.com/2011/07/06/bloat-happens/ depesz's summary].<br />
<br />
== Pending Features ==<br />
A few features have been added to cvs, git master or another git branch, and are not yet in a stable release. TODO: break this list out into items to be fixed in REL1_1, items committed to git master, and items in other git branches.<br />
# Bundle pg_reorg as an extension, so CREATE EXTENSION can be used<br />
# Make pg_reorg available on pgxn<br />
# Use ALTER TABLE ... ENABLE ALWAYS so pg_reorg can operate on a Slony slave node: [http://pgfoundry.org/pipermail/reorg-general/2012-October/000094.html]<br />
# Print status message while waiting on table lock: [http://pgfoundry.org/pipermail/reorg-general/2012-September/000069.html]<br />
# Column name quoting: [http://pgfoundry.org/pipermail/reorg-general/2012-September/000071.html]<br />
# Trigger to prevent TRUNCATE on target table.<br />
<br />
== Bugs, Known Issues ==<br />
# Problem using pg_reorg on a newly-promoted streaming replication slave: [http://pgfoundry.org/tracker/index.php?func=detail&aid=1011203&group_id=1000411&atid=1376| #1011203] [http://archives.postgresql.org/pgsql-bugs/2012-09/msg00278.php]<br />
# It is generally unsafe to perform DDL on your table while it is being reorg'ed, except for VACUUM or ANALYZE. Although the primary benefit of pg_reorg is that it does not hold a high-level lock on the target table while it is being reorg'ed, this also leaves us with no hard and fast way to prevent unsafe DDL. Perhaps we could add in some sanity checks of pg_class, pg_tablespace, etc. before and after the reorg finishes, and throw an error if any unexpected changes in the table attributes are detected.<br />
# pg_reorg chokes when an invalid index is left behind e.g. by CREATE INDEX CONCURRENTLY [http://pgfoundry.org/pipermail/reorg-general/2012-October/000101.html]<br />
<br />
== Wishlist ==<br />
# Concurrent index builds using multiple connections.</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=What%27s_new_in_PostgreSQL_9.2&diff=18114What's new in PostgreSQL 9.22012-08-29T19:24:36Z<p>Schmiddy: /* Replication improvements */ grammar & spelling</p>
<hr />
<div>{{Languages}}<br />
<br />
This document showcases many of the latest developments in PostgreSQL 9.2, compared to the last major release &ndash; PostgreSQL 9.1. There are many improvements in this release, so this wiki page covers many of the more important changes in detail. The full list of changes is itemised in the ''Release Notes''.<br />
<br />
<br />
=Major new features=<br />
<br />
==Index-only scans <!-- Robert Haas, Ibrar Ahmed, Heikki Linnakangas, Tom Lane -->==<br />
<br />
In PostgreSQL, indexes have no "visibility" information. This means that when you access a record by its index, PostgreSQL has to visit the actual tuple in the table to be sure it is visible to you: the tuple the index points to may simply be an old version of the record you are looking for.<br />
<br />
This can be a very big performance problem: the index is mostly ordered, so accessing its records is quite efficient, while the records themselves may be scattered all over the place (that's one reason why PostgreSQL has a CLUSTER command, but that's another story). In 9.2, PostgreSQL will use an "Index Only Scan" when possible, and will not access the record itself if it doesn't need to.<br />
<br />
There is still no visibility information in the index. So in order to do this, PostgreSQL uses the [http://www.postgresql.org/docs/devel/static/storage-vm.html visibility map], which tells it whether the whole content of a (usually) 8K page is visible to all transactions or not. When the index record points to a tuple contained in an "all visible" page, PostgreSQL won't have to access the tuple: it can build the result directly from the index. Of course, all the columns requested by the query must be in the index.<br />
<br />
The visibility map is maintained by VACUUM (it sets the visible bit), and by the backends doing SQL work (they unset the visible bit).<br />
<br />
If the data has not been modified since the last VACUUM, it is all-visible, and the index-only scan feature can improve performance.<br />
<br />
Here is an example.<br />
<br />
CREATE TABLE demo_ios (col1 float, col2 float, col3 text);<br />
<br />
In this table, we'll put random data, in order to have "scattered" data. We'll insert 100 million records, to have a big recordset that doesn't fit in memory (this is a machine with 4GB of RAM). This is an ideal case, made for this demo. The gains won't be that big in real life.<br />
<br />
INSERT INTO demo_ios SELECT generate_series(1,100000000),random(), 'mynotsolongstring';<br />
<br />
SELECT pg_size_pretty(pg_total_relation_size('demo_ios'));<br />
pg_size_pretty <br />
----------------<br />
6512 MB<br />
<br />
Let's pretend that the query is this:<br />
<br />
SELECT col1,col2 FROM demo_ios where col2 BETWEEN 0.01 AND 0.02<br />
<br />
In order to use an index only scan on this query, we need an index on col2,col1 (col2 first, as it is used in the WHERE clause).<br />
<br />
CREATE index idx_demo_ios on demo_ios(col2,col1);<br />
<br />
We vacuum the table so that the visibility map is up to date:<br />
<br />
VACUUM demo_ios;<br />
<br />
All the timings you'll see below were measured with cold OS and PostgreSQL caches (that's where the gains are, as the purpose of Index Only Scans is to reduce I/O).<br />
<br />
Let's first try without Index Only Scans:<br />
<br />
SET enable_indexonlyscan to off;<br />
<br />
EXPLAIN (analyze,buffers) select col1,col2 FROM demo_ios where col2 between 0.01 and 0.02;<br />
QUERY PLAN <br />
----------------------------------------------------------------------------------------------------------------------------------------<br />
Bitmap Heap Scan on demo_ios (cost=25643.01..916484.44 rows=993633 width=16) (actual time=763.391..362963.899 rows=1000392 loops=1)<br />
Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))<br />
Rows Removed by Index Recheck: 68098621<br />
Buffers: shared hit=2 read=587779<br />
-> Bitmap Index Scan on idx_demo_ios (cost=0.00..25394.60 rows=993633 width=0) (actual time=759.011..759.011 rows=1000392 loops=1)<br />
Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))<br />
Buffers: shared hit=2 read=3835<br />
Total runtime: 364390.127 ms<br />
<br />
<br />
With Index Only Scans:<br />
<br />
explain (analyze,buffers) select col1,col2 from demo_ios where col2 between 0.01 and 0.02;<br />
QUERY PLAN <br />
-----------------------------------------------------------------------------------------------------------------------------------------------<br />
Index Only Scan using idx_demo_ios on demo_ios (cost=0.00..35330.93 rows=993633 width=16) (actual time=58.100..3250.589 rows=1000392 loops=1)<br />
Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))<br />
Heap Fetches: 0<br />
Buffers: shared hit=923073 read=3848<br />
Total runtime: 4297.405 ms<br />
<br />
<br />
<br />
As nothing is free, there are a few things to keep in mind:<br />
<br />
* Adding indexes for index only scans obviously adds indexes to your table. So updates will be slower.<br />
* You will index columns that weren't indexed before. So there will be fewer opportunities for HOT updates.<br />
* Gains will probably be smaller in real-life situations, especially when data is changed between VACUUMs.<br />
<br />
This required making visibility map changes crash-safe, so visibility map bit changes are now WAL-logged.<br />
<br />
==Replication improvements <!-- Fujii Masao, Simon Riggs, Magnus Hagander, Jun Ishizuka -->==<br />
<br />
Streaming Replication becomes more polished with this release. <br />
<br />
One of the main remaining gripes about streaming replication was that all the slaves had to be connected to the one and only master, consuming its resources. Moreover, in case of a failover, it could be complicated to reconnect all the remaining slaves to the newly promoted master, unless one was using a tool like repmgr.<br />
<br />
* With 9.2, a standby can also send replication changes, allowing cascading replication.<br />
<br />
Let's build this. We start with an already working 9.2 database.<br />
<br />
We set it up for replication:<br />
<br />
postgresql.conf:<br />
wal_level=hot_standby #(could be archive too)<br />
max_wal_senders=5<br />
hot_standby=on<br />
<br />
You'll probably also want to activate archiving in production; it won't be done here.<br />
<br />
pg_hba.conf (use a more restrictive address range in production):<br />
host replication replication_user 0.0.0.0/0 md5<br />
<br />
Create the user:<br />
create user replication_user replication password 'secret';<br />
<br />
Clone the database:<br />
<br />
pg_basebackup -h localhost -U replication_user -D data2<br />
Password:<br />
<br />
We have a brand new cluster in the data2 directory. We'll change the port so that it can start (postgresql.conf):<br />
port=5433<br />
<br />
We add a recovery.conf to tell it how to stream from the master database:<br />
standby_mode = on<br />
primary_conninfo = 'host=localhost port=5432 user=replication_user password=secret' <br />
<br />
pg_ctl -D data2 start<br />
server starting<br />
LOG: database system was interrupted; last known up at 2012-07-03 17:58:09 CEST<br />
LOG: creating missing WAL directory "pg_xlog/archive_status"<br />
LOG: entering standby mode<br />
LOG: streaming replication successfully connected to primary<br />
LOG: redo starts at 0/9D000020<br />
LOG: consistent recovery state reached at 0/9D0000B8<br />
LOG: database system is ready to accept read only connections<br />
<br />
Now, let's add a second slave, which will use this slave:<br />
<br />
<br />
pg_basebackup -h localhost -U replication_user -D data3 -p 5433<br />
Password: <br />
<br />
We edit data3's postgresql.conf to change the port:<br />
port=5434<br />
<br />
We modify the recovery.conf to stream from the slave:<br />
standby_mode = on<br />
primary_conninfo = 'host=localhost port=5433 user=replication_user password=secret' # e.g. 'host=localhost port=5432'<br />
<br />
We start the cluster:<br />
pg_ctl -D data3 start<br />
server starting<br />
LOG: database system was interrupted while in recovery at log time 2012-07-03 17:58:09 CEST<br />
HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.<br />
LOG: creating missing WAL directory "pg_xlog/archive_status"<br />
LOG: entering standby mode<br />
LOG: streaming replication successfully connected to primary<br />
LOG: redo starts at 0/9D000020<br />
LOG: consistent recovery state reached at 0/9E000000<br />
LOG: database system is ready to accept read only connections<br />
<br />
Now, everything modified on the master cluster gets streamed to the first slave, and from there to the second slave. This second replication has to be monitored from the first slave (the master knows nothing about it).<br />
<br />
<br />
* As you may have noticed from the example, pg_basebackup now works from slaves.<br />
<br />
* There is another use case that wasn't covered: what if a user doesn't care for having a full-fledged slave, and only wants to stream the WAL files to another location, to benefit from the reduced data loss without the burden of maintaining a slave?<br />
<br />
pg_receivexlog is provided just for this purpose: it pretends to be a PostgreSQL slave, but only stores the log files as they are streamed, in a directory:<br />
pg_receivexlog -D /tmp/new_logs -h localhost -U replication_user<br />
<br />
will connect to the master (or a slave), and start creating files: <br />
ls /tmp/new_logs/<br />
00000001000000000000009E.partial<br />
<br />
Files are segment-sized, so they can be used for a normal recovery of the database. It's the same as an archive command, but with a much smaller granularity.<br />
Remember to rename the last segment to remove the .partial suffix before using it for PITR or another kind of recovery.<br />
<br />
* synchronous_commit has a new value: remote_write. It can be used when there is a synchronous slave (synchronous_standby_names is set), and means that the master doesn't have to wait for the slave to have written the data to disk, only for the slave to have acknowledged the data. With this setting, data is protected from a crash on the master, but could still be lost if the slave crashed at the same time (i.e. before having written the in-flight data to disk). As this is a quite remote possibility, some people will be interested in this trade-off.<br />
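<br />
As a sketch (the standby name 'standby1' is hypothetical; it must match the standby's application_name in its primary_conninfo), the relevant settings on the master would look like this:<br />
<pre><br />
# postgresql.conf on the master<br />
synchronous_standby_names = 'standby1'<br />
synchronous_commit = remote_write  # wait for the standby to receive the data,<br />
                                   # not to flush it to disk<br />
</pre><br />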
<br />
==JSON datatype==<br />
<br />
The JSON datatype is meant for storing JSON-structured data. It will validate that the input JSON string is correct JSON:<br />
<br />
=# SELECT '{"username":"john","posts":121,"emailaddress":"john@nowhere.com"}'::json;<br />
json <br />
-------------------------------------------------------------------<br />
{"username":"john","posts":121,"emailaddress":"john@nowhere.com"}<br />
(1 row)<br />
<br />
=# SELECT '{"username","posts":121,"emailaddress":"john@nowhere.com"}'::json;<br />
ERROR: invalid input syntax for type json at character 8<br />
DETAIL: Expected ":", but found ",".<br />
CONTEXT: JSON data, line 1: {"username",...<br />
STATEMENT: SELECT '{"username","posts":121,"emailaddress":"john@nowhere.com"}'::json;<br />
ERROR: invalid input syntax for type json<br />
LINE 1: SELECT '{"username","posts":121,"emailaddress":"john@nowhere...<br />
^<br />
DETAIL: Expected ":", but found ",".<br />
CONTEXT: JSON data, line 1: {"username",...<br />
<br />
You can also convert a row type to JSON:<br />
<br />
=#SELECT * FROM demo ;<br />
username | posts | emailaddress <br />
----------+-------+---------------------<br />
john | 121 | john@nowhere.com<br />
mickael | 215 | mickael@nowhere.com<br />
(2 rows)<br />
<br />
=# SELECT row_to_json(demo) FROM demo;<br />
row_to_json <br />
-------------------------------------------------------------------------<br />
{"username":"john","posts":121,"emailaddress":"john@nowhere.com"}<br />
{"username":"mickael","posts":215,"emailaddress":"mickael@nowhere.com"}<br />
(2 rows)<br />
<br />
Or an array type:<br />
<br />
<br />
=# select array_to_json(array_agg(demo)) from demo;<br />
array_to_json <br />
---------------------------------------------------------------------------------------------------------------------------------------------<br />
[{"username":"john","posts":121,"emailaddress":"john@nowhere.com"},{"username":"mickael","posts":215,"emailaddress":"mickael@nowhere.com"}]<br />
(1 row)<br />
<br />
== Range Types ==<br />
<br />
Range types are used to store a range of data of a given type. A few types are pre-defined: int4range (integer), int8range (bigint), numrange (numeric), tsrange (timestamp without time zone), tstzrange (timestamp with time zone), and daterange (date).<br />
<br />
Ranges can be made of continuous (numeric, timestamp...) or discrete (integer, date...) data types. They can be open (the bound isn't part of the range) or closed (the bound is part of the range). A bound can also be infinite.<br />
<br />
Without these datatypes, most people solve range problems by using two columns in a table. These range types are much more powerful, as you can use many operators on them.<br />
<br />
Here is the intersection between the 1000 (open) - 2000 (closed) and the 1000 (closed) - 1200 (closed) numeric ranges:<br />
<br />
SELECT '(1000,2000]'::numrange * '[1000,1200]'::numrange;<br />
?column? <br />
-------------<br />
(1000,1200]<br />
(1 row)<br />
<br />
So you can run queries like "give me all ranges that intersect this":<br />
<br />
=# SELECT * from test_range ;<br />
period <br />
-----------------------------------------------------<br />
["2012-01-01 00:00:00+01","2012-01-02 12:00:00+01"]<br />
["2012-01-01 00:00:00+01","2012-03-01 00:00:00+01"]<br />
["2008-01-01 00:00:00+01","2015-01-01 00:00:00+01"]<br />
(3 rows)<br />
<br />
<br />
=# SELECT * FROM test_range WHERE period && '[2012-01-03 00:00:00,2012-01-03 12:00:00]'; <br />
period <br />
-----------------------------------------------------<br />
["2012-01-01 00:00:00+01","2012-03-01 00:00:00+01"]<br />
["2008-01-01 00:00:00+01","2015-01-01 00:00:00+01"]<br />
(2 rows)<br />
<br />
This query could use an index defined like this:<br />
<br />
=# CREATE INDEX idx_test_range on test_range USING gist (period);<br />
<br />
<br />
<br />
You can also use these range data types to define exclusion constraints:<br />
<br />
CREATE EXTENSION btree_gist ;<br />
CREATE TABLE reservation (room_id int, period tstzrange);<br />
ALTER TABLE reservation ADD EXCLUDE USING GIST (room_id WITH =, period WITH &&);<br />
<br />
This means that it is now forbidden to have two records in this table with the same room_id and overlapping periods. The btree_gist extension is required to create a GiST index on room_id (it's an integer, usually indexed with a btree index).<br />
<br />
<br />
=# INSERT INTO reservation VALUES (1,'(2012-08-23 14:00:00,2012-08-23 15:00:00)');<br />
INSERT 0 1<br />
=# INSERT INTO reservation VALUES (2,'(2012-08-23 14:00:00,2012-08-23 15:00:00)');<br />
INSERT 0 1<br />
=# INSERT INTO reservation VALUES (1,'(2012-08-23 14:45:00,2012-08-23 15:15:00)');<br />
ERROR: conflicting key value violates exclusion constraint "reservation_room_id_period_excl"<br />
DETAIL: Key (room_id, period)=(1, ("2012-08-23 14:45:00+02","2012-08-23 15:15:00+02")) <br />
conflicts with existing key (room_id, period)=(1, ("2012-08-23 14:00:00+02","2012-08-23 15:00:00+02")).<br />
STATEMENT: INSERT INTO reservation VALUES (1,'(2012-08-23 14:45:00,2012-08-23 15:15:00)');<br />
<br />
One can also declare new range types.<br />
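<br />
For example, following the CREATE TYPE ... AS RANGE syntax (the type name floatrange here is just an illustration), a range type over double precision values can be declared like this:<br />
<pre><br />
-- Declare a new range type over float8 values.<br />
-- subtype_diff is optional; it helps GiST indexes pick better page splits.<br />
CREATE TYPE floatrange AS RANGE (<br />
    subtype = float8,<br />
    subtype_diff = float8mi<br />
);<br />
<br />
SELECT '[1.5,2.5)'::floatrange;<br />
</pre><br />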
<br />
=Performance improvements=<br />
<br />
This version brings performance improvements in a very large range of domains (non-exhaustive list):<br />
<br />
* The most visible will probably be Index Only Scans, which have already been introduced in this document.<br />
<br />
* The lock contention of several big locks has been significantly reduced, leading to better multi-processor scalability, mostly for machines with 32 or more cores. <!-- Robert Haas --><br />
<br />
* The performance of in-memory sorts has been improved by up to 25% in some situations, thanks to newly introduced specialized sort functions. <!-- Peter Geoghegan --><br />
<br />
* An idle PostgreSQL server now makes fewer wakeups, leading to lower power consumption<!--Peter Geoghegan-->. This is especially useful in virtualized and embedded environments.<br />
<br />
* COPY has been improved: it generates less WAL volume and takes fewer locks on a table's pages. <!-- Heikki Linnakangas --><br />
<br />
* Statistics are collected on array contents<!-- Alexander Korotkov -->, allowing for better estimations of selectivity on array operations.<br />
<br />
* Text-to-anytype concatenation and quote_literal/quote_nullable functions are not volatile any more, enabling better optimization in some cases <!-- Marti Raudsepp --><br />
<br />
* The system can now track IO durations <!--Ants Aasma --><br />
<br />
This one deserves a little explanation, as it can be a little tricky. Tracking IO durations means repeatedly asking the operating system for the current time. Depending on the operating system and the hardware, this can be quite cheap, or extremely costly. The most important factor here is where the system gets its time from. It could be retrieved directly from the processor (TSC), from dedicated hardware such as the HPET, or through an ACPI call. What's most important is that the cost of getting the time can vary by a factor of thousands.<br />
<br />
If you are interested in this timing data, it's better to first check whether your system supports it without too much of a performance hit. PostgreSQL provides the pg_test_timing tool for this:<br />
<br />
<pre><br />
$ pg_test_timing <br />
Testing timing overhead for 3 seconds.<br />
Per loop time including overhead: 28.02 nsec<br />
Histogram of timing durations:<br />
< usec: count percent<br />
32: 41 0.00004%<br />
16: 1405 0.00131%<br />
8: 200 0.00019%<br />
4: 388 0.00036%<br />
2: 2982558 2.78523%<br />
1: 104100166 97.21287%<br />
</pre><br />
<br />
Here, everything is good: getting the time costs around 28 nanoseconds, with very small variation. Anything under 100 nanoseconds should be fine for production. If you get higher values, you may still find a way to tune your system; check the [http://www.postgresql.org/docs/9.2/static/pgtesttiming.html documentation].<br />
<br />
Anyway, here is the data you'll be able to collect if your system is ready for this:<br />
<br />
First, you'll get per-database statistics, which will now give accurate information about which database is doing the most I/O:<br />
<br />
<pre><br />
=# SELECT * FROM pg_stat_database WHERE datname = 'mydb';<br />
-[ RECORD 1 ]--+------------------------------<br />
datid | 16384<br />
datname | mydb<br />
numbackends | 1<br />
xact_commit | 270<br />
xact_rollback | 2<br />
blks_read | 1961<br />
blks_hit | 17944<br />
tup_returned | 269035<br />
tup_fetched | 8850<br />
tup_inserted | 16<br />
tup_updated | 4<br />
tup_deleted | 45<br />
conflicts | 0<br />
temp_files | 0<br />
temp_bytes | 0<br />
deadlocks | 0<br />
blk_read_time | 583.774<br />
blk_write_time | 0<br />
stats_reset | 2012-07-03 17:18:54.796817+02<br />
</pre><br />
We see here that mydb has only consumed 583.774 milliseconds of read time.<br />
<br />
Explain will benefit from this too:<br />
<pre><br />
=# EXPLAIN (analyze,buffers) SELECT count(*) FROM mots ;<br />
QUERY PLAN<br />
----------------------------------------------------------------------------------------------------------------<br />
Aggregate (cost=1669.95..1669.96 rows=1 width=0) (actual time=21.943..21.943 rows=1 loops=1)<br />
Buffers: shared read=493<br />
I/O Timings: read=2.578<br />
-> Seq Scan on mots (cost=0.00..1434.56 rows=94156 width=0) (actual time=0.059..12.933 rows=94156 loops=1)<br />
Buffers: shared read=493<br />
I/O Timings: read=2.578<br />
Total runtime: 22.059 ms<br />
</pre><br />
We now have separate information about the time taken to retrieve data from the operating system. Obviously, here, the data was in the operating system's cache (2.578 milliseconds to read 493 blocks).<br />
<br />
And last, if you have enabled pg_stat_statements:<br />
<pre><br />
select * from pg_stat_statements where query ~ 'words';<br />
-[ RECORD 1 ]-------+---------------------------<br />
userid | 10<br />
dbid | 16384<br />
query | select count(*) from words;<br />
calls | 2<br />
total_time | 78.332<br />
rows | 2<br />
shared_blks_hit | 0<br />
shared_blks_read | 986<br />
shared_blks_dirtied | 0<br />
shared_blks_written | 0<br />
local_blks_hit | 0<br />
local_blks_read | 0<br />
local_blks_dirtied | 0<br />
local_blks_written | 0<br />
temp_blks_read | 0<br />
temp_blks_written | 0<br />
blk_read_time | 58.427<br />
blk_write_time | 0<br />
</pre><br />
<br />
* As for every version, the optimizer has received its share of improvements <!-- Tom Lane--><br />
** Prepared statements used to be optimized once, without any knowledge of the parameters' values. With 9.2, the planner will use plans specific to the parameters sent (the query will be planned at execution time), unless the query is executed several times and the planner decides that the generic plan is not much more expensive than the specific plans.<br />
** A new feature has been added: parameterized paths. Simply put, it means that a sub-part of a query plan can use parameters it gets from a parent node. This fixes several bad plans that could occur, especially when the optimizer couldn't reorder joins to put nested loops where they would have been efficient.<br />
<br />
This example comes straight from the developers' mailing list <!-- Andres Freund -->:<br />
<br />
<pre><br />
CREATE TABLE a (<br />
a_id serial PRIMARY KEY NOT NULL,<br />
b_id integer<br />
);<br />
CREATE INDEX a__b_id ON a USING btree (b_id);<br />
<br />
<br />
CREATE TABLE b (<br />
b_id serial NOT NULL,<br />
c_id integer<br />
);<br />
CREATE INDEX b__c_id ON b USING btree (c_id);<br />
<br />
<br />
CREATE TABLE c (<br />
c_id serial PRIMARY KEY NOT NULL,<br />
value integer UNIQUE<br />
);<br />
<br />
INSERT INTO b (b_id, c_id)<br />
SELECT g.i, g.i FROM generate_series(1, 50000) g(i);<br />
<br />
INSERT INTO a(b_id)<br />
SELECT g.i FROM generate_series(1, 50000) g(i);<br />
<br />
INSERT INTO c(c_id,value)<br />
VALUES (1,1);<br />
</pre><br />
<br />
So we have a referencing b, and b referencing c.<br />
<br />
Here is an example of a query working badly with PostgreSQL 9.1:<br />
<br />
<pre><br />
EXPLAIN ANALYZE SELECT 1<br />
FROM<br />
c<br />
WHERE<br />
EXISTS (<br />
SELECT *<br />
FROM a<br />
JOIN b USING (b_id)<br />
WHERE b.c_id = c.c_id)<br />
AND c.value = 1;<br />
QUERY PLAN<br />
-----------------------------------------------------------------------------------------------------------------------<br />
Nested Loop Semi Join (cost=1347.00..3702.27 rows=1 width=0) (actual time=13.799..13.802 rows=1 loops=1)<br />
Join Filter: (c.c_id = b.c_id)<br />
-> Index Scan using c_value_key on c (cost=0.00..8.27 rows=1 width=4) (actual time=0.006..0.008 rows=1 loops=1)<br />
Index Cond: (value = 1)<br />
-> Hash Join (cost=1347.00..3069.00 rows=50000 width=4) (actual time=13.788..13.788 rows=1 loops=1)<br />
Hash Cond: (a.b_id = b.b_id)<br />
-> Seq Scan on a (cost=0.00..722.00 rows=50000 width=4) (actual time=0.007..0.007 rows=1 loops=1)<br />
-> Hash (cost=722.00..722.00 rows=50000 width=8) (actual time=13.760..13.760 rows=50000 loops=1)<br />
Buckets: 8192 Batches: 1 Memory Usage: 1954kB<br />
-> Seq Scan on b (cost=0.00..722.00 rows=50000 width=8) (actual time=0.008..5.702 rows=50000 loops=1)<br />
Total runtime: 13.842 ms<br />
</pre><br />
<br />
Not that bad: 13 milliseconds. Still, we are doing sequential scans on a and b, when common sense tells us that c.value=1 should be used to filter rows more aggressively.<br />
<br />
Here's what 9.2 does with this query:<br />
<br />
<pre><br />
QUERY PLAN<br />
----------------------------------------------------------------------------------------------------------------------------<br />
Nested Loop Semi Join (cost=0.00..16.97 rows=1 width=0) (actual time=0.035..0.037 rows=1 loops=1)<br />
-> Index Scan using c_value_key on c (cost=0.00..8.27 rows=1 width=4) (actual time=0.007..0.009 rows=1 loops=1)<br />
Index Cond: (value = 1)<br />
-> Nested Loop (cost=0.00..8.69 rows=1 width=4) (actual time=0.025..0.025 rows=1 loops=1)<br />
-> Index Scan using b__c_id on b (cost=0.00..8.33 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=1)<br />
Index Cond: (c_id = c.c_id)<br />
-> Index Only Scan using a__b_id on a (cost=0.00..0.35 rows=1 width=4) (actual time=0.014..0.014 rows=1 loops=1)<br />
Index Cond: (b_id = b.b_id)<br />
Total runtime: 0.089 ms<br />
</pre><br />
<br />
The "parameterized path" is:<br />
<pre><br />
-> Nested Loop (cost=0.00..8.69 rows=1 width=4) (actual time=0.025..0.025 rows=1 loops=1)<br />
-> Index Scan using b__c_id on b (cost=0.00..8.33 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=1)<br />
Index Cond: (c_id = c.c_id)<br />
-> Index Only Scan using a__b_id on a (cost=0.00..0.35 rows=1 width=4) (actual time=0.014..0.014 rows=1 loops=1)<br />
Index Cond: (b_id = b.b_id)<br />
Total runtime: 0.089 ms<br />
</pre><br />
<br />
This part of the plan depends on a parent node (c_id = c.c_id), and is executed each time with a different parameter coming from that node.<br />
<br />
This plan is of course much faster, as there is no need to fully scan a, nor to fully scan and hash b.<br />
<br />
=SP-GiST=<br />
<br />
SP-GiST stands for Space-Partitioned GiST, GiST being Generalized Search Tree. GiST is an index type that has been available for quite a while in PostgreSQL. GiST is already very efficient at indexing complex data types, but performance tends to suffer when the source data isn't uniformly distributed. SP-GiST tries to fix that.<br />
<br />
Like all indexing methods available in PostgreSQL, SP-GiST is a generic indexing method, meaning its purpose is to index whatever you throw at it, using operators you provide. This means that if you want to create a new datatype and make it indexable through SP-GiST, you'll have to follow the documented API.<br />
<br />
SP-GiST can be used to implement three kinds of indexes: tries (suffix trees), quadtrees (data divided into quadrants), and k-d trees (k-dimensional trees).<br />
<br />
For now, SP-GiST is provided with operator families called "quad_point_ops", "kd_point_ops" and "text_ops".<br />
<br />
As their names indicate, the first indexes point types using a quadtree, the second indexes point types using a k-d tree, and the third indexes text using suffix trees.<br />
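<br />
For example (hypothetical table and column names), a quadtree-based SP-GiST index on a column of type point can be created like this:<br />
<pre><br />
CREATE TABLE places (name text, position point);<br />
-- quad_point_ops is the default SP-GiST operator class for point,<br />
-- so it doesn't need to be spelled out<br />
CREATE INDEX idx_places_position ON places USING spgist (position);<br />
</pre><br />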
<br />
=pg_stat_statements=<br />
<br />
This contrib module has received a lot of improvements in this version:<br />
<br />
* Queries are normalized: queries that are identical except for their constant values will be considered the same, as long as their post-parse analysis query trees (that is, the internal representation of the query before rule expansion) are the same. This also implies that differences that are not semantically essential to the query, such as variations in whitespace or alias names, or the use of one particular syntax over another equivalent one, will not differentiate queries.<br />
<br />
<pre><br />
=#SELECT * FROM words WHERE word= 'foo';<br />
word <br />
------<br />
(0 rows)<br />
<br />
=# SELECT * FROM words WHERE word= 'bar';<br />
word <br />
------<br />
bar<br />
<br />
=#select * from pg_stat_statements where query like '%words where%';<br />
-[ RECORD 1 ]-------+-----------------------------------<br />
userid | 10<br />
dbid | 16384<br />
query | SELECT * FROM words WHERE word= ?;<br />
calls | 2<br />
total_time | 142.314<br />
rows | 1<br />
shared_blks_hit | 3<br />
shared_blks_read | 5<br />
shared_blks_dirtied | 0<br />
shared_blks_written | 0<br />
local_blks_hit | 0<br />
local_blks_read | 0<br />
local_blks_dirtied | 0<br />
local_blks_written | 0<br />
temp_blks_read | 0<br />
temp_blks_written | 0<br />
blk_read_time | 142.165<br />
blk_write_time | 0<br />
<br />
</pre><br />
<br />
The two queries are shown as one in pg_stat_statements.<br />
<br />
* For prepared statements, the execution part (the EXECUTE statement) is now charged to the PREPARE statement. That way it is easier to use, and it avoids the double-counting there was with PostgreSQL 9.1.<br />
<br />
* pg_stat_statements displays timing in milliseconds, to be consistent with other system views.<br />
<br />
= Explain improvements=<br />
<br />
* Timing can now be disabled with EXPLAIN (analyze on, timing off), leading to lower overhead on platforms where getting the current time is expensive <!--Tomas Vondra--><br />
<br />
=# EXPLAIN (analyze on,timing off) SELECT * FROM reservation ;<br />
QUERY PLAN <br />
----------------------------------------------------------------------------------------<br />
Seq Scan on reservation (cost=0.00..22.30 rows=1230 width=36) (actual rows=2 loops=1)<br />
Total runtime: 0.045 ms<br />
<br />
<br />
<br />
* Have EXPLAIN ANALYZE report the number of rows rejected by filter steps <!--(Marko Tiikkaja)--><br />
<br />
This new feature makes it much easier to know how many rows are removed by a filter (and spot potential places to put indexes):<br />
<br />
=# EXPLAIN ANALYZE SELECT * FROM test WHERE a ~ 'tra';<br />
QUERY PLAN <br />
---------------------------------------------------------------------------------------------------------------<br />
Seq Scan on test (cost=0.00..106876.56 rows=2002 width=11) (actual time=2.914..8538.285 rows=120256 loops=1)<br />
Filter: (a ~ 'tra'::text)<br />
Rows Removed by Filter: 5905600<br />
Total runtime: 8549.539 ms<br />
(4 rows)<br />
<br />
=Backward compatibility=<br />
<br />
These changes may incur regressions in your applications.<br />
<br />
==Ensure that xpath() escapes special characters in string values <!-- (Florian Pflug)--> ==<br />
<br />
Before 9.2:<br />
<pre><br />
SELECT (XPATH('/*/text()', '<root>&amp;lt;</root>'))[1];<br />
xpath <br />
-------<br />
<<br />
<br />
'<' isn't valid XML.<br />
</pre><br />
With 9.2:<br />
<pre><br />
SELECT (XPATH('/*/text()', '<root>&amp;lt;</root>'))[1];<br />
xpath <br />
-------<br />
&amp;lt;<br />
</pre><br />
<br />
==Remove hstore's => operator <!-- (Robert Haas)-->==<br />
Up to 9.1, one could use the => operator to create an hstore. hstore is a contrib module used to store key/value pairs in a column.<br />
<br />
In 9.1:<br />
<pre><br />
=# SELECT 'a'=>'b';<br />
?column? <br />
----------<br />
"a"=>"b"<br />
(1 row)<br />
<br />
=# SELECT pg_typeof('a'=>'b');<br />
pg_typeof <br />
-----------<br />
hstore<br />
(1 row)<br />
</pre><br />
<br />
With 9.2:<br />
<pre><br />
SELECT 'a'=>'b';<br />
ERROR: operator does not exist: unknown => unknown at character 11<br />
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.<br />
STATEMENT: SELECT 'a'=>'b';<br />
ERROR: operator does not exist: unknown => unknown<br />
LINE 1: SELECT 'a'=>'b';<br />
^<br />
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.<br />
</pre><br />
<br />
It doesn't mean one cannot use '=>' in hstores; it just isn't an operator anymore:<br />
<br />
<pre><br />
=# select hstore('a=>b');<br />
hstore <br />
----------<br />
"a"=>"b"<br />
(1 row)<br />
<br />
=# select hstore('a','b');<br />
hstore <br />
----------<br />
"a"=>"b"<br />
(1 row)<br />
</pre><br />
These are still two valid ways to input an hstore.<br />
<br />
"=>" is removed as an operator as it is a reserved keyword in SQL.<br />
<br />
<br />
==Have pg_relation_size() and friends return NULL if the object does not exist <!-- (Phil Sorber)-->==<br />
<br />
Previously, if a relation was dropped by a concurrent session while one was running pg_relation_size() on it, the query raised an SQL error. Now it merely returns NULL for that record.<br />
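<br />
A small sketch of a query that benefits from this change: listing the biggest relations while other sessions may be dropping tables. A relation that disappears mid-query now simply shows a NULL size instead of aborting the whole statement.<br />
<pre><br />
SELECT relname, pg_relation_size(oid) AS size<br />
FROM pg_class<br />
ORDER BY size DESC NULLS LAST<br />
LIMIT 5;<br />
</pre><br />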
<br />
<br />
==Remove the spclocation field from pg_tablespace <!-- (Magnus Hagander)-->==<br />
<br />
The spclocation field provided the real location of the tablespace. It was filled in during the CREATE or ALTER TABLESPACE command, so it could be wrong: somebody just had to shut down the cluster, move the tablespace's directory, re-create the symlink in pg_tblspc, and forget to update the spclocation field. The cluster would still run, as spclocation wasn't used.<br />
<br />
So this field has been removed. To get the tablespace's location, use pg_tablespace_location():<br />
<br />
<pre><br />
=# SELECT *, pg_tablespace_location(oid) AS spclocation FROM pg_tablespace;<br />
spcname | spcowner | spcacl | spcoptions | spclocation <br />
------------+----------+--------+------------+----------------<br />
pg_default | 10 | | | <br />
pg_global | 10 | | | <br />
tmptblspc | 10 | | | /tmp/tmptblspc<br />
</pre><br />
<br />
==Have EXTRACT of a non-timezone-aware value measure the epoch from local midnight, not UTC midnight <!-- (Tom Lane) -->==<br />
<br />
<br />
With PostgreSQL 9.1:<br />
<br />
<pre><br />
=#SELECT extract(epoch FROM '2012-07-02 00:00:00'::timestamp);<br />
date_part <br />
------------<br />
1341180000<br />
(1 row)<br />
<br />
=# SELECT extract(epoch FROM '2012-07-02 00:00:00'::timestamptz);<br />
date_part <br />
------------<br />
1341180000<br />
(1 row)<br />
</pre><br />
<br />
There is no difference in behaviour between a timestamp with or without timezone.<br />
<br />
With 9.2:<br />
<pre><br />
=#SELECT extract(epoch FROM '2012-07-02 00:00:00'::timestamp);<br />
date_part <br />
------------<br />
1341187200<br />
(1 row)<br />
<br />
=# SELECT extract(epoch FROM '2012-07-02 00:00:00'::timestamptz);<br />
date_part <br />
------------<br />
1341180000<br />
(1 row)<br />
</pre><br />
When the timestamp has no timezone, the epoch is now calculated from "local midnight", meaning January 1st, 1970 at midnight, local time.<br />
<br />
<br />
==Fix to_date() and to_timestamp() to wrap incomplete dates toward 2020 <!-- (Bruce Momjian)-->==<br />
<br />
The wrapping was not consistent between 2 digit and 3 digit dates: 2 digit dates always chose the date closest to 2020, while 3 digit dates mapped 100 through 999 to 1100 through 1999, and 000 through 099 to 2000 through 2099.<br />
<br />
Now PostgreSQL chooses the date closest to 2020 for both 2 and 3 digit dates.<br />
<br />
With 9.1:<br />
<pre><br />
=# SELECT to_date('200-07-02','YYY-MM-DD');<br />
to_date <br />
------------<br />
1200-07-02<br />
</pre><br />
<br />
With 9.2:<br />
<pre><br />
SELECT to_date('200-07-02','YYY-MM-DD');<br />
to_date <br />
------------<br />
2200-07-02<br />
</pre><br />
<br />
<br />
==pg_stat_activity and pg_stat_replication's definitions have changed <!--Magnus Hagander -->==<br />
<br />
The view pg_stat_activity has changed. It's not backward compatible, but let's see what this new definition brings us:<br />
<br />
* current_query disappears and is replaced by two columns:<br />
** state: what the session is doing right now (for instance «active», «idle», «idle in transaction»)<br />
** query: the last query run by the session (or the one still running, if state is «active»)<br />
* The column procpid is renamed to pid, to be consistent with other system views<br />
<br />
The benefit is mostly for tracking «idle in transaction» sessions. Up until now, all we could know was that one of these sessions was idle in transaction, meaning it had started a transaction, maybe done some operations, but not yet committed. If such a session stayed in this state for a while, there was no way of knowing how it got into this state.<br />
<br />
Here is an example:<br />
<pre><br />
-[ RECORD 1 ]----+---------------------------------<br />
datid | 16384<br />
datname | postgres<br />
pid | 20804<br />
usesysid | 10<br />
usename | postgres<br />
application_name | psql<br />
client_addr | <br />
client_hostname | <br />
client_port | -1<br />
backend_start | 2012-07-02 15:02:51.146427+02<br />
xact_start | 2012-07-02 15:15:28.386865+02<br />
query_start | 2012-07-02 15:15:30.410834+02<br />
state_change | 2012-07-02 15:15:30.411287+02<br />
waiting | f<br />
state | idle in transaction<br />
query | DELETE FROM test;<br />
</pre><br />
<br />
With PostgreSQL 9.1, all we would have would be «idle in transaction».<br />
<br />
As this change was backward-incompatible anyway, the opportunity was taken to rename procpid to pid, for consistency with other system views.<br />
The view pg_stat_replication has changed in the same way: its procpid column is also renamed to pid.<br />
<br />
==Change all SQL-level statistics timing values to float8-stored milliseconds <!-- (Tom Lane) -->==<br />
<br />
pg_stat_user_functions.total_time, pg_stat_user_functions.self_time, pg_stat_xact_user_functions.total_time, pg_stat_xact_user_functions.self_time, and pg_stat_statements.total_time (contrib) are now in milliseconds, to be consistent with the rest of the timing values.<br />
<br />
==postgresql.conf parameters changes <!-- (Heikki Linnakangas, Tom Lane, Peter Eisentraut) -->==<br />
<br />
* silent_mode has been removed. Use pg_ctl -l postmaster.log instead<br />
* wal_sender_delay has been removed. It is no longer needed<br />
* custom_variable_classes has been removed. All «classes» are accepted without declaration now<br />
* ssl_ca_file, ssl_cert_file, ssl_crl_file, ssl_key_file have been added, meaning the locations of the SSL files can now be configured<br />
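<br />
For instance, the new SSL parameters can be set like this in postgresql.conf (a sketch; the file names are examples, and the first two values are the defaults):<br />
<br />
<pre><br />
ssl_cert_file = 'server.crt'<br />
ssl_key_file = 'server.key'<br />
ssl_ca_file = 'root.crt'<br />
ssl_crl_file = 'root.crl'<br />
</pre><br />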
<br />
= Other new features =<br />
<br />
== DROP INDEX CONCURRENTLY ==<br />
<br />
The regular DROP INDEX command takes an exclusive lock on the table. Most of the time, this isn't a problem, because this lock is short-lived. The problem usually occurs when:<br />
<br />
* A long-running transaction is running, and has a (shared) lock on the table<br />
* A DROP INDEX is run on this table in another session, asking for an exclusive lock (and waiting for it, as it won't be granted until the long-running transaction ends)<br />
<br />
At this point, all other transactions needing to take a shared lock on the table (for a simple SELECT, for instance) will have to wait too: their lock acquisition is queued behind the DROP INDEX's.<br />
<br />
<br />
DROP INDEX CONCURRENTLY works around this and won't block normal DML statements, just like CREATE INDEX CONCURRENTLY. The limitations are also the same: only one index can be dropped at a time with the CONCURRENTLY option, and the CASCADE option is not supported.<br />
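<br />
Usage is straightforward (a sketch; idx_demo is a hypothetical index name):<br />
<br />
<pre><br />
=# DROP INDEX CONCURRENTLY idx_demo;<br />
DROP INDEX<br />
</pre><br />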
<br />
== NOT VALID CHECK constraints ==<br />
<br />
PostgreSQL 9.1 introduced «NOT VALID» foreign keys. This has been extended to CHECK constraints. Adding a «NOT VALID» constraint on a table means that current data won't be validated, only new and updated rows.<br />
<br />
=# CREATE TABLE test (a int); <br />
CREATE TABLE<br />
=# INSERT INTO test SELECT generate_series(1,100);<br />
INSERT 0 100<br />
=# ALTER TABLE test ADD CHECK (a>100) NOT VALID;<br />
ALTER TABLE<br />
=# INSERT INTO test VALUES (99);<br />
ERROR: new row for relation "test" violates check constraint "test_a_check"<br />
DETAIL: Failing row contains (99).<br />
=# INSERT INTO test VALUES (101);<br />
INSERT 0 1<br />
<br />
Then, later, we can validate the whole table:<br />
<br />
=# ALTER TABLE test VALIDATE CONSTRAINT test_a_check ;<br />
ERROR: check constraint "test_a_check" is violated by some row<br />
<br />
Domains, which are types with added constraints, can also be declared as not valid, and validated later.<br />
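<br />
For instance (a sketch; the domain and constraint names are made up for the example):<br />
<br />
<pre><br />
=# CREATE DOMAIN posint AS integer;<br />
CREATE DOMAIN<br />
=# ALTER DOMAIN posint ADD CONSTRAINT posint_positive CHECK (VALUE > 0) NOT VALID;<br />
ALTER DOMAIN<br />
=# ALTER DOMAIN posint VALIDATE CONSTRAINT posint_positive;<br />
ALTER DOMAIN<br />
</pre><br />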
<br />
Check constraints can also be renamed now:<br />
<br />
=# ALTER TABLE test RENAME CONSTRAINT test_a_check TO validate_a;<br />
ALTER TABLE<br />
<br />
== NO INHERIT constraints ==<br />
<br />
Here is another improvement about constraints: they can be declared as not inheritable, which will be useful in partitioned environments. Let's take the example from the PostgreSQL documentation, and see how it improves the situation:<br />
<br />
CREATE TABLE measurement (<br />
city_id int not null,<br />
logdate date not null,<br />
peaktemp int,<br />
unitsales int,<br />
CHECK (logdate IS NULL) NO INHERIT<br />
);<br />
<br />
CREATE TABLE measurement_y2006m02 (<br />
CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )<br />
) INHERITS (measurement);<br />
CREATE TABLE measurement_y2006m03 (<br />
CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )<br />
) INHERITS (measurement);<br />
<br />
<br />
INSERT INTO measurement VALUES (1,'2006-02-20',1,1);<br />
ERROR: new row for relation "measurement" violates check constraint "measurement_logdate_check"<br />
DETAIL: Failing row contains (1, 2006-02-20, 1, 1).<br />
INSERT INTO measurement_y2006m02 VALUES (1,'2006-02-20',1,1);<br />
INSERT 0 1<br />
<br />
Until now, every check constraint created on measurement would have been inherited by child tables. So adding a constraint on the parent table that forbids inserts, or allows only some of them, was impossible.<br />
<br />
== Reduce ALTER TABLE rewrites ==<br />
<br />
A table won't get rewritten anymore during an ALTER TABLE when changing the type of a column in the following cases:<br />
<br />
* varchar(x) to varchar(y) when y>=x. It works too if going from varchar(x) to varchar or text (no size limitation)<br />
* numeric(x,z) to numeric(y,z) when y>=x, or to numeric without specifier<br />
* varbit(x) to varbit(y) when y>=x, or to varbit without specifier<br />
* timestamp(x) to timestamp(y) when y>=x or timestamp without specifier<br />
* timestamptz(x) to timestamptz(y) when y>=x or timestamptz without specifier<br />
* interval(x) to interval(y) when y>=x or interval without specifier<br />
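<br />
For instance, this kind of type change no longer rewrites the table (a sketch with a made-up table):<br />
<br />
<pre><br />
=# CREATE TABLE rewrite_demo (v varchar(10));<br />
CREATE TABLE<br />
=# ALTER TABLE rewrite_demo ALTER COLUMN v TYPE varchar(20);<br />
ALTER TABLE<br />
</pre><br />
<br />
Going the other way (varchar(20) to varchar(10)) still requires checking, and possibly rewriting, the existing data.<br />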
<br />
== Security barriers and Leakproof ==<br />
<br />
This new feature has to do with view security. First, let's explain the problem, with a very simplified example:<br />
<br />
=# CREATE TABLE all_data (company_id int, company_data varchar);<br />
CREATE TABLE<br />
=# INSERT INTO all_data VALUES (1,'secret_data_for_company_1');<br />
INSERT 0 1<br />
=# INSERT INTO all_data VALUES (2,'secret_data_for_company_2');<br />
INSERT 0 1<br />
=# CREATE VIEW company1_data AS SELECT * FROM all_data WHERE company_id = 1;<br />
CREATE VIEW<br />
<br />
This is a quite classical way of giving a user access to only part of a table: we'll create a user for company_id 1, grant them the right to access company1_data, and deny them the right to access all_data.<br />
<br />
The plan for this query is the following:<br />
<br />
=# explain SELECT * FROM company1_data ;<br />
QUERY PLAN <br />
----------------------------------------------------------<br />
Seq Scan on all_data (cost=0.00..25.38 rows=6 width=36)<br />
Filter: (company_id = 1)<br />
<br />
Even if there were more data, a sequential scan could still be forced: just "SET enable_indexscan TO off" and the like.<br />
<br />
So this query reads all the records from all_data, filters them, and returns to the user only the matching rows. There is a way to display scanned records before they are filtered: just create a function with a very low cost, and call it while doing the query:<br />
<br />
CREATE OR REPLACE FUNCTION peek(text) RETURNS boolean LANGUAGE plpgsql AS<br />
$$<br />
BEGIN<br />
RAISE NOTICE '%',$1;<br />
RETURN true;<br />
END<br />
$$<br />
COST 0.1;<br />
<br />
This function just has to cost less than the = operator, which costs 1, to be executed first.<br />
<br />
The result is this:<br />
<br />
<br />
=# SELECT * FROM company1_data WHERE peek(company1_data.company_data);<br />
NOTICE: secret_data_for_company_1<br />
NOTICE: secret_data_for_company_2<br />
company_id | company_data <br />
------------+---------------------------<br />
1 | secret_data_for_company_1<br />
(1 row)<br />
<br />
We got access to the record from the second company (in the NOTICE messages).<br />
<br />
So this is the first new feature: the view can be declared as implementing "security barriers":<br />
<br />
<br />
=# CREATE VIEW company1_data WITH (security_barrier) AS SELECT * FROM all_data WHERE company_id = 1;<br />
CREATE VIEW<br />
=# SELECT * FROM company1_data WHERE peek(company1_data.company_data);<br />
NOTICE: secret_data_for_company_1<br />
company_id | company_data <br />
------------+---------------------------<br />
1 | secret_data_for_company_1<br />
(1 row)<br />
<br />
The view is not leaking anymore. The problem, of course, is that there is a performance impact: maybe the "peek" function could have made the query faster, by filtering lots of rows early in the plan.<br />
<br />
This leads to the complementary feature: some functions may be declared as "LEAKPROOF", meaning that they won't leak the data they are passed into error or notice messages.<br />
<br />
Declaring our peek function as LEAKPROOF is a very bad idea, but let's do it just to demonstrate how it's used:<br />
<br />
CREATE OR REPLACE FUNCTION peek(text) RETURNS boolean LEAKPROOF LANGUAGE plpgsql AS<br />
$$<br />
BEGIN<br />
RAISE NOTICE '%',$1;<br />
RETURN true;<br />
END<br />
$$<br />
COST 0.1;<br />
<br />
A LEAKPROOF function is executed «normally»:<br />
<br />
=# SELECT * FROM company1_data WHERE peek(company1_data.company_data);<br />
NOTICE: secret_data_for_company_1<br />
NOTICE: secret_data_for_company_2<br />
company_id | company_data <br />
------------+---------------------------<br />
1 | secret_data_for_company_1<br />
(1 row)<br />
<br />
Of course, in our case, peek isn't LEAKPROOF and shouldn't be declared as such. Only superusers have permission to declare a LEAKPROOF function.<br />
<br />
== New options for pg_dump ==<br />
<br />
Until now, one could ask pg_dump to dump a table's data, or a table's meta-data (the DDL statements creating the table's structure, indexes, constraints). Some meta-data is better restored before the data (the table's structure, check constraints), some is better restored after the data (indexes, unique constraints, foreign keys…), mostly for performance reasons.<br />
<br />
So there are now a few more options:<br />
<br />
* --section=pre-data: dump what's needed before restoring the data. Of course, this can be combined with -t, for instance, to specify one table<br />
* --section=post-data: dump what's needed after restoring the data.<br />
* --section=data: dump the data<br />
* --exclude-table-data: dump everything except the specified table's data. pg_dump will still dump the other tables' data.<br />
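<br />
For instance, a sectioned dump of a single table could look like this (a sketch; database and table names are made up):<br />
<br />
<pre><br />
$ pg_dump --section=pre-data -t invoices mydb > invoices_pre.sql<br />
$ pg_dump --section=data -t invoices mydb > invoices_data.sql<br />
$ pg_dump --section=post-data -t invoices mydb > invoices_post.sql<br />
$ pg_dump --exclude-table-data=big_log mydb > everything_but_log_data.sql<br />
</pre><br />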
<br />
<br />
<br />
<br />
<br />
[[Category:PostgreSQL 9.2]]</div>
Schmiddy
https://wiki.postgresql.org/index.php?title=What%27s_new_in_PostgreSQL_9.2&diff=18113
What's new in PostgreSQL 9.2
2012-08-29T18:41:42Z
<p>Schmiddy: grammar fixes</p>
<hr />
<div>{{Languages}}<br />
<br />
This document showcases many of the latest developments in PostgreSQL 9.2, compared to the previous major release &ndash; PostgreSQL 9.1. There are many improvements in this release, so this wiki page covers many of the more important changes in detail. The full list of changes is itemised in the ''Release Notes''.<br />
<br />
<br />
=Major new features=<br />
<br />
==Index-only scans <!-- Robert Haas, Ibrar Ahmed, Heikki Linnakangas, Tom Lane -->==<br />
<br />
In PostgreSQL, indexes have no "visibility" information. This means that when you access a record through an index, PostgreSQL has to visit the actual tuple in the table to be sure it is visible to you: the tuple the index points to may simply be an old version of the record you are looking for.<br />
<br />
This can be a very big performance problem: the index is mostly ordered, so accessing its entries is quite efficient, while the table's records may be scattered all over the place (that's one reason why PostgreSQL has a CLUSTER command, but that's another story). In 9.2, PostgreSQL will use an "Index Only Scan" when possible, and not access the record itself if it doesn't need to.<br />
<br />
There is still no visibility information in the index. So in order to do this, PostgreSQL uses the [http://www.postgresql.org/docs/devel/static/storage-vm.html visibility map], which tells it whether the whole content of a (usually) 8K page is visible to all transactions or not. When the index record points to a tuple contained in an «all visible» page, PostgreSQL won't have to access the tuple: it can build the result directly from the index. Of course, all the columns requested by the query must be in the index.<br />
<br />
The visibility map is maintained by VACUUM (it sets the visible bit), and by the backends doing SQL work (they unset the visible bit).<br />
<br />
If the data has been read only since the last VACUUM then the data is All Visible and the index only scan feature can improve performance.<br />
<br />
Here is an example.<br />
<br />
CREATE TABLE demo_ios (col1 float, col2 float, col3 text);<br />
<br />
In this table, we'll put random data, in order to have "scattered" data. We'll insert 100 million records, to have a big recordset that doesn't fit in memory (that's a machine with 4GB of RAM). This is an ideal case, made for this demo. The gains won't be that big in real life.<br />
<br />
INSERT INTO demo_ios SELECT generate_series(1,100000000),random(), 'mynotsolongstring';<br />
<br />
SELECT pg_size_pretty(pg_total_relation_size('demo_ios'));<br />
pg_size_pretty <br />
----------------<br />
6512 MB<br />
<br />
Let's pretend that the query is this:<br />
<br />
SELECT col1,col2 FROM demo_ios where col2 BETWEEN 0.01 AND 0.02<br />
<br />
In order to use an index only scan on this query, we need an index on col2,col1 (col2 first, as it is used in the WHERE clause).<br />
<br />
CREATE index idx_demo_ios on demo_ios(col2,col1);<br />
<br />
We vacuum the visibility map to be up-to-date:<br />
<br />
VACUUM demo_ios;<br />
<br />
All the timings you'll see below were taken with a cold OS and PostgreSQL cache (that's where the gains are, as the purpose of Index Only Scans is to reduce I/O).<br />
<br />
Let's first try without Index Only Scans:<br />
<br />
SET enable_indexonlyscan to off;<br />
<br />
EXPLAIN (analyze,buffers) select col1,col2 FROM demo_ios where col2 between 0.01 and 0.02;<br />
QUERY PLAN <br />
----------------------------------------------------------------------------------------------------------------------------------------<br />
Bitmap Heap Scan on demo_ios (cost=25643.01..916484.44 rows=993633 width=16) (actual time=763.391..362963.899 rows=1000392 loops=1)<br />
Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))<br />
Rows Removed by Index Recheck: 68098621<br />
Buffers: shared hit=2 read=587779<br />
-> Bitmap Index Scan on idx_demo_ios (cost=0.00..25394.60 rows=993633 width=0) (actual time=759.011..759.011 rows=1000392 loops=1)<br />
Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))<br />
Buffers: shared hit=2 read=3835<br />
Total runtime: 364390.127 ms<br />
<br />
<br />
With Index Only Scans:<br />
<br />
explain (analyze,buffers) select col1,col2 from demo_ios where col2 between 0.01 and 0.02;<br />
QUERY PLAN <br />
-----------------------------------------------------------------------------------------------------------------------------------------------<br />
Index Only Scan using idx_demo_ios on demo_ios (cost=0.00..35330.93 rows=993633 width=16) (actual time=58.100..3250.589 rows=1000392 loops=1)<br />
Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))<br />
Heap Fetches: 0<br />
Buffers: shared hit=923073 read=3848<br />
Total runtime: 4297.405 ms<br />
<br />
<br />
<br />
As nothing is free, there are a few things to keep in mind:<br />
<br />
* Adding indexes for index only scans obviously adds indexes to your table. So updates will be slower.<br />
* You will index columns that weren't indexed before. So there will be fewer opportunities for HOT updates.<br />
* Gains will probably be smaller in real life situations, especially when data is changed between VACUUMs<br />
<br />
This required making visibility map changes crash-safe, so visibility map bit changes are now WAL-logged.<br />
<br />
==Replication improvements <!-- Fujii Masao, Simon Riggs, Magnus Hagander, Jun Ishizuka -->==<br />
<br />
Streaming Replication becomes more polished with this release. <br />
<br />
One of the main remaining gripes about streaming replication is that all the slaves have to be connected to the same, unique master, consuming its resources. Moreover, in case of a failover, it could be complicated to reconnect all the remaining slaves to the newly promoted master, if not using a tool like repmgr. <br />
<br />
* With 9.2, a standby can also send replication changes, allowing cascading replication.<br />
<br />
Let's build this. We start with an already working 9.2 database.<br />
<br />
We set it up for replication:<br />
<br />
postgresql.conf:<br />
wal_level=hot_standby #(could be archive too)<br />
max_wal_senders=5<br />
hot_standby=on<br />
<br />
You'll probably also want to activate archiving in production, it won't be done here.<br />
<br />
pg_hba.conf (do not use trust in production):<br />
host replication replication_user 0.0.0.0/0 md5<br />
<br />
Create the user:<br />
create user replication_user replication password 'secret';<br />
<br />
Clone the database:<br />
<br />
pg_basebackup -h localhost -U replication_user -D data2<br />
Password:<br />
<br />
We have a brand new cluster in the data2 directory. We'll change the port so that it can start (postgresql.conf):<br />
port=5433<br />
<br />
We add a recovery.conf to tell it how to stream from the master database:<br />
standby_mode = on<br />
primary_conninfo = 'host=localhost port=5432 user=replication_user password=secret' <br />
<br />
pg_ctl -D data2 start<br />
server starting<br />
LOG: database system was interrupted; last known up at 2012-07-03 17:58:09 CEST<br />
LOG: creating missing WAL directory "pg_xlog/archive_status"<br />
LOG: entering standby mode<br />
LOG: streaming replication successfully connected to primary<br />
LOG: redo starts at 0/9D000020<br />
LOG: consistent recovery state reached at 0/9D0000B8<br />
LOG: database system is ready to accept read only connections<br />
<br />
Now, let's add a second slave, which will use this slave:<br />
<br />
<br />
pg_basebackup -h localhost -U replication_user -D data3 -p 5433<br />
Password: <br />
<br />
We edit data3's postgresql.conf to change the port:<br />
port=5434<br />
<br />
We modify the recovery.conf to stream from the slave:<br />
standby_mode = on<br />
primary_conninfo = 'host=localhost port=5433 user=replication_user password=secret' # e.g. 'host=localhost port=5432'<br />
<br />
We start the cluster:<br />
pg_ctl -D data3 start<br />
server starting<br />
LOG: database system was interrupted while in recovery at log time 2012-07-03 17:58:09 CEST<br />
HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.<br />
LOG: creating missing WAL directory "pg_xlog/archive_status"<br />
LOG: entering standby mode<br />
LOG: streaming replication successfully connected to primary<br />
LOG: redo starts at 0/9D000020<br />
LOG: consistent recovery state reached at 0/9E000000<br />
LOG: database system is ready to accept read only connections<br />
<br />
Now, everything modified on the master cluster gets streamed to the first slave, and from there to the second slave. This second replication link has to be monitored from the first slave (the master knows nothing about it).<br />
<br />
<br />
* As you may have noticed from the example, pg_basebackup now works from slaves.<br />
<br />
* There is another use case that wasn't covered: what if a user didn't care for having a full-fledged slave, and only wanted to stream the WAL files to another location, to benefit from the reduced data loss without the burden of maintaining a slave?<br />
<br />
pg_receivexlog is provided just for this purpose: it pretends to be a PostgreSQL slave, but only stores the log files as they are streamed, in a directory:<br />
pg_receivexlog -D /tmp/new_logs -h localhost -U replication_user<br />
<br />
will connect to the master (or a slave), and start creating files: <br />
ls /tmp/new_logs/<br />
00000001000000000000009E.partial<br />
<br />
Files are of the segment size, so they can be used for a normal recovery of the database. It's the same as an archive command, but with a much smaller granularity.<br />
Remember to rename the last segment to remove the .partial suffix before using it for PITR or similar.<br />
<br />
* synchronous_commit has a new value: remote_write. It can be used when there is a synchronous slave (synchronous_standby_names is set), meaning that the master doesn't have to wait for the slave to have written the data to disk, only for the slave to have acknowledged the data. With this setting, data is protected from a crash on the master, but could still be lost if the slave crashed at the same time (i.e. before having written the in-flight data to disk). As this is quite a remote possibility, some people will be interested in this compromise.<br />
<br />
==JSON datatype==<br />
<br />
The JSON datatype is meant for storing JSON-structured data. It will validate that the input JSON string is correct JSON:<br />
<br />
=# SELECT '{"username":"john","posts":121,"emailaddress":"john@nowhere.com"}'::json;<br />
json <br />
-------------------------------------------------------------------<br />
{"username":"john","posts":121,"emailaddress":"john@nowhere.com"}<br />
(1 row)<br />
<br />
=# SELECT '{"username","posts":121,"emailaddress":"john@nowhere.com"}'::json;<br />
ERROR: invalid input syntax for type json at character 8<br />
DETAIL: Expected ":", but found ",".<br />
CONTEXT: JSON data, line 1: {"username",...<br />
STATEMENT: SELECT '{"username","posts":121,"emailaddress":"john@nowhere.com"}'::json;<br />
ERROR: invalid input syntax for type json<br />
LINE 1: SELECT '{"username","posts":121,"emailaddress":"john@nowhere...<br />
^<br />
DETAIL: Expected ":", but found ",".<br />
CONTEXT: JSON data, line 1: {"username",...<br />
<br />
You can also convert a row type to JSON:<br />
<br />
=#SELECT * FROM demo ;<br />
username | posts | emailaddress <br />
----------+-------+---------------------<br />
john | 121 | john@nowhere.com<br />
mickael | 215 | mickael@nowhere.com<br />
(2 rows)<br />
<br />
=# SELECT row_to_json(demo) FROM demo;<br />
row_to_json <br />
-------------------------------------------------------------------------<br />
{"username":"john","posts":121,"emailaddress":"john@nowhere.com"}<br />
{"username":"mickael","posts":215,"emailaddress":"mickael@nowhere.com"}<br />
(2 rows)<br />
<br />
Or an array type:<br />
<br />
<br />
=# select array_to_json(array_agg(demo)) from demo;<br />
array_to_json <br />
---------------------------------------------------------------------------------------------------------------------------------------------<br />
[{"username":"john","posts":121,"emailaddress":"john@nowhere.com"},{"username":"mickael","posts":215,"emailaddress":"mickael@nowhere.com"}]<br />
(1 row)<br />
<br />
== Range Types ==<br />
<br />
Range types are used to store a range of data of a given type. There are a few pre-defined types. They are integer (int4range), bigint (int8range), numeric (numrange), timestamp without timezone (tsrange), timestamp with timezone (tstzrange), and date (daterange).<br />
<br />
Ranges can be made of continuous (numeric, timestamp...) or discrete (integer, date...) data types. They can be open (the bound isn't part of the range) or closed (the bound is part of the range). A bound can also be infinite.<br />
<br />
Without these datatypes, most people solve the range problems by using two columns in a table. These range types are much more powerful, as you can use many operators on them:<br />
<br />
Here is the intersection between the 1000 (open) to 2000 (closed) and the 1000 (closed) to 1200 (closed) numeric ranges:<br />
<br />
SELECT '(1000,2000]'::numrange * '[1000,1200]'::numrange;<br />
?column? <br />
-------------<br />
(1000,1200]<br />
(1 row)<br />
<br />
So you can query on things like: «give me all ranges that intersect this»:<br />
<br />
=# SELECT * from test_range ;<br />
period <br />
-----------------------------------------------------<br />
["2012-01-01 00:00:00+01","2012-01-02 12:00:00+01"]<br />
["2012-01-01 00:00:00+01","2012-03-01 00:00:00+01"]<br />
["2008-01-01 00:00:00+01","2015-01-01 00:00:00+01"]<br />
(3 rows)<br />
<br />
<br />
=# SELECT * FROM test_range WHERE period && '[2012-01-03 00:00:00,2012-01-03 12:00:00]'; <br />
period <br />
-----------------------------------------------------<br />
["2012-01-01 00:00:00+01","2012-03-01 00:00:00+01"]<br />
["2008-01-01 00:00:00+01","2015-01-01 00:00:00+01"]<br />
(2 rows)<br />
<br />
This query could use an index defined like this:<br />
<br />
=# CREATE INDEX idx_test_range on test_range USING gist (period);<br />
<br />
<br />
<br />
You can also use these range data types to define exclusion constraints:<br />
<br />
CREATE EXTENSION btree_gist ;<br />
CREATE TABLE reservation (room_id int, period tstzrange);<br />
ALTER TABLE reservation ADD EXCLUDE USING GIST (room_id WITH =, period WITH &&);<br />
<br />
This means that it is now forbidden to have two records in this table where room_id is equal and period overlaps. The extension btree_gist is required to create a GiST index on room_id (as it's an integer, it is usually indexed with a btree index).<br />
<br />
<br />
=# INSERT INTO reservation VALUES (1,'(2012-08-23 14:00:00,2012-08-23 15:00:00)');<br />
INSERT 0 1<br />
=# INSERT INTO reservation VALUES (2,'(2012-08-23 14:00:00,2012-08-23 15:00:00)');<br />
INSERT 0 1<br />
=# INSERT INTO reservation VALUES (1,'(2012-08-23 14:45:00,2012-08-23 15:15:00)');<br />
ERROR: conflicting key value violates exclusion constraint "reservation_room_id_period_excl"<br />
DETAIL: Key (room_id, period)=(1, ("2012-08-23 14:45:00+02","2012-08-23 15:15:00+02")) <br />
conflicts with existing key (room_id, period)=(1, ("2012-08-23 14:00:00+02","2012-08-23 15:00:00+02")).<br />
STATEMENT: INSERT INTO reservation VALUES (1,'(2012-08-23 14:45:00,2012-08-23 15:15:00)');<br />
<br />
One can also declare new range types.<br />
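<br />
For instance, here is the floatrange example from the CREATE TYPE documentation (the optional subtype_diff function helps GiST indexing on the new range type):<br />
<br />
<pre><br />
=# CREATE TYPE floatrange AS RANGE (<br />
    subtype = float8,<br />
    subtype_diff = float8mi<br />
);<br />
CREATE TYPE<br />
=# SELECT '[1.234, 5.678]'::floatrange;<br />
  floatrange   <br />
---------------<br />
 [1.234,5.678]<br />
(1 row)<br />
</pre><br />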
<br />
=Performance improvements=<br />
<br />
This version has performance improvements in a very large range of areas (non-exhaustive list):<br />
<br />
* The most visible will probably be the Index Only Scans, which has already been introduced in this document.<br />
<br />
* The lock contention of several big locks has been significantly reduced, leading to better multi-processor scalability, for machines with over 32 cores mostly. <!-- Robert Haas --><br />
<br />
* The performance of in-memory sorts has been improved by up to 25% in some situations, with certain specialized sort functions introduced. <!-- Peter Geoghegan --><br />
<br />
* An idle PostgreSQL server now makes fewer wakeups, leading to lower power consumption<!--Peter Geoghegan-->. This is especially useful in virtualized and embedded environments.<br />
<br />
* COPY has been improved: it generates less WAL volume and takes fewer locks on a table's pages. <!-- Heikki Linnakangas --><br />
<br />
* Statistics are collected on array contents<!-- Alexander Korotkov -->, allowing for better estimations of selectivity on array operations.<br />
<br />
* Text-to-anytype concatenation and quote_literal/quote_nullable functions are not volatile any more, enabling better optimization in some cases <!-- Marti Raudsepp --><br />
<br />
* The system can now track IO durations <!--Ants Aasma --><br />
<br />
This one deserves a little explanation, as it can be a little tricky. Tracking IO durations means repeatedly asking the operating system for the time. Depending on the operating system and the hardware, this can be quite cheap, or extremely costly. The most important factor here is where the system gets its time from. It could be retrieved directly from the processor (TSC), from dedicated hardware such as HPET, or through an ACPI call. What's most important is that the cost of getting the time can vary by a factor of thousands.<br />
<br />
If you are interested in this timing data, it's better to first check whether your system will support it without too much of a performance hit. PostgreSQL provides you with the pg_test_timing tool:<br />
<br />
<pre><br />
$ pg_test_timing <br />
Testing timing overhead for 3 seconds.<br />
Per loop time including overhead: 28.02 nsec<br />
Histogram of timing durations:<br />
< usec: count percent<br />
32: 41 0.00004%<br />
16: 1405 0.00131%<br />
8: 200 0.00019%<br />
4: 388 0.00036%<br />
2: 2982558 2.78523%<br />
1: 104100166 97.21287%<br />
</pre><br />
<br />
Here, everything is good: getting time costs around 28 nanoseconds, and has a very small variation. Anything under 100 nanoseconds should be good for production. If you get higher values, you may still find a way to tune your system. You'd better check on the [http://www.postgresql.org/docs/9.2/static/pgtesttiming.html documentation].<br />
<br />
Anyway, here is the data you'll be able to collect if your system is ready for this:<br />
<br />
First, you'll get per-database statistics, which now give accurate information about which database is doing the most I/O:<br />
<br />
<pre><br />
=# SELECT * FROM pg_stat_database WHERE datname = 'mydb';<br />
-[ RECORD 1 ]--+------------------------------<br />
datid | 16384<br />
datname | mydb<br />
numbackends | 1<br />
xact_commit | 270<br />
xact_rollback | 2<br />
blks_read | 1961<br />
blks_hit | 17944<br />
tup_returned | 269035<br />
tup_fetched | 8850<br />
tup_inserted | 16<br />
tup_updated | 4<br />
tup_deleted | 45<br />
conflicts | 0<br />
temp_files | 0<br />
temp_bytes | 0<br />
deadlocks | 0<br />
blk_read_time | 583.774<br />
blk_write_time | 0<br />
stats_reset | 2012-07-03 17:18:54.796817+02<br />
</pre><br />
We see here that mydb has only consumed 583.774 milliseconds of read time.<br />
<br />
Explain will benefit from this too:<br />
<pre><br />
=# EXPLAIN (analyze,buffers) SELECT count(*) FROM mots ;<br />
QUERY PLAN<br />
----------------------------------------------------------------------------------------------------------------<br />
Aggregate (cost=1669.95..1669.96 rows=1 width=0) (actual time=21.943..21.943 rows=1 loops=1)<br />
Buffers: shared read=493<br />
I/O Timings: read=2.578<br />
-> Seq Scan on mots (cost=0.00..1434.56 rows=94156 width=0) (actual time=0.059..12.933 rows=94156 loops=1)<br />
Buffers: shared read=493<br />
I/O Timings: read=2.578<br />
Total runtime: 22.059 ms<br />
</pre><br />
We now have separate information about the time taken to retrieve data from the operating system. Obviously, here, the data was in the operating system's cache (2.578 milliseconds to read 493 blocks).<br />
<br />
And last, if you have enabled pg_stat_statements:<br />
<pre><br />
select * from pg_stat_statements where query ~ 'words';<br />
-[ RECORD 1 ]-------+---------------------------<br />
userid | 10<br />
dbid | 16384<br />
query | select count(*) from words;<br />
calls | 2<br />
total_time | 78.332<br />
rows | 2<br />
shared_blks_hit | 0<br />
shared_blks_read | 986<br />
shared_blks_dirtied | 0<br />
shared_blks_written | 0<br />
local_blks_hit | 0<br />
local_blks_read | 0<br />
local_blks_dirtied | 0<br />
local_blks_written | 0<br />
temp_blks_read | 0<br />
temp_blks_written | 0<br />
blk_read_time | 58.427<br />
blk_write_time | 0<br />
</pre><br />
<br />
* As with every release, the optimizer has received its share of improvements <!-- Tom Lane--><br />
** Prepared statements used to be optimized once, without any knowledge of the parameters' values. With 9.2, the planner uses plans specific to the parameter values sent (the query is planned at execution time), unless the query has been executed several times and the planner decides that the generic plan is not much more expensive than the specific plans.<br />
** A new feature has been added: parameterized paths. Simply put, it means that a sub-part of a query plan can use parameters obtained from a parent node. This fixes several bad plans that could occur, especially when the optimizer couldn't reorder joins to put nested loops where they would have been efficient.<br />
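The new prepared-statement behavior can be observed directly; here is a minimal sketch (the table name words is made up for the illustration):<br />
<br />
<pre><br />
=# PREPARE get_word(text) AS SELECT * FROM words WHERE word = $1;<br />
PREPARE<br />
=# EXPLAIN EXECUTE get_word('foo');  -- planned at execution time, with the actual value<br />
</pre><br />
<br />
On the first executions, the plan shown uses the supplied value; the planner only switches to a cached generic plan if it decides that plan is about as cheap as the custom ones.<br />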
<br />
Here is an example of parameterized paths, straight from the developers' mailing lists <!-- Andres Freund -->:<br />
<br />
<pre><br />
CREATE TABLE a (<br />
a_id serial PRIMARY KEY NOT NULL,<br />
b_id integer<br />
);<br />
CREATE INDEX a__b_id ON a USING btree (b_id);<br />
<br />
<br />
CREATE TABLE b (<br />
b_id serial NOT NULL,<br />
c_id integer<br />
);<br />
CREATE INDEX b__c_id ON b USING btree (c_id);<br />
<br />
<br />
CREATE TABLE c (<br />
c_id serial PRIMARY KEY NOT NULL,<br />
value integer UNIQUE<br />
);<br />
<br />
INSERT INTO b (b_id, c_id)<br />
SELECT g.i, g.i FROM generate_series(1, 50000) g(i);<br />
<br />
INSERT INTO a(b_id)<br />
SELECT g.i FROM generate_series(1, 50000) g(i);<br />
<br />
INSERT INTO c(c_id,value)<br />
VALUES (1,1);<br />
</pre><br />
<br />
So we have a referencing b, and b referencing c.<br />
<br />
Here is an example of a query working badly with PostgreSQL 9.1:<br />
<br />
<pre><br />
EXPLAIN ANALYZE SELECT 1<br />
FROM<br />
c<br />
WHERE<br />
EXISTS (<br />
SELECT *<br />
FROM a<br />
JOIN b USING (b_id)<br />
WHERE b.c_id = c.c_id)<br />
AND c.value = 1;<br />
QUERY PLAN<br />
-----------------------------------------------------------------------------------------------------------------------<br />
Nested Loop Semi Join (cost=1347.00..3702.27 rows=1 width=0) (actual time=13.799..13.802 rows=1 loops=1)<br />
Join Filter: (c.c_id = b.c_id)<br />
-> Index Scan using c_value_key on c (cost=0.00..8.27 rows=1 width=4) (actual time=0.006..0.008 rows=1 loops=1)<br />
Index Cond: (value = 1)<br />
-> Hash Join (cost=1347.00..3069.00 rows=50000 width=4) (actual time=13.788..13.788 rows=1 loops=1)<br />
Hash Cond: (a.b_id = b.b_id)<br />
-> Seq Scan on a (cost=0.00..722.00 rows=50000 width=4) (actual time=0.007..0.007 rows=1 loops=1)<br />
-> Hash (cost=722.00..722.00 rows=50000 width=8) (actual time=13.760..13.760 rows=50000 loops=1)<br />
Buckets: 8192 Batches: 1 Memory Usage: 1954kB<br />
-> Seq Scan on b (cost=0.00..722.00 rows=50000 width=8) (actual time=0.008..5.702 rows=50000 loops=1)<br />
Total runtime: 13.842 ms<br />
</pre><br />
<br />
Not that bad: 13 milliseconds. Still, we are doing sequential scans on a and b, when common sense tells us that c.value = 1 should be used to filter rows more aggressively.<br />
<br />
Here's what 9.2 does with this query:<br />
<br />
<pre><br />
QUERY PLAN<br />
----------------------------------------------------------------------------------------------------------------------------<br />
Nested Loop Semi Join (cost=0.00..16.97 rows=1 width=0) (actual time=0.035..0.037 rows=1 loops=1)<br />
-> Index Scan using c_value_key on c (cost=0.00..8.27 rows=1 width=4) (actual time=0.007..0.009 rows=1 loops=1)<br />
Index Cond: (value = 1)<br />
-> Nested Loop (cost=0.00..8.69 rows=1 width=4) (actual time=0.025..0.025 rows=1 loops=1)<br />
-> Index Scan using b__c_id on b (cost=0.00..8.33 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=1)<br />
Index Cond: (c_id = c.c_id)<br />
-> Index Only Scan using a__b_id on a (cost=0.00..0.35 rows=1 width=4) (actual time=0.014..0.014 rows=1 loops=1)<br />
Index Cond: (b_id = b.b_id)<br />
Total runtime: 0.089 ms<br />
</pre><br />
<br />
The «parameterized path» is:<br />
<pre><br />
-> Nested Loop (cost=0.00..8.69 rows=1 width=4) (actual time=0.025..0.025 rows=1 loops=1)<br />
-> Index Scan using b__c_id on b (cost=0.00..8.33 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=1)<br />
Index Cond: (c_id = c.c_id)<br />
-> Index Only Scan using a__b_id on a (cost=0.00..0.35 rows=1 width=4) (actual time=0.014..0.014 rows=1 loops=1)<br />
Index Cond: (b_id = b.b_id)<br />
Total runtime: 0.089 ms<br />
</pre><br />
<br />
This part of the plan depends on its parent node (c_id = c.c_id): it is executed each time with a different parameter value coming from the parent node.<br />
<br />
This plan is of course much faster, as there is no need to fully scan a, nor to fully scan and hash b.<br />
<br />
=SP-GiST=<br />
<br />
SP-GiST stands for Space-Partitioned GiST, GiST being the Generalized Search Tree. GiST is an index type that has been available for quite a while in PostgreSQL. GiST is already very efficient at indexing complex data types, but performance tends to suffer when the source data isn't uniformly distributed. SP-GiST aims to fix that.<br />
<br />
Like all indexing methods available in PostgreSQL, SP-GiST is a generic indexing method: its purpose is to index whatever you throw at it, using operators you provide. This means that if you want to create a new datatype and make it indexable through SP-GiST, you'll have to follow the documented API.<br />
<br />
SP-GiST can be used to implement three types of indexes: trie (suffix) indexing, quadtree (data is divided into quadrants), and k-d tree (k-dimensional tree).<br />
<br />
For now, SP-GiST is provided with operator families called "quad_point_ops", "kd_point_ops" and "text_ops".<br />
<br />
As their names indicate, the first one indexes point types using a quadtree, the second one indexes point types using a k-d tree, and the third one indexes text, using suffixes.<br />
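Creating an SP-GiST index just uses the spgist access method name; a minimal sketch (table and column names are made up):<br />
<br />
<pre><br />
=# CREATE TABLE points (p point);<br />
CREATE TABLE<br />
=# CREATE INDEX points_quad_idx ON points USING spgist (p);            -- quad_point_ops by default<br />
CREATE INDEX<br />
=# CREATE INDEX points_kd_idx ON points USING spgist (p kd_point_ops); -- explicit operator class<br />
CREATE INDEX<br />
</pre><br />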
<br />
=pg_stat_statements=<br />
<br />
This contrib module has received a lot of improvements in this version:<br />
<br />
* Queries are normalized: queries that are identical except for their constant values are considered the same, as long as their post-parse-analysis query trees (that is, the internal representation of the query before rule expansion) are the same. This also implies that differences that are not semantically essential to the query, such as variations in whitespace or alias names, or the use of one syntax over another equivalent one, will not differentiate queries.<br />
<br />
<pre><br />
=#SELECT * FROM words WHERE word= 'foo';<br />
word <br />
------<br />
(0 rows)<br />
<br />
=# SELECT * FROM words WHERE word= 'bar';<br />
word <br />
------<br />
bar<br />
<br />
=#select * from pg_stat_statements where query like '%words where%';<br />
-[ RECORD 1 ]-------+-----------------------------------<br />
userid | 10<br />
dbid | 16384<br />
query | SELECT * FROM words WHERE word= ?;<br />
calls | 2<br />
total_time | 142.314<br />
rows | 1<br />
shared_blks_hit | 3<br />
shared_blks_read | 5<br />
shared_blks_dirtied | 0<br />
shared_blks_written | 0<br />
local_blks_hit | 0<br />
local_blks_read | 0<br />
local_blks_dirtied | 0<br />
local_blks_written | 0<br />
temp_blks_read | 0<br />
temp_blks_written | 0<br />
blk_read_time | 142.165<br />
blk_write_time | 0<br />
<br />
</pre><br />
<br />
The two queries are shown as one in pg_stat_statements.<br />
<br />
* For prepared statements, the execution part (the EXECUTE statement) is charged to the PREPARE statement. This makes the view easier to use, and avoids the double counting that occurred with PostgreSQL 9.1.<br />
<br />
* pg_stat_statements displays timing in milliseconds, to be consistent with other system views.<br />
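As a reminder, the module still has to be loaded at server start and installed as an extension before these statistics are collected; a minimal sketch:<br />
<br />
<pre><br />
# postgresql.conf<br />
shared_preload_libraries = 'pg_stat_statements'<br />
<br />
-- then, after a restart, in the target database:<br />
=# CREATE EXTENSION pg_stat_statements;<br />
CREATE EXTENSION<br />
</pre><br />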
<br />
= Explain improvements=<br />
<br />
* Timing can now be disabled with EXPLAIN (analyze on, timing off), leading to lower overhead on platforms where getting the current time is expensive <!--Tomas Vondra--><br />
<br />
=# EXPLAIN (analyze on,timing off) SELECT * FROM reservation ;<br />
QUERY PLAN <br />
----------------------------------------------------------------------------------------<br />
Seq Scan on reservation (cost=0.00..22.30 rows=1230 width=36) (actual rows=2 loops=1)<br />
Total runtime: 0.045 ms<br />
<br />
<br />
<br />
* Have EXPLAIN ANALYZE report the number of rows rejected by filter steps <!--(Marko Tiikkaja)--><br />
<br />
This new feature makes it much easier to know how many rows are removed by a filter (and spot potential places to put indexes):<br />
<br />
=# EXPLAIN ANALYZE SELECT * FROM test WHERE a ~ 'tra';<br />
QUERY PLAN <br />
---------------------------------------------------------------------------------------------------------------<br />
Seq Scan on test (cost=0.00..106876.56 rows=2002 width=11) (actual time=2.914..8538.285 rows=120256 loops=1)<br />
Filter: (a ~ 'tra'::text)<br />
Rows Removed by Filter: 5905600<br />
Total runtime: 8549.539 ms<br />
(4 rows)<br />
<br />
=Backward compatibility=<br />
<br />
These changes may incur regressions in your applications.<br />
<br />
==Ensure that xpath() escapes special characters in string values <!-- (Florian Pflug)--> ==<br />
<br />
Before 9.2:<br />
<pre><br />
SELECT (XPATH('/*/text()', '<root>&amp;lt;</root>'))[1];<br />
xpath <br />
-------<br />
<<br />
<br />
The returned '<' isn't valid XML.<br />
</pre><br />
With 9.2:<br />
<pre><br />
SELECT (XPATH('/*/text()', '<root>&amp;lt;</root>'))[1];<br />
xpath <br />
-------<br />
&amp;lt;<br />
</pre><br />
<br />
==Remove hstore's => operator <!-- (Robert Haas)-->==<br />
Up to 9.1, one could use the => operator to create an hstore. hstore is a contrib module used to store key/value pairs in a column.<br />
<br />
In 9.1:<br />
<pre><br />
=# SELECT 'a'=>'b';<br />
?column? <br />
----------<br />
"a"=>"b"<br />
(1 row)<br />
<br />
=# SELECT pg_typeof('a'=>'b');<br />
pg_typeof <br />
-----------<br />
hstore<br />
(1 row)<br />
</pre><br />
<br />
With 9.2:<br />
<pre><br />
SELECT 'a'=>'b';<br />
ERROR: operator does not exist: unknown => unknown at character 11<br />
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.<br />
STATEMENT: SELECT 'a'=>'b';<br />
ERROR: operator does not exist: unknown => unknown<br />
LINE 1: SELECT 'a'=>'b';<br />
^<br />
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.<br />
</pre><br />
<br />
It doesn't mean one cannot use '=>' in hstores; it just isn't an operator anymore:<br />
<br />
<pre><br />
=# select hstore('a=>b');<br />
hstore <br />
----------<br />
"a"=>"b"<br />
(1 row)<br />
<br />
=# select hstore('a','b');<br />
hstore <br />
----------<br />
"a"=>"b"<br />
(1 row)<br />
</pre><br />
These are still two valid ways to input an hstore.<br />
<br />
"=>" is removed as an operator as it is a reserved keyword in SQL.<br />
<br />
<br />
==Have pg_relation_size() and friends return NULL if the object does not exist <!-- (Phil Sorber)-->==<br />
<br />
A relation could be dropped by a concurrent session while another session was running pg_relation_size() on it, leading to an SQL exception. Now, the function merely returns NULL for that record.<br />
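For instance, with an OID that no longer corresponds to any relation (the OID below is made up), the function now returns NULL instead of raising an error:<br />
<br />
<pre><br />
=# SELECT pg_relation_size(123456);<br />
 pg_relation_size <br />
------------------<br />
                  <br />
(1 row)<br />
</pre><br />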
<br />
<br />
==Remove the spclocation field from pg_tablespace <!-- (Magnus Hagander)-->==<br />
<br />
The spclocation field provided the real location of the tablespace. It was filled in during the CREATE or ALTER TABLESPACE command, so it could be wrong: somebody just had to shut down the cluster, move the tablespace's directory, re-create the symlink in pg_tblspc, and forget to update the spclocation field. The cluster would still run, as spclocation wasn't used.<br />
<br />
So this field has been removed. To get the tablespace's location, use pg_tablespace_location():<br />
<br />
<pre><br />
=# SELECT *, pg_tablespace_location(oid) AS spclocation FROM pg_tablespace;<br />
spcname | spcowner | spcacl | spcoptions | spclocation <br />
------------+----------+--------+------------+----------------<br />
pg_default | 10 | | | <br />
pg_global | 10 | | | <br />
tmptblspc | 10 | | | /tmp/tmptblspc<br />
</pre><br />
<br />
==Have EXTRACT of a non-timezone-aware value measure the epoch from local midnight, not UTC midnight <!-- (Tom Lane) -->==<br />
<br />
<br />
With PostgreSQL 9.1:<br />
<br />
<pre><br />
=#SELECT extract(epoch FROM '2012-07-02 00:00:00'::timestamp);<br />
date_part <br />
------------<br />
1341180000<br />
(1 row)<br />
<br />
=# SELECT extract(epoch FROM '2012-07-02 00:00:00'::timestamptz);<br />
date_part <br />
------------<br />
1341180000<br />
(1 row)<br />
</pre><br />
<br />
There is no difference in behaviour between a timestamp with or without a time zone.<br />
<br />
With 9.2:<br />
<pre><br />
=#SELECT extract(epoch FROM '2012-07-02 00:00:00'::timestamp);<br />
date_part <br />
------------<br />
1341187200<br />
(1 row)<br />
<br />
=# SELECT extract(epoch FROM '2012-07-02 00:00:00'::timestamptz);<br />
date_part <br />
------------<br />
1341180000<br />
(1 row)<br />
</pre><br />
When the timestamp has no time zone, the epoch is now calculated from the "local midnight", meaning January 1st, 1970 at midnight, local time.<br />
<br />
<br />
==Fix to_date() and to_timestamp() to wrap incomplete dates toward 2020 <!-- (Bruce Momjian)-->==<br />
<br />
The wrapping was not consistent between two-digit and three-digit years: two-digit years always chose the date closest to 2020, while three-digit years mapped 100 to 999 onto 1100 to 1999, and 000 to 099 onto 2000 to 2099.<br />
<br />
Now PostgreSQL chooses the date closest to 2020 for both two- and three-digit years.<br />
<br />
With 9.1:<br />
<pre><br />
=# SELECT to_date('200-07-02','YYY-MM-DD');<br />
to_date <br />
------------<br />
1200-07-02<br />
</pre><br />
<br />
With 9.2:<br />
<pre><br />
SELECT to_date('200-07-02','YYY-MM-DD');<br />
to_date <br />
------------<br />
2200-07-02<br />
</pre><br />
<br />
<br />
==pg_stat_activity and pg_stat_replication's definitions have changed <!--Magnus Hagander -->==<br />
<br />
The view pg_stat_activity has changed. It's not backward compatible, but let's see what this new definition brings us:<br />
<br />
* current_query disappears, and is replaced by two columns:<br />
** state: what the session is currently doing (active, idle, idle in transaction…)<br />
** query: the last query run (or the currently running query, if state is "active")<br />
* The column procpid is renamed to pid, to be consistent with other system views<br />
<br />
The benefit is mostly for tracking «idle in transaction» sessions. Up until now, all we could know was that such a session was idle in transaction, meaning it had started a transaction, maybe performed some operations, but still not committed. If the session stayed in this state for a while, there was no way of knowing how it got there.<br />
<br />
Here is an example:<br />
<pre><br />
-[ RECORD 1 ]----+---------------------------------<br />
datid | 16384<br />
datname | postgres<br />
pid | 20804<br />
usesysid | 10<br />
usename | postgres<br />
application_name | psql<br />
client_addr | <br />
client_hostname | <br />
client_port | -1<br />
backend_start | 2012-07-02 15:02:51.146427+02<br />
xact_start | 2012-07-02 15:15:28.386865+02<br />
query_start | 2012-07-02 15:15:30.410834+02<br />
state_change | 2012-07-02 15:15:30.411287+02<br />
waiting | f<br />
state | idle in transaction<br />
query | DELETE FROM test;<br />
</pre><br />
<br />
With PostgreSQL 9.1, all we would have seen was «idle in transaction».<br />
<br />
As this change was backward-incompatible anyway, procpid was renamed to pid at the same time, for consistency with other system views.<br />
The view pg_stat_replication has also changed: its procpid column is likewise renamed to pid.<br />
<br />
==Change all SQL-level statistics timing values to float8-stored milliseconds <!-- (Tom Lane) -->==<br />
<br />
pg_stat_user_functions.total_time, pg_stat_user_functions.self_time, pg_stat_xact_user_functions.total_time, pg_stat_xact_user_functions.self_time, and pg_stat_statements.total_time (contrib) are now in milliseconds, to be consistent with the rest of the timing values.<br />
<br />
==postgresql.conf parameters changes <!-- (Heikki Linnakangas, Tom Lane, Peter Eisentraut) -->==<br />
<br />
* silent_mode has been removed. Use pg_ctl -l postmaster.log instead<br />
* wal_sender_delay has been removed, as it is no longer needed<br />
* custom_variable_classes has been removed. All «classes» are now accepted without declaration<br />
* ssl_ca_file, ssl_cert_file, ssl_crl_file and ssl_key_file have been added, meaning you can now specify the locations of the SSL files<br />
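For example, the SSL files can now be pointed at explicitly in postgresql.conf (the values shown are just the historical default file names, used here as an illustration):<br />
<br />
<pre><br />
ssl_cert_file = 'server.crt'<br />
ssl_key_file  = 'server.key'<br />
ssl_ca_file   = 'root.crt'<br />
ssl_crl_file  = 'root.crl'<br />
</pre><br />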
<br />
= Other new features =<br />
<br />
== DROP INDEX CONCURRENTLY ==<br />
<br />
The regular DROP INDEX command takes an exclusive lock on the table. Most of the time, this isn't a problem, because this lock is short-lived. The problem usually occurs when:<br />
<br />
* A long-running transaction is running, and has a (shared) lock on the table<br />
* A DROP INDEX is run on this table in another session, asking for an exclusive lock (and waiting for it, as it won't be granted until the long-running transaction ends)<br />
<br />
At this point, all other transactions needing to take a shared lock on the table (for a simple SELECT, for instance) will have to wait too: their lock requests are queued behind the DROP INDEX's.<br />
<br />
<br />
DROP INDEX CONCURRENTLY works around this and won't block normal DML statements, just like CREATE INDEX CONCURRENTLY. The limitations are also the same: only one index can be dropped at a time with the CONCURRENTLY option, and the CASCADE option is not supported.<br />
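Usage is straightforward; for instance, reusing the a__b_id index created in the planner example above:<br />
<br />
<pre><br />
=# DROP INDEX CONCURRENTLY a__b_id;<br />
DROP INDEX<br />
</pre><br />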
<br />
== NOT VALID CHECK constraints ==<br />
<br />
PostgreSQL 9.1 introduced «NOT VALID» foreign keys. This has been extended to CHECK constraints. Adding a «NOT VALID» constraint to a table means that existing data won't be validated; only new and updated rows will be checked.<br />
<br />
=# CREATE TABLE test (a int); <br />
CREATE TABLE<br />
=# INSERT INTO test SELECT generate_series(1,100);<br />
INSERT 0 100<br />
=# ALTER TABLE test ADD CHECK (a>100) NOT VALID;<br />
ALTER TABLE<br />
=# INSERT INTO test VALUES (99);<br />
ERROR: new row for relation "test" violates check constraint "test_a_check"<br />
DETAIL: Failing row contains (99).<br />
=# INSERT INTO test VALUES (101);<br />
INSERT 0 1<br />
<br />
Then, later, we can validate the whole table:<br />
<br />
=# ALTER TABLE test VALIDATE CONSTRAINT test_a_check ;<br />
ERROR: check constraint "test_a_check" is violated by some row<br />
<br />
Domains, which are types with added constraints, can also be declared as not valid, and validated later.<br />
<br />
Check constraints can also be renamed now:<br />
<br />
=# ALTER TABLE test RENAME CONSTRAINT test_a_check TO validate_a;<br />
ALTER TABLE<br />
<br />
== NO INHERIT constraints ==<br />
<br />
Here is another improvement concerning constraints: they can be declared as non-inheritable, which is useful in partitioned environments. Let's take the example from the PostgreSQL documentation and see how this improves the situation:<br />
<br />
CREATE TABLE measurement (<br />
city_id int not null,<br />
logdate date not null,<br />
peaktemp int,<br />
unitsales int,<br />
CHECK (logdate IS NULL) NO INHERIT<br />
);<br />
<br />
CREATE TABLE measurement_y2006m02 (<br />
CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )<br />
) INHERITS (measurement);<br />
CREATE TABLE measurement_y2006m03 (<br />
CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )<br />
) INHERITS (measurement);<br />
<br />
<br />
INSERT INTO measurement VALUES (1,'2006-02-20',1,1);<br />
ERROR: new row for relation "measurement" violates check constraint "measurement_logdate_check"<br />
DETAIL: Failing row contains (1, 2006-02-20, 1, 1).<br />
INSERT INTO measurement_y2006m02 VALUES (1,'2006-02-20',1,1);<br />
INSERT 0 1<br />
<br />
Until now, every check constraint created on measurement would have been inherited by child tables. So adding a constraint on the parent table that forbids inserts, or allows only some of them, was impossible.<br />
<br />
== Reduce ALTER TABLE rewrites ==<br />
<br />
A table no longer gets rewritten during an ALTER TABLE that changes a column's type in the following cases:<br />
<br />
* varchar(x) to varchar(y) when y>=x. It works too if going from varchar(x) to varchar or text (no size limitation)<br />
* numeric(x,z) to numeric(y,z) when y>=x, or to numeric without specifier<br />
* varbit(x) to varbit(y) when y>=x, or to varbit without specifier<br />
* timestamp(x) to timestamp(y) when y>=x or timestamp without specifier<br />
* timestamptz(x) to timestamptz(y) when y>=x or timestamptz without specifier<br />
* interval(x) to interval(y) when y>=x or interval without specifier<br />
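For example, widening a varchar column is now instantaneous regardless of the table size (a sketch with made-up names):<br />
<br />
<pre><br />
=# CREATE TABLE t (v varchar(10));<br />
CREATE TABLE<br />
=# ALTER TABLE t ALTER COLUMN v TYPE varchar(20); -- no table rewrite in 9.2<br />
ALTER TABLE<br />
</pre><br />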
<br />
== Security barriers and Leakproof ==<br />
<br />
This new feature has to do with view security. First, let's explain the problem, with a very simplified example:<br />
<br />
=# CREATE TABLE all_data (company_id int, company_data varchar);<br />
CREATE TABLE<br />
# INSERT INTO all_data VALUES (1,'secret_data_for_company_1');<br />
INSERT 0 1<br />
=# INSERT INTO all_data VALUES (2,'secret_data_for_company_2');<br />
INSERT 0 1<br />
=# CREATE VIEW company1_data AS SELECT * FROM all_data WHERE company_id = 1;<br />
CREATE VIEW<br />
<br />
This is a quite classical way of giving a user access to only part of a table: we'll create a user for company_id 1, grant it the right to access company1_data, and deny it the right to access all_data.<br />
<br />
The plan for this query is the following:<br />
<br />
=# explain SELECT * FROM company1_data ;<br />
QUERY PLAN <br />
----------------------------------------------------------<br />
Seq Scan on all_data (cost=0.00..25.38 rows=6 width=36)<br />
Filter: (company_id = 1)<br />
<br />
Even if there were more data, a sequential scan could still be forced: just "SET enable_indexscan TO off" and the like.<br />
<br />
So this query reads all the records from all_data, filters them, and returns only the matching rows to the user. There is a way to display the scanned records before they are filtered: just create a function with a very low cost, and call it in the query:<br />
<br />
CREATE OR REPLACE FUNCTION peek(text) RETURNS boolean LANGUAGE plpgsql AS<br />
$$<br />
BEGIN<br />
RAISE NOTICE '%',$1;<br />
RETURN true;<br />
END<br />
$$<br />
COST 0.1;<br />
<br />
This function just has to cost less than the = operator, which costs 1, to be executed first.<br />
<br />
The result is this:<br />
<br />
<br />
=# SELECT * FROM company1_data WHERE peek(company1_data.company_data);<br />
NOTICE: secret_data_for_company_1<br />
NOTICE: secret_data_for_company_2<br />
company_id | company_data <br />
------------+---------------------------<br />
1 | secret_data_for_company_1<br />
(1 row)<br />
<br />
We got access to the record from the second company (in the NOTICE messages).<br />
<br />
So this is the first new feature: the view can be declared as implementing "security barriers":<br />
<br />
<br />
=# CREATE VIEW company1_data WITH (security_barrier) AS SELECT * FROM all_data WHERE company_id = 1;<br />
CREATE VIEW<br />
=# SELECT * FROM company1_data WHERE peek(company1_data.company_data);<br />
NOTICE: secret_data_for_company_1<br />
company_id | company_data <br />
------------+---------------------------<br />
1 | secret_data_for_company_1<br />
(1 row)<br />
<br />
The view no longer leaks. The problem, of course, is that there is a performance impact: maybe the "peek" function could have made the query faster by filtering lots of rows early in the plan.<br />
<br />
This leads to the complementary feature: some functions may be declared "LEAKPROOF", meaning that they won't leak the data they are passed into error or notice messages.<br />
<br />
Declaring our peek function as LEAKPROOF is a very bad idea, but let's do it just to demonstrate how it's used:<br />
<br />
CREATE OR REPLACE FUNCTION peek(text) RETURNS boolean LEAKPROOF LANGUAGE plpgsql AS<br />
$$<br />
BEGIN<br />
RAISE NOTICE '%',$1;<br />
RETURN true;<br />
END<br />
$$<br />
COST 0.1;<br />
<br />
A LEAKPROOF function is executed «normally»:<br />
<br />
=# SELECT * FROM company1_data WHERE peek(company1_data.company_data);<br />
NOTICE: secret_data_for_company_1<br />
NOTICE: secret_data_for_company_2<br />
company_id | company_data <br />
------------+---------------------------<br />
1 | secret_data_for_company_1<br />
(1 row)<br />
<br />
Of course, in our case, peek isn't leakproof and shouldn't be declared as such. Only superusers have permission to declare a LEAKPROOF function.<br />
<br />
== New options for pg_dump ==<br />
<br />
Until now, one could ask pg_dump to dump a table's data, or a table's metadata (the DDL statements creating the table's structure, indexes, constraints). Some metadata is better restored before the data (the table's structure, check constraints), and some is better restored after the data (indexes, unique constraints, foreign keys…), mostly for performance reasons.<br />
<br />
So there are now a few more options:<br />
<br />
* --section=pre-data: dump what's needed before restoring the data. Of course, this can be combined with a -t for instance, to specify one table<br />
* --section=post-data: dump what's needed after restoring the data.<br />
* --section=data: dump the data<br />
* --exclude-table-data: dump everything except THIS table's data. pg_dump will still dump the other tables' data.<br />
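A typical three-step dump workflow might look like this (database and table names are made up for the illustration):<br />
<br />
<pre><br />
$ pg_dump --section=pre-data  mydb > pre-data.sql<br />
$ pg_dump --section=data      mydb > data.sql<br />
$ pg_dump --section=post-data mydb > post-data.sql<br />
$ pg_dump --exclude-table-data=big_audit_log mydb > all_but_log_data.sql<br />
</pre><br />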
<br />
<br />
<br />
<br />
<br />
[[Category:PostgreSQL 9.2]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Timestamp_Average&diff=17908Timestamp Average2012-07-13T01:19:21Z<p>Schmiddy: add category tag</p>
<hr />
<div>{{SnippetInfo|Timestamp Average|version=9.1|lang=PL/pgSQL}}<br />
<br />
Here is the code to efficiently compute an average of a timestamp column. I've only tested this on 9.1, but it will probably work on earlier versions as well. Note, you'll need to cast the column to a plain timestamp (e.g. SELECT avg(tstz_col AT TIME ZONE 'UTC') FROM mytable) in order to use it with columns of type 'timestamp with time zone'. <br />
<br />
Author: Josh Kupershmidt<br />
<br />
<source lang="plsql"><br />
-- In order to have a reasonably efficient accumulator<br />
-- function, we need a state variable keeping a running<br />
-- total of seconds since the epoch, along with the number<br />
-- of elements processed already.<br />
CREATE TYPE ts_accum_typ AS (<br />
running_total numeric,<br />
num_elems bigint<br />
);<br />
<br />
-- Accumulator function. Keep a running total of the<br />
-- number of seconds since the epoch (1970-01-01), as well<br />
-- as the number of elements we have processed already.<br />
CREATE OR REPLACE FUNCTION ts_accum (existing ts_accum_typ, newval timestamp)<br />
RETURNS ts_accum_typ AS $$<br />
DECLARE<br />
retval ts_accum_typ;<br />
BEGIN<br />
<br />
IF newval IS NULL THEN<br />
RETURN existing;<br />
END IF;<br />
<br />
IF existing IS NULL THEN<br />
retval.running_total = EXTRACT(epoch FROM newval);<br />
retval.num_elems = 1;<br />
RETURN retval;<br />
ELSE<br />
existing.running_total = existing.running_total + EXTRACT(epoch FROM newval);<br />
existing.num_elems = existing.num_elems + 1;<br />
RETURN existing;<br />
END IF;<br />
END;<br />
$$<br />
LANGUAGE PLPGSQL IMMUTABLE;<br />
<br />
-- Final function for the timestamp 'avg' aggregate.<br />
CREATE OR REPLACE FUNCTION ts_avg (existing ts_accum_typ) RETURNS timestamp AS $$<br />
DECLARE<br />
since_epoch numeric;<br />
BEGIN<br />
-- Handle the case when avg() is called with no rows: answer should be NULL.<br />
IF existing IS NULL THEN<br />
RETURN NULL;<br />
END IF;<br />
<br />
since_epoch = existing.running_total / existing.num_elems;<br />
RETURN to_timestamp(since_epoch);<br />
END;<br />
$$<br />
LANGUAGE PLPGSQL IMMUTABLE;<br />
<br />
CREATE AGGREGATE avg (timestamp)<br />
(<br />
sfunc = ts_accum,<br />
stype = ts_accum_typ,<br />
finalfunc = ts_avg<br />
);<br />
<br />
</source><br />
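Once the aggregate is created, it can be called like the built-in avg(); a quick illustration (the displayed value assumes no DST change within the averaged interval):<br />
<br />
<pre><br />
=# SELECT avg(ts)<br />
   FROM (VALUES ('2012-01-01'::timestamp),<br />
                ('2012-01-03'::timestamp)) AS v(ts);<br />
         avg         <br />
---------------------<br />
 2012-01-02 00:00:00<br />
</pre><br />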
<br />
[[Category:PL/pgSQL]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Timestamp_Average&diff=17907Timestamp Average2012-07-13T01:17:11Z<p>Schmiddy: initial code for timestamp average</p>
<hr />
<div>{{SnippetInfo|Timestamp Average|version=9.1|lang=PL/pgSQL}}<br />
<br />
Here is the code to efficiently compute an average of a timestamp column. I've only tested this on 9.1, but it will probably work on earlier versions as well. Note, you'll need to cast the column to a plain timestamp (e.g. SELECT avg(tstz_col AT TIME ZONE 'UTC') FROM mytable) in order to use it with columns of type 'timestamp with time zone'. <br />
<br />
Author: Josh Kupershmidt<br />
<br />
<source lang="plsql"><br />
-- In order to have a reasonably efficient accumulator<br />
-- function, we need a state variable keeping a running<br />
-- total of seconds since the epoch, along with the number<br />
-- of elements processed already.<br />
CREATE TYPE ts_accum_typ AS (<br />
running_total numeric,<br />
num_elems bigint<br />
);<br />
<br />
-- Accumulator function. Keep a running total of the<br />
-- number of seconds since the epoch (1970-01-01), as well<br />
-- as the number of elements we have processed already.<br />
CREATE OR REPLACE FUNCTION ts_accum (existing ts_accum_typ, newval timestamp)<br />
RETURNS ts_accum_typ AS $$<br />
DECLARE<br />
retval ts_accum_typ;<br />
BEGIN<br />
<br />
IF newval IS NULL THEN<br />
RETURN existing;<br />
END IF;<br />
<br />
IF existing IS NULL THEN<br />
retval.running_total = EXTRACT(epoch FROM newval);<br />
retval.num_elems = 1;<br />
RETURN retval;<br />
ELSE<br />
existing.running_total = existing.running_total + EXTRACT(epoch FROM newval);<br />
existing.num_elems = existing.num_elems + 1;<br />
RETURN existing;<br />
END IF;<br />
END;<br />
$$<br />
LANGUAGE PLPGSQL IMMUTABLE;<br />
<br />
-- Final function for the timestamp 'avg' aggregate.<br />
CREATE OR REPLACE FUNCTION ts_avg (existing ts_accum_typ) RETURNS timestamp AS $$<br />
DECLARE<br />
since_epoch numeric;<br />
BEGIN<br />
-- Handle the case when avg() is called with no rows: answer should be NULL.<br />
IF existing IS NULL THEN<br />
RETURN NULL;<br />
END IF;<br />
<br />
since_epoch = existing.running_total / existing.num_elems;<br />
RETURN to_timestamp(since_epoch);<br />
END;<br />
$$<br />
LANGUAGE PLPGSQL IMMUTABLE;<br />
<br />
CREATE AGGREGATE avg (timestamp)<br />
(<br />
sfunc = ts_accum,<br />
stype = ts_accum_typ,<br />
finalfunc = ts_avg<br />
);<br />
<br />
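-- A usage sketch for the aggregate defined above (the table and<br />
-- column names here are hypothetical, not part of the snippet):<br />
-- SELECT avg(signup_time) FROM users;<br />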
</source></div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Show_database_bloat&diff=17888Show database bloat2012-07-06T18:27:16Z<p>Schmiddy: summary of a few shortcomings of the query</p>
<hr />
<div>{{SnippetInfo|Show database bloat|version=>=8.0|lang=SQL|category=Performance}}<br />
<br />
This snippet displays the estimated amount of bloat in your tables and indexes. It is taken from Greg Sabino Mullane's excellent [http://bucardo.org/check_postgres/ check_postgres] script.<br />
<br />
<source lang="sql"><br />
SELECT<br />
current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/<br />
ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,<br />
CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,<br />
iname, /*ituples::bigint, ipages::bigint, iotta,*/<br />
ROUND(CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,<br />
CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes<br />
FROM (<br />
SELECT<br />
schemaname, tablename, cc.reltuples, cc.relpages, bs,<br />
CEIL((cc.reltuples*((datahdr+ma-<br />
(CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta,<br />
COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,<br />
COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols<br />
FROM (<br />
SELECT<br />
ma,bs,schemaname,tablename,<br />
(datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,<br />
(maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2<br />
FROM (<br />
SELECT<br />
schemaname, tablename, hdr, ma, bs,<br />
SUM((1-null_frac)*avg_width) AS datawidth,<br />
MAX(null_frac) AS maxfracsum,<br />
hdr+(<br />
SELECT 1+count(*)/8<br />
FROM pg_stats s2<br />
WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename<br />
) AS nullhdr<br />
FROM pg_stats s, (<br />
SELECT<br />
(SELECT current_setting('block_size')::numeric) AS bs,<br />
CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,<br />
CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma<br />
FROM (SELECT version() AS v) AS foo<br />
) AS constants<br />
GROUP BY 1,2,3,4,5<br />
) AS foo<br />
) AS rs<br />
JOIN pg_class cc ON cc.relname = rs.tablename<br />
JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'<br />
LEFT JOIN pg_index i ON indrelid = cc.oid<br />
LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid<br />
) AS sml<br />
ORDER BY wastedbytes DESC<br />
</source><br />
<br />
== Notes ==<br />
<br />
This query is for informational purposes only. It provides a [http://archives.postgresql.org/pgsql-admin/2011-06/msg00015.php loose estimate] of table growth activity only, and should not be construed as a 100% accurate portrayal of space consumed by database objects. To obtain more accurate information about database bloat, please refer to the [http://www.postgresql.org/docs/9.1/static/pgstattuple.html pgstattuple] or [http://www.postgresql.org/docs/9.1/static/pgfreespacemap.html pg_freespacemap] contrib modules.</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=What%27s_new_in_PostgreSQL_9.0&diff=16808What's new in PostgreSQL 9.02012-05-09T20:43:44Z<p>Schmiddy: /* Use of index to get better statistics on the fly */ I found the 'DO' echoed at the end really confusing</p>
<hr />
<div>This document showcases many of the latest developments in PostgreSQL 9.0, compared to the last major release &ndash; PostgreSQL 8.4. There are more than 200 improvements in this release, so this wiki page covers many of the more important changes in detail. The full list of changes is itemised in the [http://www.postgresql.org/docs/9.0/static/release-9-0 Release Notes].<br />
<br />
=The two features you can't ignore=<br />
<br />
Hot Standby and Streaming Replication are the two new features that mark Version 9.0 as a landmark in PostgreSQL's development and the motivation for allocating a full version number to this release &ndash; 9.0 (instead of 8.5). Combined, these two features add built-in, "binary replication" to PostgreSQL.<br />
<br />
There is further documentation on how to use [[Hot Standby]] on its page, and an extensive [[Binary Replication Tutorial]] is in progress.<br />
<br />
===Hot Standby===<br />
<br />
This feature allows users to create a 'Standby' database &ndash; that is, a second database instance (normally on a separate server) replaying the primary's binary log, while making that standby server available for read-only queries. It is similar to the standby database features of proprietary databases, such as Oracle's Active DataGuard. Queries execute normally while the standby database continuously replays the stream of binary modifications coming from the primary database. Visibility of new data changes follows the MVCC model, so that new changes do not lock out queries.<br />
<br />
Enabling Hot Standby is a simple process. On the primary database server add this to the <tt>postgresql.conf</tt> file:<br />
wal_level = 'hot_standby' # Adds the required data in the WAL logs<br />
<br />
And on the standby server, add this to its <tt>postgresql.conf</tt> file:<br />
hot_standby = on<br />
<br />
Hot Standby works well with the new Streaming Replication feature, though it can also be used with file-based log shipping as available in previous versions and also to create standalone copies that receive no updates at all.<br />
<br />
In some cases, changes from the primary database can conflict with queries on the standby. A simple example is when DROP TABLE executes on the master while the standby is still executing a query against that table. The standby cannot process that DROP statement without first canceling the running query, and the longer it delays doing that, the further behind the primary the standby will become. The two options here are to pause the replay, or to cancel the read-only queries and move forward.<br />
<br />
A variety of parameters allow adjusting the conflict resolution mechanism used.<br />
<br />
max_standby_archive_delay = 30s # -1= always wait, 0= never wait, else wait for this<br />
max_standby_streaming_delay = 30s # -1= always wait, 0= never wait, else wait for this<br />
<br />
The two max_standby_{archive,streaming}_delay settings determine the behaviour of the standby database when conflicts between replay and read-only queries occur. In this situation, the standby database will wait at most until it's lagging behind the primary database by that amount before canceling the conflicting read-only queries. The two parameters allow different lag time tolerance levels for files appearing via regular file archive shipping vs. ones that are streamed via the new 9.0 feature for streaming replication.<br />
<br />
On the master, it is also possible to avoid conflicts by increasing this parameter:<br />
<br />
 vacuum_defer_cleanup_age = 10000 # Adjust upwards slowly to reduce conflicts<br />
<br />
This feature is rich and complex, so it's advisable to read the documentation before planning your server deployments.<br />
<br />
===Streaming Replication===<br />
<br />
Complementing Hot Standby, Streaming Replication is the second half of the "great leap forward" for PostgreSQL. While there are several third-party replication solutions available for PostgreSQL that meet a range of specific needs, this new release brings a simple, sturdy and integrated version that will probably be used as a default in most High Availability installations using PostgreSQL.<br />
<br />
This time, the goal is to improve the archiving mechanism so that it is as continuous as possible and does not rely on log-file shipping. Standby servers can now connect to the master/primary and fetch, whenever they want, whatever they are missing from the Write-Ahead Log; not as complete files ('WAL segments'), but as WAL records (which you can think of as fragments of those files).<br />
<br />
Streaming Replication is an asynchronous mechanism; the standby server lags behind the master. But unlike other replication methods, this lag is very short, and can be as little as a single transaction, depending on network speed, database activity, and Hot Standby settings. Also, the load on the master for each slave is minimal, allowing a single master to support dozens of slaves.<br />
<br />
Primary and standby databases are identical at the binary level (well, almost; but don't worry if your datafiles don't have the same checksum).<br />
<br />
For Streaming Replication, wal_level should be 'archive' or 'hot_standby'.<br />
<br />
<tt>postgresql.conf</tt>, Primary:<br />
max_wal_senders = 5 # Maximum 'wal_senders', processes responsible for managing a connection with a standby server<br />
wal_keep_segments # How many WAL segments (=files) should be kept on the primary, whatever may happen (you won't have to copy them manually on the standby if the standby gets too far behind)<br />
<br />
On the standby server:<br />
<br />
<tt>recovery.conf</tt>, Standby:<br />
standby_mode = 'on'<br />
primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass' # connection string to reach the primary database<br />
<tt>postgresql.conf</tt>, Secondary:<br />
wal_level # same value as on the primary (you'll need this after a failover, to build a new standby)<br />
hot_standby=on/off # Do you want to use Hot Standby at the same time ?<br />
pg_hba.conf file:<br />
<br />
There must be an entry here for the replication connections. The pseudo-database to specify is 'replication', and the designated user must be a superuser. Be careful not to give broad access to this account: a lot of privileged data can be extracted from WAL records.<br />
<br />
<tt>pg_hba.conf</tt>, Primary:<br />
host replication foo 192.168.1.100/32 md5<br />
As with Hot Standby, this feature is rich and complex, so it's advisable to read the documentation, and to perform failover and switchover tests once everything is in place.<br />
<br />
One thing should be stressed about these two features: you can use them together. This means you can have a near-realtime standby database, and run read-only queries on it, such as reporting queries. You can also use them independently; a standby database can be Hot Standby with file shipping only, and a Streaming Replication database can stream without accepting queries.<br />
<br />
=Other New features=<br />
<br />
There are literally hundreds of improvements, updates, and new features in 9.0 ... enough to make it a major release even without binary replication. We'll tour a few of them below, by category, with details on how to use them.<br />
<br />
==Security and Authentication==<br />
<br />
Of course, as the most secure SQL database (according to the SQL Hacker's Handbook) we're always eager to improve our data security. 9.0 adds several new features in this realm.<br />
<br />
===GRANT/REVOKE IN SCHEMA===<br />
<br />
One annoying limitation in PostgreSQL has been the lack of global GRANT/REVOKE capabilities. With 9.0, it's now possible to set privileges on all tables, sequences and functions within a schema, without having to write a script or a stored procedure:<br />
<br />
GRANT SELECT ON ALL TABLES IN SCHEMA public TO toto;<br />
And reverting this:<br />
REVOKE SELECT ON ALL TABLES IN SCHEMA public FROM toto;<br />
<br />
See the [http://www.postgresql.org/docs/9.0/static/sql-grant.html GRANT] documentation page for further details.<br />
<br />
Note that the above only works for existing objects. However, it's now also possible to define default permissions for new objects:<br />
<br />
===ALTER DEFAULT PRIVILEGES===<br />
<br />
This feature also makes permission management more efficient.<br />
ALTER DEFAULT PRIVILEGES FOR ROLE marc GRANT SELECT ON TABLES TO public;<br />
CREATE TABLE test_priv (a int);<br />
\z test_priv<br />
Access privileges<br />
Schema | Name | Type | Access privileges | Column access privileges<br />
--------+------------+-------+-------------------+--------------------------<br />
public | test_priv | table | =r/marc +|<br />
| | | marc=arwdDxt/marc |<br />
<br />
This new information is stored in the pg_default_acl system table.<br />
<br />
===passwordcheck===<br />
<br />
This contrib module can check passwords, and prevent the worst of them from getting in. After having it installed and set up as described in the documentation, here is the result:<br />
marc=# ALTER USER marc password 'marc12';<br />
&lt;marc%marc&gt; ERROR: password is too short<br />
&lt;marc%marc&gt; STATEMENT: ALTER USER marc password 'marc12';<br />
ERROR: password is too short<br />
marc=# ALTER USER marc password 'marc123456';<br />
&lt;marc%marc&gt; ERROR: password must not contain user name<br />
&lt;marc%marc&gt; STATEMENT: ALTER USER marc password 'marc123456';<br />
ERROR: password must not contain user name<br />
This module has limitations, mostly because PostgreSQL accepts passwords that are already encrypted, which makes correct verification of them impossible.<br />
<br />
Moreover, its code is well documented and can easily be adapted to suit specific needs (one can activate cracklib very easily, for instance).<br />
<br />
<br />
<br />
==SQL Features==<br />
<br />
SQL03 has a huge array of functionality, more than any one DBMS currently implements. But we keep adding SQL features, as well as extending SQL in a compatible way with various little things to make writing queries easier and more powerful.<br />
<br />
===Column triggers===<br />
<br />
Column triggers fire only when a specific column is explicitly UPDATED. They allow you to avoid adding lots of conditional logic and value comparisons in your trigger code.<br />
<br />
Example:<br />
CREATE TRIGGER foo BEFORE UPDATE OF some_column ON table1 FOR EACH ROW EXECUTE PROCEDURE my_trigger();<br />
This trigger fires only when '<tt>some_column</tt>' column of '<tt>table1</tt>' table has been updated.<br />
<br />
Column triggers are not executed if columns are set to DEFAULT.<br />
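<br />
The my_trigger() function is left undefined in the example above; a minimal sketch of such a trigger function might look like this (the body is purely illustrative):<br />
 CREATE OR REPLACE FUNCTION my_trigger() RETURNS trigger AS $$<br />
 BEGIN<br />
   RAISE NOTICE 'some_column changed from % to %', OLD.some_column, NEW.some_column;<br />
   RETURN NEW;<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />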
<br />
===WHEN Triggers===<br />
<br />
Completing PostgreSQL's effort to limit IF ... THEN code in triggers, conditional triggers define simple conditions under which the trigger will be executed. This can dramatically decrease the number of trigger executions and reduce CPU load on the database server.<br />
<br />
For example, this trigger would check that an account was correctly balanced only when the balance changes:<br />
<br />
CREATE TRIGGER check_update<br />
BEFORE UPDATE ON accounts<br />
FOR EACH ROW<br />
WHEN (OLD.balance IS DISTINCT FROM NEW.balance)<br />
EXECUTE PROCEDURE check_account_update();<br />
<br />
And this trigger will only log a row update when the row actually changes. It's very helpful with framework or ORM applications, which may attempt to save unchanged rows:<br />
<br />
CREATE TRIGGER log_update<br />
AFTER UPDATE ON accounts<br />
FOR EACH ROW<br />
WHEN (OLD.* IS DISTINCT FROM NEW.*)<br />
EXECUTE PROCEDURE log_account_update();<br />
<br />
You could even go further than this and decide not to save a row at all if it hasn't changed:<br />
<br />
CREATE TRIGGER log_update<br />
BEFORE UPDATE ON accounts<br />
FOR EACH ROW<br />
WHEN (OLD.* IS NOT DISTINCT FROM NEW.*)<br />
EXECUTE PROCEDURE no_op();<br />
<br />
===DEFERRABLE UNIQUE CONSTRAINTS===<br />
<br />
This feature will also be very useful. Here is an example, using a primary key instead of a simple unique key:<br />
marc=# CREATE TABLE test (a int primary key);<br />
marc=# INSERT INTO test values (1), (2);<br />
marc=# UPDATE test set a = a+1;<br />
ERROR: duplicate key value violates unique constraint "test_pkey"<br />
DETAIL: Key (a)=(2) already exists.<br />
That's a pity: at the end of the statement, my data would have been consistent, as far as this constraint is concerned. Even worse, if the table had been physically sorted in descending order, the query would have worked! With 8.4, there was no easy way out; we had to find a trick to update the records in the right order.<br />
<br />
We can now do this:<br />
marc=# CREATE TABLE test (a int primary key deferrable);<br />
marc=# INSERT INTO test values (1), (2);<br />
marc=# UPDATE test set a = a+1;<br />
UPDATE 2<br />
<br />
With a DEFERRABLE unique index, uniqueness is enforced as of the end of the statement, rather than after each row update as with a simple<br />
index. This is a bit slower sometimes but is a lifesaver if you need to do this sort of update.<br />
<br />
It is also possible to have the uniqueness check enforced as of the end of the transaction, rather than after each statement. This helps<br />
if you need to do "conflicting" updates that require more than one SQL statement to complete. For example:<br />
marc=# CREATE TABLE test (a int primary key deferrable, b text);<br />
marc=# INSERT INTO test values (1, 'x'), (2, 'y');<br />
marc=# BEGIN;<br />
marc=# SET CONSTRAINTS ALL DEFERRED;<br />
marc=# UPDATE test SET a = 2 WHERE b = 'x';<br />
marc=# UPDATE test SET a = 1 WHERE b = 'y';<br />
marc=# COMMIT;<br />
<br />
If one doesn't want to perform a SET CONSTRAINTS each time, the constraint can also be declared as INITIALLY DEFERRED:<br />
CREATE TABLE test (a int PRIMARY KEY DEFERRABLE INITIALLY DEFERRED);<br />
<br />
Keep in mind that the list of records to be checked at the end of the statement or transaction has to be stored somewhere. So be careful of not doing this for millions of records at once. This is one of the reasons that unique indexes aren't DEFERRABLE by default, even though a strict reading of the SQL spec would require it.<br />
<br />
===New frame options for window functions===<br />
<br />
If you don't know window functions yet, you'd better learn about them. You can start here: [http://www.depesz.com/index.php/2009/01/21/waiting-for-84-window-functions waiting-for-84-window-functions]. They make writing certain kinds of queries much easier.<br />
<br />
New options have been added for declaring frames of windowing functions. Let's use this table (not having a better example…)<br />
marc=# SELECT * FROM salary ;<br />
entity | name | salary | start_date<br />
-----------+-----------+---------+---------------<br />
R&amp;D | marc | 700.00 | 2010-02-15<br />
Accounting | jack | 800.00 | 2010-05-01<br />
R&amp;D | maria | 700.00 | 2009-01-01<br />
R&amp;D | kevin | 500.00 | 2009-05-01<br />
R&amp;D | john | 1000.00 | 2008-07-01<br />
R&amp;D | tom | 1100.00 | 2005-01-01<br />
Accounting | millicent | 850.00 | 2006-01-01<br />
Here is a window function example, without declaring the frame:<br />
marc=# SELECT entity, name, salary, start_date,<br />
avg(salary) OVER (PARTITION BY entity ORDER BY start_date)<br />
FROM salary;<br />
<br />
entity | name | salary | start_date | avg <br />
-----------+-----------+---------+---------------+-----------------------<br />
Accounting | millicent | 850.00 | 2006-01-01 | 850.0000000000000000<br />
Accounting | jack | 800.00 | 2010-05-01 | 825.0000000000000000<br />
R&amp;D | tom | 1100.00 | 2005-01-01 | 1100.0000000000000000<br />
R&amp;D | john | 1000.00 | 2008-07-01 | 1050.0000000000000000<br />
R&amp;D | maria | 700.00 | 2009-01-01 | 933.3333333333333333<br />
R&amp;D | kevin | 500.00 | 2009-05-01 | 825.0000000000000000<br />
R&amp;D | marc | 700.00 | 2010-02-15 | 800.0000000000000000<br />
The frame is the group of records over which the window function is run. Of course, if the frame isn't explicitly declared, there is a default one.<br />
<br />
Here is the same query, with an explicit frame:<br />
marc=# SELECT entity, name, salary, start_date,<br />
avg(salary) OVER (PARTITION BY entity ORDER BY start_date<br />
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)<br />
FROM salary;<br />
<br />
entity | name | salary | start_date | avg <br />
-----------+-----------+---------+---------------+-----------------------<br />
Accounting | millicent | 850.00 | 2006-01-01 | 850.0000000000000000<br />
Accounting | jack | 800.00 | 2010-05-01 | 825.0000000000000000<br />
R&amp;D | tom | 1100.00 | 2005-01-01 | 1100.0000000000000000<br />
R&amp;D | john | 1000.00 | 2008-07-01 | 1050.0000000000000000<br />
R&amp;D | maria | 700.00 | 2009-01-01 | 933.3333333333333333<br />
R&amp;D | kevin | 500.00 | 2009-05-01 | 825.0000000000000000<br />
R&amp;D | marc | 700.00 | 2010-02-15 | 800.0000000000000000<br />
In this example, the frame is a 'range' frame, spanning from the start of the partition (the group of similar rows) to the current row (not exactly the current row, but let's put that aside for now; read the documentation if you want to learn more). As one can see, the avg function is evaluated over the rows from the frame's first row through the current row.<br />
<br />
First new feature: as of 9.0, the frame can be declared to be between the current row and the end of the partition:<br />
marc=# SELECT entity, name, salary, start_date,<br />
avg(salary) OVER (PARTITION BY entity ORDER BY start_date<br />
RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)<br />
FROM salary;<br />
<br />
entity | name | salary | start_date | avg <br />
-----------+-----------+---------+---------------+----------------------<br />
Accounting | millicent | 850.00 | 2006-01-01 | 825.0000000000000000<br />
Accounting | jack | 800.00 | 2010-05-01 | 800.0000000000000000<br />
R&amp;D | tom | 1100.00 | 2005-01-01 | 800.0000000000000000<br />
R&amp;D | john | 1000.00 | 2008-07-01 | 725.0000000000000000<br />
R&amp;D | maria | 700.00 | 2009-01-01 | 633.3333333333333333<br />
R&amp;D | kevin | 500.00 | 2009-05-01 | 600.0000000000000000<br />
R&amp;D | marc | 700.00 | 2010-02-15 | 700.0000000000000000<br />
Second new feature: frames can be declared as 'x preceding records to y following records'. It doesn't make much sense with this example, but let's do it anyway:<br />
marc=# SELECT entity, name, salary, start_date,<br />
avg(salary) OVER (PARTITION BY entity ORDER BY start_date<br />
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)<br />
FROM salary;<br />
<br />
entity | name | salary | start_date | avg <br />
-----------+-----------+---------+---------------+-----------------------<br />
Accounting | millicent | 850.00 | 2006-01-01 | 825.0000000000000000<br />
Accounting | jack | 800.00 | 2010-05-01 | 825.0000000000000000<br />
R&amp;D | tom | 1100.00 | 2005-01-01 | 1050.0000000000000000<br />
R&amp;D | john | 1000.00 | 2008-07-01 | 933.3333333333333333<br />
R&amp;D | maria | 700.00 | 2009-01-01 | 733.3333333333333333<br />
R&amp;D | kevin | 500.00 | 2009-05-01 | 633.3333333333333333<br />
R&amp;D | marc | 700.00 | 2010-02-15 | 600.0000000000000000<br />
The frame is still limited to the partition (see tom's record, for instance: jack's record isn't used for its average).<br />
<br />
If one wanted the same query, with a moving average over three rows, not reset at each partition switch (still of no practical use):<br />
marc=# SELECT entity, name, salary, start_date,<br />
avg(salary) OVER (ORDER BY entity, start_date<br />
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)<br />
FROM salary;<br />
<br />
entity | name | salary | start_date | avg <br />
-----------+-----------+---------+---------------+----------------------<br />
Accounting | millicent | 850.00 | 2006-01-01 | 825.0000000000000000<br />
Accounting | jack | 800.00 | 2010-05-01 | 916.6666666666666667<br />
R&amp;D | tom | 1100.00 | 2005-01-01 | 966.6666666666666667<br />
R&amp;D | john | 1000.00 | 2008-07-01 | 933.3333333333333333<br />
R&amp;D | maria | 700.00 | 2009-01-01 | 733.3333333333333333<br />
R&amp;D | kevin | 500.00 | 2009-05-01 | 633.3333333333333333<br />
R&amp;D | marc | 700.00 | 2010-02-15 | 600.0000000000000000<br />
<br />
In short, a powerful tool to be mastered, even if I couldn't provide a good example.<br />
<br />
===Sort in aggregates===<br />
<br />
This feature is a subtle one: the result of an aggregate function may depend on the order it receives the data.<br />
<br />
Of course, we're not talking about count, avg, but of array_agg, string_agg…<br />
<br />
This is a nice opportunity to showcase string_agg, which is itself another 9.0 feature, killing two birds with one stone.<br />
<br />
Let's start again with our salary table. We want the list of employees, concatenated as a single value, grouped by entity. It's going into a spreadsheet…<br />
marc=# SELECT entity,string_agg(name,', ') FROM salary GROUP BY entity;<br />
entity | string_agg <br />
-----------+-------------------------------<br />
Accounting | stephanie, etienne<br />
R&amp;D | marc, maria, kevin, john, tom<br />
That's already nice. But I want them sorted in alphabetical order, because I don't know how to write a macro in my spreadsheet to sort this data.<br />
marc=# SELECT entity,string_agg(name,', ' ORDER BY name) FROM salary GROUP BY entity;<br />
entity | string_agg <br />
-----------+-------------------------------<br />
Accounting | etienne, stephanie<br />
R&amp;D | john, kevin, marc, maria, tom<br />
To use this new feature, the sort clause must be inserted inside the aggregate function, without a comma to separate it from the parameters.<br />
<br />
==Database Administration==<br />
<br />
DBA is a hard and often thankless job -- especially if that's not your job title. 9.0 includes new and improved features to make that job a bit easier.<br />
<br />
===Better VACUUM FULL===<br />
<br />
Until now, VACUUM FULL was very slow. This statement can recover free space from a table to reduce its size, mostly when VACUUM itself hasn't been run frequently enough.<br />
<br />
It was slow because of the way it operated: records were read and moved one by one from their source block to a block closer to the beginning of the table. Once the end of the table had been emptied, that empty part was removed.<br />
<br />
This strategy was very inefficient: moving records one by one creates a lot of random IO. Moreover, during this reorganization, indexes had to be maintained, making everything even more costly, and fragmenting indexes. It was therefore advised to reindex a table just after a VACUUM FULL.<br />
<br />
As of version 9.0, the VACUUM FULL statement creates a new table from the current one, copying all the records sequentially. Once all records are copied, the indexes are rebuilt, and the old table is destroyed and replaced.<br />
<br />
This has the advantage of being much faster, though VACUUM FULL still needs an AccessExclusiveLock while running. The only drawback of this method compared to the old one is that VACUUM FULL can use up to two times the size of the table on disk, as it is creating a new version of it.<br />
<br />
Let's now compare the runtimes of the two methods. In both cases, we prepare the test data as follows (for both 8.4 and 9.0):<br />
marc=# CREATE TABLE test (a int);<br />
CREATE TABLE<br />
marc=# CREATE INDEX idxtsta on test (a);<br />
CREATE INDEX<br />
marc=# INSERT INTO test SELECT generate_series(1,1000000);<br />
INSERT 0 1000000<br />
marc=# DELETE FROM test where a%3=0; -- making holes everywhere<br />
DELETE 333333<br />
marc=# VACUUM test;<br />
VACUUM<br />
With 8.4:<br />
marc=# \timing<br />
Timing is on.<br />
marc=# VACUUM FULL test;<br />
VACUUM<br />
Time: 6306,603 ms<br />
marc=# REINDEX TABLE test;<br />
REINDEX<br />
Time: 1799,998 ms<br />
So around 8 seconds.<br />
With 9.0:<br />
marc=# \timing<br />
Timing is on.<br />
marc=# VACUUM FULL test;<br />
VACUUM<br />
Time: 2563,467 ms<br />
That still doesn't mean that VACUUM FULL is a good idea in production. If you need it, it's probably because your VACUUM policy isn't appropriate.<br />
<br />
===application_name in pg_stat_activity===<br />
<br />
In a monitoring session:<br />
marc=# SELECT * from pg_stat_activity where procpid= 5991;<br />
datid | datname | procpid | usesysid | usename | application_name | client_addr | client_port | backend_start | xact_start | query_start | waiting | current_query<br />
------+---------+---------+----------+---------+------------------+-------------+-------------+-------------------------------+------------+-------------+---------+----------------<br />
16384 | marc | 5991 | 10 | marc | psql | | -1 | 2010-05-16 13:48:10.154113+02 | | | f | &lt;IDLE&gt;<br />
(1 row)<br />
In the '5991' session:<br />
marc=# SET application_name TO 'my_app';<br />
SET<br />
Back to the monitoring session:<br />
 marc=# SELECT * from pg_stat_activity where procpid= 5991;<br />
datid | datname | procpid | usesysid | usename | application_name | client_addr | client_port | backend_start | xact_start | query_start | waiting | current_query<br />
------+---------+---------+----------+---------+------------------+-------------+-------------+-------------------------------+------------+-------------+---------+-----------------+----------------<br />
16384 | marc | 5991 | 10 | marc | my_app | | -1 | 2010-05-16 13:48:10.154113+02 | | 2010-05-16 13:49:13.107413+02 | f | &lt;IDLE&gt;<br />
(1 row)<br />
It's your job to set this up correctly in your programs or your sessions. Your DBA will thank you: at last it will be easy to know who is running what on the database.<br />
<br />
===Per database+role configuration===<br />
In addition to being able to set configuration variables per database or per user, one can now set them for a given user in a given database:<br />
<br />
marc=# ALTER ROLE marc IN database marc set log_statement to 'all';<br />
ALTER ROLE<br />
To see which variables are set for which user and database, there is a new psql command:<br />
marc=# \drds<br />
List of settings<br />
role | database | settings<br />
-----+----------+-----------------<br />
marc | marc | log_statement=all<br />
(1 row)<br />
There was a catalog change to store this:<br />
Table "pg_catalog.pg_db_role_setting"<br />
Column | Type | Modifier<br />
------------+--------+----------<br />
setdatabase | oid | not null<br />
setrole | oid | not null<br />
setconfig | text |<br />
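<br />
Such a setting can be removed again with RESET; for instance, using the role and database from the example above:<br />
 marc=# ALTER ROLE marc IN DATABASE marc RESET log_statement;<br />
 ALTER ROLE<br />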
<br />
===Log all changed parameters on a postgresql.conf reload===<br />
<br />
Here is an example, the log_line_prefix parameter has been changed:<br />
LOG:&nbsp; received SIGHUP, reloading configuration files<br />
&lt;%&gt; LOG:&nbsp; parameter "log_line_prefix" changed to "&lt;%u%%%d&gt; "<br />
<br />
===Better unique constraints error messages===<br />
<br />
With 8.4: <br />
marc=# INSERT INTO test VALUES (1);<br />
ERROR: duplicate key value violates unique constraint "test_a_key"<br />
With 9.0:<br />
marc=# INSERT INTO test VALUES (1);<br />
ERROR: duplicate key value violates unique constraint "test_a_key"<br />
DETAIL: Key (a)=(1) already exists.<br />
This will make diagnosing constraint violation errors much easier.<br />
<br />
===vacuumdb --analyze-only===<br />
<br />
As its name indicates, this option makes vacuumdb run only ANALYZE, without vacuuming. This can be useful in cron jobs, for instance.<br />
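For instance (assuming a database named 'mydb'):<br />
 vacuumdb --analyze-only mydb<br />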
<br />
<br />
<br />
==Performance==<br />
<br />
It wouldn't be a new version of PostgreSQL if it didn't get faster, now would it? While 9.0 is not a "performance release", it does add new features which make some specific operations up to 1000% faster.<br />
<br />
===64 bit binaries for Windows===<br />
<br />
It is now possible to compile PostgreSQL for Windows as a 64-bit binary, and the PostgreSQL project is releasing 64-bit packages.<br />
<br />
This has a number of advantages for Windows users: better performance on 64-bit number operations (like BIGINT and BIGSERIAL), the ability to use over 2GB of work_mem, and enhanced compatibility with 64-bit versions of PostgreSQL running on Linux. This last is particularly important given Hot Standby.<br />
<br />
Note, however, that there is no evidence yet that the 500MB shared_buffers limit (beyond which performance degrades, as observed with the 32-bit Windows version) is lifted by this 64-bit version. There is also the limitation that many third-party open-source libraries are not available in 64-bit builds for Windows, so you may not be able to add all PostgreSQL extensions. Test reports welcome!<br />
<br />
===Join Removal===<br />
<br />
This new optimization allows us to remove unnecessary joins from SQL execution plans.<br />
<br />
When using automatically generated SQL, such as from ORM (Object-Relational Mapping) tools, it is possible for the SQL to be sub-optimal. Removing unnecessary joins can improve query plans by an order of magnitude in some cases.<br />
<br />
This is particularly important for databases that use many joins and nested views.<br />
<br />
marc=# CREATE TABLE t1 (a int);<br />
CREATE TABLE<br />
marc=# CREATE TABLE t2 (b int);<br />
CREATE TABLE<br />
marc=# CREATE TABLE t3 (c int);<br />
CREATE TABLE<br />
We put a little bit of data with a generate_series…<br />
<br />
marc=# EXPLAIN SELECT t1.a,t2.b from t1 join t2 on (t1.a=t2.b) left join t3 on (t1.a=t3.c);<br />
QUERY PLAN <br />
------------------------------------------------------------------------------<br />
Merge Right Join (cost=506.24..6146.24 rows=345600 width=8)<br />
Merge Cond: (t3.c = t1.a)<br />
-&gt; Sort (cost=168.75..174.75 rows=2400 width=4)<br />
Sort Key: t3.c<br />
-&gt; Seq Scan on t3 (cost=0.00..34.00 rows=2400 width=4)<br />
-&gt; Materialize (cost=337.49..853.49 rows=28800 width=8)<br />
-&gt; Merge Join (cost=337.49..781.49 rows=28800 width=8)<br />
Merge Cond: (t1.a = t2.b)<br />
-&gt; Sort (cost=168.75..174.75 rows=2400 width=4)<br />
Sort Key: t1.a<br />
-&gt; Seq Scan on t1 (cost=0.00..34.00 rows=2400 width=4)<br />
-&gt; Sort (cost=168.75..174.75 rows=2400 width=4)<br />
Sort Key: t2.b<br />
-&gt; Seq Scan on t2 (cost=0.00..34.00 rows=2400 width=4)<br />
<br />
For now, everything is normal, and we get the same behavior as in 8.4. But let's imagine that there is a UNIQUE constraint on the 'c' column of t3. In that case, the join on t3 serves no purpose, theoretically speaking: the number of rows returned won't change, and neither will their content. This is because the column is UNIQUE, the join is a LEFT JOIN, and no column of t3 is retrieved. If the column weren't UNIQUE, the join could return more rows. If it weren't a LEFT JOIN, the join could filter out some rows.<br />
<br />
With 9.0:<br />
marc=# ALTER TABLE t3 ADD UNIQUE (c);<br />
NOTICE: ALTER TABLE / ADD UNIQUE will create implicit index "t3_c_key" for table "t3"<br />
ALTER TABLE<br />
marc=# EXPLAIN SELECT t1.a,t2.b from t1 join t2 on (t1.a=t2.b) left join t3 on (t1.a=t3.c);<br />
QUERY PLAN <br />
------------------------------------------------------------------<br />
Merge Join (cost=337.49..781.49 rows=28800 width=8)<br />
Merge Cond: (t1.a = t2.b)<br />
-&gt; Sort (cost=168.75..174.75 rows=2400 width=4)<br />
Sort Key: t1.a<br />
-&gt; Seq Scan on t1 (cost=0.00..34.00 rows=2400 width=4)<br />
-&gt; Sort (cost=168.75..174.75 rows=2400 width=4)<br />
Sort Key: t2.b<br />
-&gt; Seq Scan on t2 (cost=0.00..34.00 rows=2400 width=4)<br />
(8 rows)<br />
<br />
===IS NOT NULL can now use indexes===<br />
<br />
For this demonstration, we will compare the 8.4 and 9.0 versions (the table I created contains mostly nulls):<br />
<br />
With 8.4:<br />
marc=# EXPLAIN ANALYZE SELECT max(a) from test;<br />
QUERY PLAN <br />
------------------------------------------------------------------------------------------------------------------------------------------------<br />
Result (cost=0.03..0.04 rows=1 width=0) (actual time=281.320..281.321 rows=1 loops=1)<br />
InitPlan 1 (returns $0)<br />
-&gt; Limit (cost=0.00..0.03 rows=1 width=4) (actual time=281.311..281.313 rows=1 loops=1)<br />
-&gt; Index Scan Backward using idxa on test (cost=0.00..29447.36 rows=1001000 width=4) (actual time=281.307..281.307 rows=1 loops=1)<br />
Filter: (a IS NOT NULL)<br />
Total runtime: 281.360 ms<br />
(6 rows)<br />
With 9.0:<br />
marc=# EXPLAIN ANALYZE SELECT max(a) from test;<br />
QUERY PLAN <br />
--------------------------------------------------------------------------------------------------------------------------------------------<br />
Result (cost=0.08..0.09 rows=1 width=0) (actual time=0.100..0.102 rows=1 loops=1)<br />
InitPlan 1 (returns $0)<br />
-&gt; Limit (cost=0.00..0.08 rows=1 width=4) (actual time=0.092..0.093 rows=1 loops=1)<br />
-&gt; Index Scan Backward using idxa on test (cost=0.00..84148.06 rows=1001164 width=4) (actual time=0.089..0.089 rows=1 loops=1)<br />
Index Cond: (a IS NOT NULL)<br />
Total runtime: 0.139 ms<br />
(6 rows)<br />
The difference is that 9.0 only scans the non-null keys in the index, while 8.4 has to check in the table (a Filter step, where 9.0 uses an index condition). In this precise use case, the gain is really big.<br />
<br />
===Use of index to get better statistics on the fly===<br />
<br />
Before explaining this new feature, let's talk about histograms: PostgreSQL, like some other databases, uses a statistical optimizer. This means that when planning a query, it has (or should have) an approximately correct idea of how many records each step of the query will return. To do this, it uses statistics, such as the approximate number of records in a table, its size, most common values, and histograms. PostgreSQL uses these to estimate the number of records returned by a WHERE clause on a column, depending on the value or range asked for in that WHERE clause.<br />
<br />
In some cases, these histograms rapidly become outdated and turn into a problem for certain SQL queries. For instance, consider a log table into which timestamped records are inserted, and from which we most often want to retrieve the records from the last 5 minutes.<br />
<br />
In this specific case, it was impossible before 9.0 to get correct statistics. Now, when PostgreSQL detects while planning that a query asks for a 'range scan' on a value larger than the largest value in the histogram (or smaller than the smallest), that is, larger than any value seen during the last statistics calculation, and the column has an index, it retrieves the max (or min) value for this column using the index BEFORE actually executing the query, in order to get more realistic statistics. As PostgreSQL uses an index for this, there HAS to be an index, of course.<br />
<br />
Here comes an example. The a column of the test table has already been filled with a lot of dates, all in the past. Its statistics are up to date.<br />
<br />
It's 13:37, and I haven't inserted anything after 13:37 yet.<br />
marc=# EXPLAIN ANALYZE select * from test where a &gt; '2010-06-03 13:37:00';<br />
QUERY PLAN <br />
--------------------------------------------------------------------------------------------------------------<br />
Index Scan using idxtsta on test (cost=0.00..8.30 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=1)<br />
Index Cond: (a &gt; '2010-06-03 13:37:00'::timestamp without time zone)<br />
Total runtime: 0.027 ms<br />
(3 rows)<br />
Everything's normal. The upper boundary of the histogram is '2010-06-03 13:36:16.830007' (this information comes from pg_stats). There is no way of guessing how many records are larger than 13:37, and with 8.4, PostgreSQL would have continued estimating '1' until the next analyze.<br />
marc=# DO LANGUAGE plpgsql<br />
$$<br />
DECLARE<br />
i int;<br />
BEGIN<br />
FOR i IN 1..10000 LOOP<br />
INSERT INTO test VALUES (clock_timestamp());<br />
END LOOP;<br />
END<br />
$$<br />
;<br />
<br />
(I must say I really like 'DO').<br />
We just inserted 10000 records with a date larger than 13:37.<br />
marc=# EXPLAIN ANALYZE SELECT * FROM test WHERE a &gt; '2010-06-03 13:37:00';<br />
QUERY PLAN <br />
-----------------------------------------------------------------------------------------------------------------------<br />
Index Scan using idxtsta on test (cost=0.00..43.98 rows=1125 width=8) (actual time=0.012..13.590 rows=10000 loops=1)<br />
Index Cond: (a &gt; '2010-06-03 13:37:00'::timestamp without time zone)<br />
Total runtime: 23.567 ms<br />
(3 rows)<br />
The estimated row count isn't 0 or 1 anymore. The statistics haven't been updated, though:<br />
marc=# SELECT last_autoanalyze FROM pg_stat_user_tables WHERE relname = 'test';<br />
last_autoanalyze <br />
-------------------------------<br />
2010-06-03 13:36:21.553477+02<br />
(1 row)<br />
We still have an error of one order of magnitude (a factor of 10) in the estimate. But that's not bad: without this enhancement, the error would have been four orders of magnitude (a factor of 10,000). A much smaller error makes it far more likely that we'll get a good plan for this kind of query.<br />
<br />
===Explain buffers, hashing statistics, xml, json, yaml, new optional explain syntax===<br />
<br />
Here is EXPLAIN ANALYZE as we all know it:<br />
marc=# EXPLAIN ANALYZE SELECT a, sum(c) FROM pere JOIN fils ON (pere.a = fils.b) WHERE b BETWEEN 1000 AND 300000 GROUP BY a;<br />
QUERY PLAN <br />
---------------------------------------------------------------------------------------------------------------------------------<br />
HashAggregate (cost=905.48..905.86 rows=31 width=8) (actual time=0.444..0.453 rows=6 loops=1)<br />
-&gt; Nested Loop (cost=10.70..905.32 rows=31 width=8) (actual time=0.104..0.423 rows=6 loops=1)<br />
-&gt; Bitmap Heap Scan on fils (cost=10.70..295.78 rows=31 width=8) (actual time=0.040..0.154 rows=30 loops=1)<br />
Recheck Cond: ((b &gt;= 1000) AND (b &lt;= 300000))<br />
-&gt; Bitmap Index Scan on fils_pkey (cost=0.00..10.69 rows=31 width=0) (actual time=0.023..0.023 rows=30 loops=1)<br />
Index Cond: ((b &gt;= 1000) AND (b &lt;= 300000))<br />
-&gt; Index Scan using pere_pkey on pere (cost=0.00..19.65 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=30)<br />
Index Cond: (pere.a = fils.b)<br />
Total runtime: 0.560 ms<br />
(9 rows)<br />
To get access to the newly available information, use the new syntax:<br />
<br />
EXPLAIN [ ( { ANALYZE boolean | VERBOSE boolean | COSTS boolean | BUFFERS boolean | FORMAT { TEXT | XML | JSON | YAML } } [, ...] ) ] instruction<br />
For instance:<br />
marc=# EXPLAIN (ANALYZE true, VERBOSE true, BUFFERS true) SELECT a, sum(c) FROM pere JOIN fils ON (pere.a = fils.b) WHERE b BETWEEN 1000 AND 300000 GROUP BY a;<br />
QUERY PLAN<br />
-------------------------------------------------------------------------------------------------------------------------------------<br />
HashAggregate (cost=905.48..905.86 rows=31 width=8) (actual time=1.326..1.336 rows=6 loops=1)<br />
Output: pere.a, sum(fils.c)<br />
Buffers: shared hit=58 read=40<br />
-&gt; Nested Loop (cost=10.70..905.32 rows=31 width=8) (actual time=0.278..1.288 rows=6 loops=1)<br />
Output: pere.a, fils.c<br />
Buffers: shared hit=58 read=40<br />
-&gt; Bitmap Heap Scan on public.fils (cost=10.70..295.78 rows=31 width=8) (actual time=0.073..0.737 rows=30 loops=1)<br />
Output: fils.b, fils.c<br />
Recheck Cond: ((fils.b &gt;= 1000) AND (fils.b &lt;= 300000))<br />
Buffers: shared hit=4 read=28<br />
-&gt; Bitmap Index Scan on fils_pkey (cost=0.00..10.69 rows=31 width=0) (actual time=0.030..0.030 rows=30 loops=1)<br />
Index Cond: ((fils.b &gt;= 1000) AND (fils.b &lt;= 300000))<br />
Buffers: shared hit=3<br />
-&gt; Index Scan using pere_pkey on public.pere (cost=0.00..19.65 rows=1 width=4) (actual time=0.013..0.014 rows=0 loops=30)<br />
Output: pere.a<br />
Index Cond: (pere.a = fils.b)<br />
Buffers: shared hit=54 read=12<br />
Total runtime: 1.526 ms<br />
(18 rows)<br />
VERBOSE displays the 'Output' lines (it already existed on 8.4).<br />
<br />
BUFFERS displays data about buffers (input-output operations performed by the query): hit is the number of blocks obtained directly from shared_buffers, read is the number of blocks requested from the operating system. Here, there was very little data in shared_buffers.<br />
<br />
One can also ask for a format other than plain text. For a human user, it's not very useful, but for people developing GUIs on top of EXPLAIN, it simplifies development: they can get rid of a custom 'explain' parser (and its potential bugs) and use a more standard one, such as an XML parser.<br />
<br />
Costs display can also be deactivated with COSTS false.<br />
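For instance, to display only the plan structure, without cost estimates (using the test table from the earlier examples):<br />
 marc=# EXPLAIN (COSTS false) SELECT * FROM test;<br />
 QUERY PLAN<br />
 ------------------<br />
 Seq Scan on test<br />
 (1 row)<br />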
<br />
===Per tablespace seq_page_cost/random_page_cost===<br />
<br />
marc=# ALTER TABLESPACE pg_default SET ( random_page_cost = 10, seq_page_cost=5);<br />
ALTER TABLESPACE<br />
We just changed random_page_cost and seq_page_cost for all the objects contained in pg_default. Why would we do that?<br />
<br />
The use case is when different tablespaces have different performance: for instance, when you keep critical data on an SSD drive, or historical data on an older disk array that is slower than the brand new array you use for active data. This makes it possible to tell PostgreSQL that not all your tablespaces behave the same way from a performance point of view. This is only useful, of course, for quite big databases.<br />
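For instance, for a tablespace on SSD (the tablespace name 'ssd_space' and the cost value are only illustrative; tune them for your hardware):<br />
 marc=# ALTER TABLESPACE ssd_space SET ( random_page_cost = 1.5 );<br />
 ALTER TABLESPACE<br />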
<br />
===Force distinct statistics on a column===<br />
<br />
This makes it possible to force the number of distinct values for a column. It mustn't be used lightly, but only when ANALYZE on this column can't get a good value.<br />
<br />
Here's how to do this:<br />
marc=# ALTER TABLE test ALTER COLUMN a SET (n_distinct = 2);<br />
ALTER TABLE<br />
ANALYZE has to be run again for this to be taken into account:<br />
marc=# ANALYZE test;<br />
ANALYZE<br />
Let's try now:<br />
marc=# EXPLAIN SELECT distinct * from test;<br />
QUERY PLAN <br />
------------------------------------------------------------------<br />
HashAggregate (cost=6263.00..6263.02 rows=2 width=8)<br />
-&gt; Seq Scan on test (cost=0.00..5338.00 rows=370000 width=8)<br />
(2 rows)<br />
This is an example of what SHOULDN'T be done: there really ARE 370,000 distinct values in my table. Now my execution plans may be very bad.<br />
<br />
If n_distinct is positive, it's the number of distinct values.<br />
<br />
If it's negative (between 0 and -1), it's a multiplier on the estimated number of records in the table: for instance, -0.2 means there is one distinct value for every 5 records in the table.<br />
<br />
0 brings the behavior back to normal (ANALYZE estimates distinct by itself).<br />
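To undo the bad setting shown above, set n_distinct back to 0 and run ANALYZE again:<br />
 marc=# ALTER TABLE test ALTER COLUMN a SET (n_distinct = 0);<br />
 ALTER TABLE<br />
 marc=# ANALYZE test;<br />
 ANALYZE<br />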
<br />
Don't change this parameter unless you are completely sure you have correctly diagnosed your problem. Otherwise, rest assured that performance will degrade.<br />
<br />
===Statement logged by auto_explain===<br />
<br />
The auto_explain contrib module now logs the statement along with its plan, which makes it much easier to use.<br />
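As a quick illustration, a superuser can load it in a single session and log the plan of every statement (parameter names as documented for this contrib module):<br />
 marc=# LOAD 'auto_explain';<br />
 LOAD<br />
 marc=# SET auto_explain.log_min_duration = 0;<br />
 SET<br />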
<br />
===Buffers accounting for pg_stat_statements===<br />
<br />
This already very useful contrib module now also provides data about buffers. pg_stat_statements, as a reminder, collects statistics on the queries run on the database. Until now, it stored the query's code, number of executions, accumulated runtime, and accumulated number of returned rows. It now collects buffer operations too.<br />
marc=# SELECT * from pg_stat_statements order by total_time desc limit 2;<br />
-[ RECORD 1 ]-------+---------------------<br />
userid | 10<br />
dbid | 16485<br />
query | SELECT * from table1 ;<br />
calls | 2<br />
total_time | 0.491229<br />
rows | 420000<br />
shared_blks_hit | 61<br />
shared_blks_read | 2251<br />
shared_blks_written | 0<br />
local_blks_hit | 0<br />
local_blks_read | 0<br />
local_blks_written | 0<br />
temp_blks_read | 0<br />
temp_blks_written | 0<br />
-[ RECORD 2 ]-------+---------------------<br />
userid | 10<br />
dbid | 16485<br />
query | SELECT * from table2;<br />
calls | 2<br />
total_time | 0.141445<br />
rows | 200000<br />
shared_blks_hit | 443<br />
shared_blks_read | 443<br />
shared_blks_written | 0<br />
local_blks_hit | 0<br />
local_blks_read | 0<br />
local_blks_written | 0<br />
temp_blks_read | 0<br />
temp_blks_written | 0<br />
When this contrib module is installed, one can now answer questions such as:<br />
* Which query has the biggest accumulated runtime?<br />
* Which query generates the most I/O operations? (we still can't know whether the data was found in the operating system's cache)<br />
* Which query mostly hits the cache (and hence won't get faster if we make the cache bigger)?<br />
* Which query modifies the most blocks?<br />
* Which queries perform sorts?<br />
'local' counts buffer operations on temporary tables, while 'temp' counts operations on the temporary files a backend uses for local operations such as sorts and hashes. If there are many reads and writes among them, you might benefit from increasing temp_buffers (for 'local') or work_mem (for 'temp').<br />
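For instance, a sketch of a query finding the five statements that read the most blocks from outside the cache (column names as shown in the record above):<br />
 marc=# SELECT query, calls, shared_blks_read<br />
 marc-# FROM pg_stat_statements<br />
 marc-# ORDER BY shared_blks_read DESC LIMIT 5;<br />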
<br />
==Stored Procedures==<br />
<br />
PostgreSQL isn't just a database, it's a whole application platform. Many of our users write entire applications using stored procedures and functions. So, it's no surprise that 9.0 brings a number of improvements in database procedural code:<br />
<br />
===PL/pgSQL by default===<br />
<br />
You no longer have to add PL/pgSQL to your databases, as it is now installed by default. This has been requested for a long time.<br />
<br />
===Many improvements to PL languages===<br />
<br />
Many languages have been vastly improved, PL/Perl for instance. Read the release notes if you want more details; there are too many to list here.<br />
<br />
===Anonymous Functions (aka Anonymous Blocks)===<br />
<br />
This new feature is for creating run-once functions. Effectively, it allows you to run stored procedure code on the command line or dynamically, as you can on SQL Server and Oracle. Unlike those, however, PostgreSQL lets you run an anonymous function in any installed procedural language, of the more than a dozen that PostgreSQL supports.<br />
<br />
This feature will be very useful for schema upgrade scripts for instance. Here is a slightly different version of the 'GRANT SELECT ON ALL TABLES' that will be seen later in this document, giving SELECT rights to a bunch of tables, depending on the table owner, and excluding two schemas:<br />
DO language plpgsql $$<br />
DECLARE<br />
vr record;<br />
<br />
BEGIN<br />
<br />
FOR vr IN SELECT tablename FROM pg_tables WHERE tableowner = 'marc' AND schemaname NOT IN ('pg_catalog','information_schema')<br />
LOOP<br />
EXECUTE 'GRANT SELECT ON ' || vr.tablename || ' TO toto';<br />
END LOOP;<br />
END<br />
$$;<br />
Up to 8.4, this would have required creating a function (with CREATE FUNCTION), running it, then removing it (with DROP FUNCTION), all of which requires having the rights to do so. 9.0 simplifies this kind of procedure.<br />
<br />
Anonymous functions are also called "anonymous code blocks" in the software industry.<br />
<br />
===Named Parameter Calls===<br />
<br />
Combined with the Default Parameters introduced in version 8.4, named parameters allow for dynamic calling of functions with variable numbers of arguments, much as they would be inside a programming language. Named parameters are familiar to users of SQL Server or Sybase, but PostgreSQL does one better by supporting both named parameter calls ''and'' function overloading.<br />
<br />
The chosen syntax to name parameters is the following:<br />
CREATE FUNCTION test (a int, b text) RETURNS text AS $$<br />
DECLARE<br />
value text;<br />
BEGIN<br />
value := 'a is ' || a::text || ' and b is ' || b;<br />
RETURN value;<br />
END;<br />
$$ LANGUAGE plpgsql;<br />
Until now, we wrote:<br />
SELECT test(1,'foo');<br />
test <br />
-------------------------<br />
a is 1 and b is foo<br />
(1 row)<br />
Now this explicit syntax can be used:<br />
SELECT test( b:='foo', a:=1);<br />
test <br />
-------------------------<br />
a is 1 and b is foo<br />
(1 row)<br />
Named parameters should eliminate the need to write many overloaded "wrapper" functions. Note that this does add a backwards compatibility issue: you are no longer able to rename function parameters using CREATE OR REPLACE FUNCTION, but must now drop and recreate the function.<br />
<br />
===ALIAS keyword===<br />
<br />
ALIAS can now be used. As its name suggests, it can be used to alias variable names to other names.<br />
<br />
The syntax is <tt>new_name ALIAS FOR old_name</tt>. This is put in the DECLARE section of PL/pgSQL code.<br />
<br />
It has two main use cases:<br />
* to give names to PL function variables:<br />
myparam ALIAS FOR $1<br />
* to rename potentially conflicting variables. In a trigger for instance:<br />
<br />
new_value ALIAS FOR new<br />
: (without this, we might have conflicted with the NEW variable in the trigger function).<br />
<br />
==Advanced Features==<br />
<br />
Some features in PostgreSQL are cutting-edge database features which are pretty much "PostgreSQL only". This is why we're "the most advanced database". These features enable new types of applications.<br />
<br />
===Exclusion constraints===<br />
<br />
Exclusion constraints are very similar to unique constraints. They can be seen as unique constraints using operators other than '=': a unique constraint defines a set of columns for which two records in the table cannot be identical.<br />
<br />
To illustrate this, we will use the example provided by this feature's author, using the temporal data type, which he also developed. This data type stores time ranges, such as 'the time range from 10:15 to 11:15'.<br />
<br />
First, we need to retrieve the temporal module here: http://pgfoundry.org/projects/temporal/ , then compile and install it as a contrib module (run the provided SQL script). We may also need to install the btree_gist contrib module; from source, one can run 'make install' in the contrib/btree_gist directory to do the same.<br />
<br />
CREATE TABLE reservation<br />
(<br />
room TEXT,<br />
professor TEXT,<br />
during PERIOD);<br />
<br />
ALTER TABLE reservation ADD CONSTRAINT test_exclude EXCLUDE USING gist (room WITH =,during WITH &amp;&amp;);<br />
With this, we declare that a record must be rejected (exclusion constraint) if one already exists satisfying both conditions: 'the same room' and 'an intersecting time range' (the &amp;&amp; operator).<br />
marc=# INSERT INTO reservation (professor,room,during) VALUES ( 'mark', 'tech room', period('2010-06-16 09:00:00', '2010-06-16 10:00:00'));<br />
INSERT 0 1<br />
marc=# INSERT INTO reservation (professor,room,during) VALUES ( 'john', 'chemistry room', period('2010-06-16 09:00:00', '2010-06-16 11:00:00'));<br />
INSERT 0 1<br />
marc=# INSERT INTO reservation (professor,room,during) VALUES ( 'mark', 'chemistry room', period('2010-06-16 10:00:00', '2010-06-16 11:00:00'));<br />
ERROR: conflicting key value violates exclusion constraint "test_exclude"<br />
DETAIL: Key (room, during)=(chemistry room, [2010-06-16 10:00:00+02, 2010-06-16 11:00:00+02)) conflicts with existing key (room, during)=(chemistry room, [2010-06-16 09:00:00+02, 2010-06-16 11:00:00+02)).<br />
The insert is forbidden, as the chemistry room is already reserved from 9 to 11.<br />
<br />
Exclusion constraints may also be used with arrays, geographic data, or other non-scalar data in order to implement advanced scientific and calendaring applications. No other database system has this feature.<br />
<br />
===Message passing in NOTIFY/pg_notify===<br />
<br />
Messages can now be passed using NOTIFY. Here is how:<br />
* Subscribe in session 1 to the 'instant_messenging' queue.<br />
: Session 1:<br />
marc=# LISTEN instant_messenging;<br />
LISTEN<br />
* Send a notification through 'instant_messenging', from another session<br />
: Session 2:<br />
marc=# NOTIFY instant_messenging, 'You just received a message';<br />
NOTIFY<br />
* Check the content of the queue in the first session<br />
: Session 1:<br />
marc=# LISTEN instant_messenging;<br />
LISTEN<br />
Asynchronous notification "instant_messenging" with payload "You just received a message" received from server process with PID 5943.<br />
<br />
So we can now associate messages (payloads) with notifications, making NOTIFY even more useful.<br />
<br />
Let's also mention the new pg_notify function. With it, the second session's code can also be:<br />
SELECT pg_notify('instant_messenging','You just received a message');<br />
This can simplify some code, in the case of a program managing a lot of different queues.<br />
<br />
===Hstore contrib enhancements===<br />
<br />
This already powerful contrib module has become even more powerful:<br />
* The size limit on keys and values has been removed.<br />
* GROUP BY and DISTINCT can now be used.<br />
* New operators and functions have been added.<br />
<br />
An example would take too long; this module has a lot of features. Go read its documentation!<br />
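Still, a tiny taste of the key/value store (assuming the hstore contrib module is installed; the keys are arbitrary):<br />
 marc=# SELECT 'a=>1, b=>2'::hstore -> 'b';<br />
 ?column?<br />
 ----------<br />
 2<br />
 (1 row)<br />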
<br />
===Unaccent filtering dictionary===<br />
<br />
Filtering dictionaries can now be set up. This is about Full Text Search dictionaries.<br />
<br />
These dictionaries' purpose is to apply a first filter on words before lexemizing them. The module presented here is the first one to use this mechanism. Filtering can consist of removing words or modifying them.<br />
<br />
Unaccent doesn't remove words; it removes accents (all diacritical signs, as a matter of fact), replacing accented characters with unaccented ones (many people, at least in French, don't type them). Unaccent is a contrib module.<br />
<br />
Installing it, as with all contrib modules, is as easy as:<br />
 psql mydb &lt; contribs_path/unaccent.sql<br />
We'll now follow unaccent's documentation, the example being the filtering of French words.<br />
<br />
Let's create a new 'fr' dictionary (keeping standard 'french' dictionary clean): <br />
marc=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );<br />
CREATE TEXT SEARCH CONFIGURATION<br />
The next statement alters the 'fr' configuration for word and similar lexemes. These now have to go through unaccent and french_stem instead of only french_stem.<br />
marc=# ALTER TEXT SEARCH CONFIGURATION fr<br />
>ALTER MAPPING FOR hword, hword_part, word<br />
>WITH unaccent, french_stem;<br />
ALTER TEXT SEARCH CONFIGURATION<br />
<br />
SELECT to_tsvector('fr','Hôtels de la Mer');<br />
to_tsvector <br />
-------------------<br />
'hotel':1 'mer':4<br />
(1 row)<br />
<br />
marc=# SELECT to_tsvector('fr','Hôtel de la Mer') @@ to_tsquery('fr','Hotels');<br />
?column?<br />
----------<br />
t<br />
(1 row)<br />
It's now easy, without changing a single line of code in the client application, and while keeping accented characters in the database, to look up words without taking accents into account.<br />
<br />
<br />
<br />
<br />
===get_bit and set_bit for bit strings===<br />
<br />
Here is a very simple example. These functions manipulate individual bits in a bit() value.<br />
marc=# SELECT set_bit('1111'::bit(4),2,0);<br />
set_bit<br />
---------<br />
1101<br />
(1 row)<br />
<br />
<br />
marc=# SELECT get_bit('1101'::bit(4),2);<br />
get_bit<br />
---------<br />
0<br />
(1 row)<br />
<br />
<br />
=Backwards Compatibility and Upgrade Issues=<br />
<br />
The PostgreSQL project has a commitment not to break backwards compatibility when we can possibly avoid doing so. Sometimes, however, we have to break things in order to add new features or fix longstanding problematic behavior. Some of these issues are documented below.<br />
<br />
==PL/pgSQL changes which may cause regressions==<br />
<br />
There are two changes in PL/pgSQL which may break code that works in 8.4 or earlier, meaning PL/pgSQL functions should be audited before migrating to 9.0 to prevent possible runtime errors. Many of these changes come about from unifying the lexer for SQL and PL/pgSQL, an important architectural improvement which has made several new features possible.<br />
<br />
===Removal of column/variable name ambiguity===<br />
<br />
In 8.4 and earlier, PL/pgSQL variables take precedence over a table or view column with the same name. While this behaviour is consistent, it is a potential source of coding errors. 9.0 throws a runtime error if this situation occurs:<br />
<br />
marc=# DO LANGUAGE plpgsql<br />
$$<br />
DECLARE<br />
a int;<br />
BEGIN<br />
SELECT a FROM test;<br />
END<br />
$$<br />
;<br />
ERROR: column reference "a" is ambiguous<br />
LINE 1: select a from test<br />
DETAIL: It could refer to either a PL/pgSQL variable or a table column.<br />
QUERY: select a from test<br />
CONTEXT: PL/pgSQL function "inline_code_block" line 4 at SQL statement<br />
<br />
This behaviour can be altered globally in postgresql.conf, or on a per function basis by inserting one of these three options in the function declaration:<br />
<br />
#variable_conflict error (default)<br />
#variable_conflict use_variable (variable name takes precedence - pre-9.0 behaviour)<br />
#variable_conflict use_column (column name takes precedence)<br />
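For instance, to restore the pre-9.0 behaviour for a single function (the function and its body are just a sketch; the option goes at the very start of the function text):<br />
 CREATE OR REPLACE FUNCTION f() RETURNS int AS $$<br />
 #variable_conflict use_variable<br />
 DECLARE<br />
 a int := 1;<br />
 BEGIN<br />
 SELECT a FROM test INTO a;<br />
 RETURN a;<br />
 END<br />
 $$ LANGUAGE plpgsql;<br />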
<br />
The [http://www.postgresql.org/docs/9.0/static/plpgsql-implementation.html manual] contains more details.<br />
<br />
===Reserved words===<br />
<br />
From 9.0, use of unquoted reserved words as PL/PgSQL variable names is no longer permitted:<br />
<br />
marc=# DO LANGUAGE plpgsql<br />
$$<br />
DECLARE<br />
table int;<br />
BEGIN<br />
table :=table+1;<br />
END<br />
$$<br />
;<br />
ERROR: syntax error at or near "table"<br />
LINE 6: table :=table+1;<br />
<br />
The correct syntax is:<br />
<br />
marc=# DO LANGUAGE plpgsql<br />
$$<br />
DECLARE<br />
"table" int;<br />
BEGIN<br />
"table" :="table"+1;<br />
END<br />
$$<br />
;<br />
DO<br />
<br />
Best practice is of course to avoid reserved words completely.<br />
<br />
[[Category:PostgreSQL 9.0]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Binary_Replication_Tutorial&diff=16626Binary Replication Tutorial2012-04-26T23:47:22Z<p>Schmiddy: /* Recovery.conf */ typofix for pg_standby command, reported by Glen Robertson</p>
<hr />
<div>Welcome to the new PostgreSQL 9 replication and standby databases guide. This new set of features implements possibly the longest awaited functionality in PostgreSQL's history. As a result, a lot of people are going to be trying to deploy standby databases for the first time, and find the process rather unintuitive. This guide is here to help.<br />
<br />
'''Work in progress: only 40% complete'''<br />
<br />
= 5 Minutes to Simple Replication =<br />
<br />
This is the easiest way to set up replication between a master and standby. It requires shutting down the master; other methods are detailed later in this guide.<br />
<br />
What we're going to do is shut down the master and copy the files we need over to the slave server, creating a cloned copy of the master. Because the master is shut down, we don't have to worry about changes being made to it.<br />
<br />
Note: Both the '5 minutes' instructions and the '10 minutes' version which follows do not deal with the complications that arise with a database that uses tablespaces, specifically what to do about the pg_tblspc directory and its contents.<br />
<br />
== Prerequisites ==<br />
<br />
You must have the right setup to make this work:<br />
<br />
* 2 servers with similar operating systems (e.g. both Linux 64-bit).<br />
* The same release of PostgreSQL 9.0 installed on both servers.<br />
* PostgreSQL superuser shell access on both servers.<br />
* Knowledge of how to start, stop and reload Postgres.<br />
* PostgreSQL 9.0 running on Server1.<br />
* A database created and loaded on Server1.<br />
* A postgres or root user with network access between the two servers.<br />
<br />
See the full documentation for more information:<br />
<br />
* [http://www.postgresql.org/docs/9.0/static/warm-standby.html 9.0 Replication Documentation]<br />
* [http://www.postgresql.org/docs/9.1/static/warm-standby.html 9.1 Replication Documentation]<br />
<br />
== Binary Replication in 6 Steps ==<br />
<br />
This 6-step guide, and all of the examples in this tutorial, assume that you have a master server at 192.168.0.1 and a standby server at 192.168.0.2 and that your database and its configuration files are installed at /var/lib/postgresql/data. Replace those with whatever your actual server addresses and directories are.<br />
<br />
1. Edit postgresql.conf on the master to turn on streaming replication. Change these settings:<br />
<br />
listen_addresses = '*'<br />
wal_level = hot_standby<br />
max_wal_senders = 3<br />
<br />
2. Edit pg_hba.conf on the master in order to let the standby connect. <br />
<br />
host replication all 192.168.0.2/32 trust<br />
<br />
3. Edit recovery.conf and postgresql.conf on the standby to start up replication and hot standby. First, in postgresql.conf, change this line:<br />
<br />
hot_standby = on<br />
<br />
Then create a file in the standby's '''data directory''' (which is often the same directory as postgresql.conf and pg_hba.conf, except on some Linux distributions such as Debian and Ubuntu), called recovery.conf, with the following lines:<br />
<br />
standby_mode = 'on'<br />
primary_conninfo = 'host=192.168.0.1'<br />
<br />
4. Shut down the master and copy the files. You want to copy most but not all files between the two servers, excluding the configuration files and the pg_xlog directory. An example rsync script would be:<br />
<br />
rsync -av --exclude pg_xlog --exclude postgresql.conf data/* 192.168.0.2:/var/lib/postgresql/data/<br />
<br />
5. Start the standby first, so that the two servers can't get out of sync. (Messages will be logged about not being able to connect to the primary server; that's OK.)<br />
<br />
6. Start the master.<br />
<br />
== Starting Replication with only a Quick Master Restart ==<br />
<br />
Is taking the master down long enough to copy the files not an option? Then you need the 10-minute version.<br />
<br />
What we're going to do this time is similar to what we did before, cloning the database by copying the files from the master to the slave server. However, because the database will only be shut down briefly - just long enough to activate the changes in the configuration file - after we've copied the data files we will need to copy additional files so that the slave ends up an up-to-date copy of the master. <br />
<br />
So, we will tell the master we're running a backup, copy the data files (not quite the same set of files as before), tell the master the backup is complete, then copy the WAL files in the pg_xlog directory so that when the slave comes up it can make all the changes that were committed to the master database after the backup was started.<br />
<br />
First, start with the same prerequisites as above.<br />
<br />
1. Set the postgresql.conf variables the same in step (1) as above.<br />
<br />
2. Don't close the file yet. You'll need to set two other variables which control the size of your write-ahead log (WAL): wal_keep_segments and checkpoint_segments. Unless you've already done so, you're going to need to increase these, which is usually a good idea for performance anyway. You want the WAL to be big enough that it doesn't get used up in 15 or 20 minutes. If you don't have a clear idea of your write volume, here are some reasonable values based on how busy and how large your database is; a database with large blob objects may require a much larger setting. Remember, these logs will take up disk space, so make sure that you have enough available - space requirements are below.<br />
<br />
checkpoint_segments = 8 <br />
wal_keep_segments = 8 <br />
# light load 500MB<br />
<br />
checkpoint_segments = 16<br />
wal_keep_segments = 32<br />
# moderately busy 1.5GB <br />
<br />
checkpoint_segments = 64<br />
wal_keep_segments = 128<br />
# busy server 5GB<br />
<br />
You don't ''have'' to increase checkpoint_segments in order to increase wal_keep_segments, but it's generally a good idea. Now save the file. <br />
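To sanity-check the figures above: the 9.0 documentation bounds the number of files in pg_xlog by the larger of (2 + checkpoint_completion_target) * checkpoint_segments + 1 and checkpoint_segments + wal_keep_segments + 1, each file being 16MB. The sketch below (a hypothetical helper, assuming the default checkpoint_completion_target of 0.5) computes that bound; the figures quoted above simply add headroom on top.<br />

```shell
# Estimate the maximum size of pg_xlog in MB from the two settings.
# Bound from the 9.0 manual, assuming checkpoint_completion_target = 0.5:
#   max( (2 + 0.5)*checkpoint_segments + 1,
#        checkpoint_segments + wal_keep_segments + 1 ) segments of 16MB each.
estimate_pg_xlog_mb() {
    cs=$1   # checkpoint_segments
    wks=$2  # wal_keep_segments
    a=$(( 2 * cs + cs / 2 + 1 ))   # (2 + 0.5) * cs + 1, in integer math
    b=$(( cs + wks + 1 ))
    if [ "$a" -gt "$b" ]; then n=$a; else n=$b; fi
    echo $(( n * 16 ))
}

estimate_pg_xlog_mb 8 8      # light load   -> 336
estimate_pg_xlog_mb 64 128   # busy server  -> 3088
```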
<br />
3. Edit pg_hba.conf as in (2) in the "Six Steps" above.<br />
<br />
4. Now you need to restart the master. Given the interruption in service, you should probably plan this ahead. <br />
<br />
5. Edit postgresql.conf and recovery.conf on the standby as in (3) above.<br />
<br />
6. Now, we're going to need to copy the files from the master and start the standby. Unlike in the 6-step version, this needs to be done quickly or the standby will fail to sync and you'll need to try again. First step, you need to tell the master you're starting a backup (see below for a more detailed explanation of this). Log in to psql as the database superuser.<br />
<br />
psql -U postgres<br />
# select pg_start_backup('clone',true);<br />
<br />
Note that the string you use as a backup label doesn't matter; use any string you want.<br />
<br />
7. Now, quickly copy all the database files. This rsync is slightly different from the 6-step version:<br />
<br />
rsync -av --exclude pg_xlog --exclude postgresql.conf --exclude postmaster.pid \ <br />
data/* 192.168.0.2:/var/lib/postgresql/data/<br />
<br />
8. As soon as that's done you need to stop the backup on the master:<br />
<br />
# select pg_stop_backup();<br />
<br />
9. As soon as that completes, you need to quickly copy the WAL files from the master to the standby.<br />
<br />
rsync -av data/pg_xlog 192.168.0.2:/var/lib/postgresql/data/<br />
<br />
10. Now, start the standby.<br />
<br />
If you've done this quickly enough, then the standby should catch up with the master and you should be replicating. If not, you'll get this message:<br />
<br />
(Future Revisions note: Message needs to go here)<br />
<br />
... which means you need to try again, possibly with checkpoint_segments and wal_keep_segments higher. If that still doesn't work, you're going to need to use the even more complex archiving method described below.<br />
<br />
Now, the rest of the guide will explain how to deal with more complex situations, such as archive logs, handling security, and maintaining availability, failover and standby promotion.<br />
<br />
= Introduction to Binary Replication =<br />
<br />
Binary replication is also called "Hot Standby" and "Streaming Replication" which are two separate, but complimentary, features of PostgreSQL 9.0 and later. Here's some general information about how they work and what they are for.<br />
<br />
== What Can You Do With Binary Replication? ==<br />
<br />
* Have a simple and complete replica of your production database, preventing all but a couple seconds of data loss even under catastrophic circumstances.<br />
* Load-balance between your read/write master server and multiple read-only servers.<br />
* Run reporting or other long-running queries on a replica server, taking them off your main transaction-processing server.<br />
* Replicate all DDL, including table and index changes, and even creating new databases.<br />
* Replicate a hosted multi-tenant database, making no specific requirements for primary keys or database changes of your users.<br />
<br />
== What Can't You Do With Binary Replication? ==<br />
<br />
* Replicate a specific table, schema, or database. Binary replication is the entire Postgres instance (or "cluster").<br />
* Multi-master replication. Multi-master binary replication is probably technically impossible.<br />
* Replicate between different versions of PostgreSQL, or between different platforms.<br />
* Set up replication without administration rights on the server. Sorry, working on it.<br />
* Replicate data synchronously, guaranteeing zero data loss. But ... this is coming in PostgreSQL 9.1.<br />
<br />
For the reasons above, we expect that Slony-I, Londiste, Bucardo, pgPool2 and other systems will continue to be used.<br />
<br />
== Transaction Logs and Log Shipping ==<br />
<br />
Users who are already familiar with the PostgreSQL transaction log and warm standby can skip this section.<br />
<br />
An individual "instance", "server", or (confusingly) "cluster" of PostgreSQL (hereafter Server) consists of a single postmaster server process connected to a single initialized PostgreSQL data directory (PGDATA), which in turn contains several databases. Each running Server has a transaction log, located in the PGDATA/pg_xlog directory. This transaction log consists of binary snapshots of data, written synchronously to record each change to all databases' data, in case of an unexpected shutdown of the database server (such as in a power failure). This ensures that data is not corrupted and no completed transaction is lost.<br />
<br />
You can also use this log to allow a copy of the original database to replicate changes made to a master database. This was first implemented with the PITR feature in PostgreSQL 8.0, and is known as "log shipping". Log shipping is required for most forms of binary replication.<br />
<br />
This log consists of 16MB segments full of new data pages (8K pages) of the database, not of SQL statements. For this reason no before-and-after auditing is possible via this log, as you cannot know exactly what has changed. Also, the log is treated as a buffer, being deleted as it is no longer needed for crash recovery. More importantly, the data-page format of the log means that log segments can only be applied to a database which is binary-identical to the database which created the log. <br />
<br />
== PITR, Warm Standby, Hot Standby, and Streaming Replication ==<br />
<br />
For the rest of this tutorial, we will refer to the active read-write instance of the Server which generates transaction logs as the "Master" and the passive, read-only or offline instance (or instances) of the Server which receives transaction logs as the "Standby" (or "Standbys"). The term Master/Standby is equivalent to other terminology which may be used in the database industry, such as Master/Slave, Primary/Secondary or Primary/Replica.<br />
<br />
=== PITR ===<br />
<br />
In Point-In-Time Recovery (PITR), transaction logs are copied and saved to storage until needed. Then, when needed, the Standby server can be "brought up" (made active) and transaction logs applied, either stopping when they run out or at a prior point indicated by the administrator. PITR has been available since PostgreSQL version 8.0, and as such will not be documented here.<br />
<br />
PITR is primarily used for database forensics and recovery. It is also useful when you need to back up a very large database, as it effectively supports incremental backups, which pg_dump does not.<br />
<br />
=== Warm Standby ===<br />
<br />
In Warm Standby, transaction logs are copied from the Master and applied to the Standby immediately after they are received, or at a short delay. The Standby is offline (in "recovery mode") and not available for any query workload. This allows the Standby to be brought up to full operation very quickly. Warm Standby has been available since version 8.3, and will not be fully documented here.<br />
<br />
Warm Standby requires Log Shipping. It is primarily used for database failover.<br />
<br />
=== Hot Standby ===<br />
<br />
Hot Standby is identical to Warm Standby, except that the Standby is available to run read-only queries. This offers all of the advantages of Warm Standby, plus the ability to distribute some business workload to the Standby server(s). Hot Standby by itself requires Log Shipping.<br />
<br />
Hot Standby is used both for database failover, and can also be used for load-balancing. In contrast to Streaming Replication, it places no load on the master (except for disk space requirements) and is thus theoretically infinitely scalable. A WAL archive could be distributed to dozens or hundreds of servers via network storage. The WAL files could also easily be copied over a poor quality network connection, or by SFTP.<br />
<br />
However, since Hot Standby replicates by shipping 16MB logs, it is at best minutes behind and sometimes more than that. This can be problematic both from a failover and a load-balancing perspective.<br />
<br />
=== Streaming Replication ===<br />
<br />
Streaming Replication improves either Warm Standby or Hot Standby by opening a network connection between the Standby and the Master database, instead of copying 16MB log files. This allows data changes to be copied over the network almost immediately on completion on the Master. <br />
<br />
In Streaming Replication, the master and the standby have special processes called the walsender and walreceiver which transmit modified data pages over a network port. This requires one fairly busy connection per standby, imposing an incremental load on the master for each additional standby. Still, the load is quite low and a single master should be able to support multiple standbys easily.<br />
<br />
Streaming replication does not require log shipping in normal operation. It may, however, require log shipping to start replication, and can utilize log shipping in order to catch up standbys which fall behind.<br />
<br />
= How to Replicate =<br />
<br />
== Cloning a Live Database ==<br />
<br />
If your workload doesn't allow you to take the master down (and whose does?), things get a bit more complicated. You need to somehow take a "coherent snapshot" of the master, so that you don't have an inconsistent or corrupt database on the standby. Now, in some cases this can be done via filesystem snapshotting tools or similar tricks, but as that approach is tricky and platform-dependent, we're not going to cover it here.<br />
<br />
Instead, we're going to cover the built-in method, which involves keeping a log of all changes applied to the database which happen during the copying process. The steps are essentially the same, regardless of whether you're planning to use just hot standby, streaming replication, or both. There are two parts:<br />
<br />
* Cloning the database files<br />
* Copying the archive logs<br />
<br />
Unintuitive as it is, the latter needs to be set up first, so we're going to start with that.<br />
<br />
== Setting Up Archiving On The Master ==<br />
<br />
Archiving is the process of making an extra copy of each WAL file as it is completed. These log files then need to somehow be accessed by the standby. There are three basic ways to handle this, and you should decide in advance what method you're going to use:<br />
<br />
# Manually<br />
# Automatic file copying from master to standby using rsync or similar<br />
# Writing them to a common shared network file location<br />
<br />
The first method is only appropriate if you're archiving logs only to jump-start streaming replication, and you have a fairly low-traffic database or the ability to stop all writes. The third method is probably the easiest to manage if you have an appropriate network share; it can even be used to support multiple standbys with some extra thought and scripting. All of these methods will be explained below.<br />
<br />
This needs to be turned on on the master, which if it's never been done before may require a restart (sorry, working on it), and will certainly require a reload. You'll need to set the following parameters:<br />
<br />
wal_level = hot_standby<br />
archive_mode = on<br />
archive_command = 'some command'<br />
<br />
What archive command you use depends on which archiving approach you are taking, of course. Here are three examples of commands you might use. Note that you will need to create the "archive" directories.<br />
<br />
# Manual: cp -f %p /var/lib/postgresql/data/archive/%f </dev/null<br />
# Automatic Copy: rsync -a %p 192.168.0.2:/var/lib/pgsql/data/archive/%f<br />
# Network Share: cp -f %p /shares/walarchive/archive/%f </dev/null<br />
<br />
In these commands, %p is replaced by postgres at invocation time with the full path and name of the WAL file, and %f with the name of the file alone. There are more escapes and parameters dealing with WAL archiving which will be detailed later in the tutorial. Note that, in real production, you are unlikely to want to use any commands as simple as the above. In general, you will want to have archive_command call an executable script which traps errors and can be disabled. Examples of such scripts are available in this tutorial.<br />
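One possible shape for such a wrapper is sketched below as a shell function; the names (archive_wal, ARCHIVE_DIR, the archiving_disabled flag file) are all hypothetical, not part of PostgreSQL. Installed as a script, it would be configured as archive_command = 'archive_wal.sh %p %f'.<br />

```shell
# Sketch of an archive_command wrapper.  Invoked with %p (full path of
# the finished WAL segment) and %f (its file name).  postgres retries
# the segment whenever this exits non-zero.
archive_wal() {
    wal_path=$1   # %p
    wal_name=$2   # %f
    dest=${ARCHIVE_DIR:-/var/lib/postgresql/data/archive}

    # Let an administrator pause archiving by touching a flag file.
    if [ -e "$dest/archiving_disabled" ]; then
        return 1
    fi
    # Never silently overwrite: a duplicate usually means two clusters
    # are pointed at the same archive directory.
    if [ -e "$dest/$wal_name" ]; then
        echo "WAL segment $wal_name already archived" >&2
        return 1
    fi
    cp "$wal_path" "$dest/$wal_name"
}
```

Because postgres keeps retrying a segment while archive_command fails, the flag file gives an administrator a safe way to pause archiving temporarily without losing WAL.<br />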
<br />
Now, if archive_mode was originally "off" or if you had to change wal_level, you're going to need to restart the master (sorry, this will be fixed in a later version). If you just needed to change the archive_command, however, only a reload is required.<br />
<br />
Once you've restarted or reloaded, check the master's logs to make sure archiving is working. If it's failing, the master will complain extensively. You might also check that archive log files are being created; run the command "SELECT pg_switch_xlog();" as the superuser to force a new log to be written.<br />
<br />
== Setting Up Archiving on the Standby ==<br />
<br />
The standby needs to be configured to consume logs. This is simpler than the master's setup, and doesn't really change no matter what archive copying strategy you're using.<br />
<br />
== Recovery.conf ==<br />
<br />
On the standby, replication configuration is controlled through a file called, for historical reasons, recovery.conf. If this file is present in PostgreSQL's data directory when PostgreSQL is started, that server will assume it is a standby and attempt to obey it. Generally, there is an example file installed with the other PostgreSQL shared docs. However, that example file covers all of the various replication options at once, so it's often simpler to write your own file, from scratch. Any change to recovery.conf requires a restart of the standby.<br />
<br />
In recovery.conf, you need to add a command to copy the archived WAL files to the standby's own pg_xlog directory. This is the mirror image of the archive_command on the master. Generally, a simple cp command is sufficient:<br />
<br />
restore_command = 'cp -f /var/lib/postgresql/data/archive/%f %p </dev/null'<br />
restore_command = 'cp -f /shares/walarchive/%f %p </dev/null'<br />
<br />
Again, you might want to use a simple shell script which traps error messages, and, importantly, deletes archive files which are no longer needed. If you will be doing only hot standby and not using streaming replication, you probably want to compile the pg_standby binary provided in PostgreSQL's additional modules or "contrib", and use it instead:<br />
<br />
restore_command = 'pg_standby /shares/walarchive %f %p %r'<br />
<br />
More detail on pg_standby is in its documentation.<br />
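On the restore side, a similar wrapper can combine the copy with archive cleanup (the script and function names below are hypothetical). It would be configured as restore_command = 'restore_wal.sh %f %p %r'; %r expands to the name of the oldest WAL file the standby still needs, and since WAL file names sort in creation order, anything sorting before %r can safely be deleted.<br />

```shell
# Sketch of a restore_command wrapper with archive cleanup.
# Invoked with %f (file wanted), %p (destination path), %r (oldest
# file still required for a restartpoint).
restore_wal() {
    wal_name=$1      # %f
    dest_path=$2     # %p
    oldest_kept=$3   # %r
    src=${WAL_ARCHIVE:-/shares/walarchive}

    [ -e "$src/$wal_name" ] || return 1        # not archived (yet)
    cp "$src/$wal_name" "$dest_path" || return 1

    # WAL segment names sort in creation order, so a plain string
    # comparison finds the segments older than %r.
    for f in "$src"/*; do
        [ -e "$f" ] || continue
        base=${f##*/}
        if [ "$base" \< "$oldest_kept" ]; then
            rm -f "$f"
        fi
    done
}
```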
<br />
== Cloning a Snapshot of the Master ==<br />
<br />
Once you have archiving working, you're ready to clone the master database. At this point, it's a simple process:<br />
<br />
# As superuser, issue the command "SELECT pg_start_backup('backup');" on the master.<br />
# Copy all of the database files to the standby.<br />
# Start the standby database.<br />
# Issue the command "SELECT pg_stop_backup();" on the master.<br />
<br />
Of course, each of those steps deserves a little more elaboration. pg_start_backup and pg_stop_backup are special commands you issue on the master in order to create, hold open, and close a "snapshot", which is how we make sure your copy of the database is not inconsistent. They also write special files to the archive log which tell the standby when it has a complete snapshot.<br />
<br />
If you are using the "manual" method of synching the archive logs, immediately after step 4 you need to do one last rsync or copy of the archive logs to the standby.<br />
<br />
When you're done with the cloning, you should see output similar to the below:<br />
<br />
This means that you're up and replicating, and should now be able to run queries on the standby.<br />
<br />
== Failing Over To The Standby ==<br />
<br />
Of course, one of the major reasons to have a standby is in case something (planned or unplanned) causes the master server to shut down. Then you want to "fail over", or stop replication and change the standby to a full read-write master.<br />
<br />
The recommended method is the same regardless of the type of replication or standby: via "trigger file". First, you need to set a configuration option in recovery.conf on the standby:<br />
<br />
trigger_file = '/var/lib/postgresql/data/failover'<br />
<br />
Then, when it's time to fail over, you just create an empty file with that name, such as by using the "touch" command. The standby will notice the file, attempt to apply any remaining WAL records or files it has received, and then switch to read-write or "master" mode. When this happens, you will see a message like this in the Postgres log:<br />
<br />
PostgreSQL will also rename the recovery.conf file to recovery.done in order to prevent having the new master fail on restart. For this reason, the recovery.conf file should be owned by the same user which the server runs as (usually "postgres").<br />
<br />
The alternative to using a trigger file is to fail over manually, by deleting or renaming the recovery.conf file and restarting the standby. This method is inferior because it requires a restart, which would interrupt any read-only connections to the standby currently in use.<br />
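The trigger-file step itself can be wrapped in a small helper that reads the configured trigger_file out of recovery.conf rather than hard-coding the path. This is only a sketch: the function name is hypothetical, and the sed pattern assumes the simple trigger_file = '/path' quoting style shown above.<br />

```shell
# Promote a standby by touching its configured trigger file.
# Usage: promote_standby [data_directory]
promote_standby() {
    data_dir=${1:-/var/lib/postgresql/data}
    # Pull the trigger_file setting out of recovery.conf; assumes the
    # simple  trigger_file = '/path'  form shown above.
    trigger=$(sed -n "s/^trigger_file *= *'\([^']*\)'.*/\1/p" "$data_dir/recovery.conf")
    [ -n "$trigger" ] || return 1
    touch "$trigger"
}
```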
<br />
In a high-availability system, the above activity should be managed automatically in order to avoid downtime. PostgreSQL itself supplies no tools to do this, but numerous third-party utilities such as "Linux heartbeat" are compatible with PostgreSQL replication.<br />
<br />
It's important to prevent the original master from restarting after failover, lest you end up with a "split brain" problem and data loss. There is a substantial body of literature on this, and third-party tools, so we won't discuss them here at this time.<br />
<br />
== Load Balancing ==<br />
<br />
== Managing Archive Logs ==<br />
<br />
== Tuning and Configuration of Binary Replication ==<br />
<br />
== Monitoring Replication ==<br />
<br />
<br />
[[Category:Replication]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Priorities&diff=16502Priorities2012-04-07T21:14:11Z<p>Schmiddy: /* Prioritizing CPU */ Remove text about using "nice" (should have been "renice") only manually, and added link and description of "prioritize" module</p>
<hr />
<div>== Prioritizing users, queries, or databases ==<br />
<br />
PostgreSQL has no facilities to limit what resources a particular user, query, or database consumes, or correspondingly to set priorities such that one user/query/database gets more resources than others. It's necessary to use operating system facilities to achieve what limited prioritization is possible.<br />
<br />
There are three main resources that PostgreSQL users, queries, and databases will contend for:<br />
<br />
* Memory<br />
* CPU<br />
* Disk I/O<br />
<br />
Of these, disk I/O is commonly a bottleneck for database applications, but that's not always the case. Some schema designs and queries are particularly CPU heavy. Others really benefit from having lots of memory to work with, typically for sorting.<br />
<br />
== Are priorities really the problem? ==<br />
<br />
Before struggling too much with prioritizing your queries/users/databases, it's worthwhile to optimize your queries and [[Tuning_Your_PostgreSQL_Server|tune your database]]. You may find that you can get perfectly acceptable performance without playing with priorities or taking extreme measures, using techniques such as:<br />
<br />
* [[Using_EXPLAIN|Improving your queries]]<br />
* Tuning autovacuum to reduce bloat<br />
* [[Performance_Optimization|Generally polishing your cluster's performance]]<br />
* Avoiding use of [[VACUUM FULL]]. That can lead to bloated indexes that eat lots of memory and take forever to scan, wasting disk I/O bandwidth. See the wiki page on [[VACUUM FULL]] for more information.<br />
<br />
=== Is CPU really the bottleneck? ===<br />
<br />
People often complain of pegged (100%) CPU and assume that's the cause of database slowdowns. That's not necessarily the case - a system may show an apparent 100% CPU use, but in fact be mainly limited by I/O bandwidth. Consider the following test, which starts 20 `dd' processes, each reading a different 1GB block from the hard disk at 1GB offsets.<br />
<br />
<pre><br />
for i in `seq 1 20`; do <br />
( dd if=/dev/md0 bs=1M count=1000 skip=$(($i * 1000)) of=/dev/null &)<br />
done<br />
</pre><br />
<br />
results in `top' output of:<br />
<br />
<pre> <br />
top - 14:51:55 up 3 days, 2:09, 5 users, load average: 10.92, 4.94, 2.93<br />
Tasks: 259 total, 3 running, 256 sleeping, 0 stopped, 0 zombie<br />
Cpu(s): 1.6%us, 15.0%sy, 0.0%ni, 0.0%id, 78.6%wa, 0.8%hi, 4.0%si, 0.0%st<br />
Mem: 4055728k total, 3843408k used, 212320k free, 749448k buffers<br />
Swap: 2120544k total, 4144k used, 2116400k free, 2303356k cached<br />
<br />
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND <br />
33 root 15 -5 0 0 0 R 5 0.0 0:26.67 kswapd0 <br />
904 root 20 0 4152 1772 628 D 5 0.0 0:00.62 dd <br />
874 root 20 0 4152 1768 628 D 3 0.0 0:00.74 dd <br />
908 root 20 0 4152 1768 628 D 3 0.0 0:00.80 dd <br />
888 root 20 0 4152 1772 628 D 3 0.0 0:00.44 dd <br />
906 root 20 0 4152 1772 628 D 3 0.0 0:00.56 dd <br />
894 root 20 0 4152 1768 628 D 2 0.0 0:00.49 dd <br />
902 root 20 0 4152 1772 628 D 2 0.0 0:00.46 dd <br />
.... etc .... <br />
</pre><br />
<br />
... which could be confused for a busy CPU, but is really load caused by disk I/O. The key warning sign here is the presence of a high iowait CPU percentage ("%wa"), indicating that much of the apparent load is actually caused by delays in the I/O subsystem. Most of the `dd' processes are in 'D' state - i.e. uninterruptible sleep in a system call - and if you check "wchan" with "ps" you'll see that they're sleeping waiting for I/O. <br />
<br />
Rather than assuming that CPU contention is the issue, it's a good idea to use the available [[Performance Analysis Tools]] to get a better idea of where your system bottlenecks really are.<br />
<br />
== Prioritizing CPU ==<br />
<br />
For adjusting the CPU priority of PostgreSQL processes, you can use "renice" (on UNIX systems), but it's a bit clumsy to do since you need to "renice" the backend of interest, not the client program connected to that backend. You can get the backend process id using the SQL query "SELECT pg_backend_pid()" or by looking at the pg_stat_activity view.<br />
<br />
One significant limitation of "renice", or any approach based on the setpriority() call, is that on most UNIX-like platforms one must be root to lower the numerical priority value (i.e. schedule the process to run more urgently) of a process.<br />
<br />
Increasing the priority of important backends, via a root user's call to "renice", instead of lowering the priority of unimportant ones, may be more effective.<br />
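As a sketch of this approach (the function name is hypothetical), the helper below renices a backend whose PID was obtained from SELECT pg_backend_pid(). Run it on the database server itself; without root it can only raise the niceness value, i.e. make the backend less urgent.<br />

```shell
# Lower the CPU scheduling urgency of one backend.
# Usage: deprioritize_backend <backend_pid> [niceness]
# Non-root users can only make a process *nicer* (raise the value).
deprioritize_backend() {
    pid=$1
    niceness=${2:-10}
    renice -n "$niceness" -p "$pid"
}
```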
<br />
=== prioritize module ===<br />
The [http://pgxn.org/dist/prioritize/ prioritize] extension lets users adjust the CPU priority, in the same way that "renice" does, via the SQL function set_backend_priority(). Normal users may increase the priority value of any backend process running under the same username. Superusers may increase the priority value of any backend process. Just like with using "renice" manually, it is not possible to lower a backend's priority value, since PostgreSQL will not be running as the "root" user.<br />
<br />
If you know your application will be running an unimportant CPU-heavy query, you could have it call set_backend_priority(pg_backend_pid(), 20) after installing the "prioritize" module, so that the process is scheduled for the lowest possible urgency.<br />
<br />
== Prioritizing I/O ==<br />
<br />
I/O is harder. Some operating systems offer I/O priorities for<br />
processes, like Linux's ionice, and you'd think you could use these in a<br />
similar way to how you use 'nice'. Unfortunately, that won't work particularly well,<br />
because a lot of the work PostgreSQL does - especially disk writes - is<br />
done via a separate background writer process working from memory shared<br />
by all backends. Similarly, the write-ahead logs are managed by their<br />
own process via shared memory. Because of those two, it's very hard to effectively give one<br />
user priority over another for writes. ionice should be moderately<br />
effective for reads, though.<br />
<br />
As with "nice", effective control on a per-connection level will require the addition of appropriate helper<br />
functions, and user co-operation is required to achieve per-user priorities.<br />
<br />
Better separation of I/O workloads will require [[Prioritizing_databases_by_separating_into_multiple_clusters|cluster separation]], which has its own costs and is only effective on the per-database level.<br />
<br />
== Prioritizing memory ==<br />
<br />
PostgreSQL does have some [[Tuning_Your_PostgreSQL_Server|tunable parameters]] for memory use that are per-client, particularly <code>[http://www.postgresql.org/docs/current/static/runtime-config-resource.html#GUC-WORK-MEM work_mem]</code> and <code>[http://www.postgresql.org/docs/current/static/runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM maintenance_work_mem]</code>. These may be set within a given connection to allow that backend to use more than the usual amount of memory for things like sorts and index creation. You can set these to conservative, low values in <code>postgresql.conf</code> then use the <code>SET</code> command to assign higher values to them for a particular backend, eg <code>SET work_mem = '100MB';</code>.<br />
<br />
You can set different values for <code>work_mem</code> and <code>maintenance_work_mem</code> using per-user GUC variables. For example:<br />
<br />
<pre><br />
ALTER USER myuser SET work_mem = '50MB';<br />
</pre><br />
<br />
You cannot affect the shared memory allocation done with settings like shared_buffers this way; that value is fixed at database startup time and can't be changed without restarting the server.<br />
<br />
There's no easy way in most operating systems to prioritize memory allocations, so that for example the OS would prefer to swap one backend's memory out instead of another's.<br />
<br />
== External links ==<br />
<br />
* [http://www.cs.cmu.edu/~harchol/Papers/actual-icde-submission.pdf CMU article studying CPU priorities on Postgres and DB2 on TPC-C and TPC-W workloads]<br />
<br />
== Credits ==<br />
Page initially by [[User:Ringerc|Ringerc]] 02:34, 26 November 2009 (UTC)<br />
<br />
[[Category:FAQ]] [[Category:Performance]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Index_Maintenance&diff=14839Index Maintenance2011-07-06T21:09:47Z<p>Schmiddy: Per suggestion from John Pierce, the second query will not work without explict casts to text</p>
<hr />
<div>One day, you will probably need to cope with [http://www.postgresql.org/docs/current/static/routine-reindex.html routine reindexing] on your database, particularly if you don't use VACUUM aggressively enough. A particularly handy command in this area is [http://www.postgresql.org/docs/8.3/static/sql-cluster.html CLUSTER], which can help with other types of cleanup.<br />
<br />
Avoid using [[VACUUM FULL]].<br />
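<br />
When a rebuild really is needed, the usual commands look like this (the table and index names here are hypothetical; note that both REINDEX and CLUSTER take exclusive locks on the table while they run):<br />
<br />
<source lang="sql"><br />
-- Rebuild a single index, or every index on a table:<br />
REINDEX INDEX my_index;<br />
REINDEX TABLE my_table;<br />
<br />
-- CLUSTER rewrites the table in index order and rebuilds all of its<br />
-- indexes, reclaiming dead space in the process (8.3+ syntax):<br />
CLUSTER my_table USING my_index;<br />
</source><br />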
<br />
== Index summary ==<br />
<br />
Here's a sample query to pull the number of rows, the number of indexes, and some info about those indexes for each table. (This only works on 8.3 and later; drop the pg_size_pretty call if you're on an earlier version.)<br />
<br />
{{SnippetInfo|Index summary|lang=SQL|version=>=8.1|category=Performance}}<br />
<source lang="sql"><br />
SELECT<br />
pg_class.relname,<br />
pg_size_pretty(pg_class.reltuples::bigint) AS rows_in_bytes,<br />
pg_class.reltuples AS num_rows,<br />
count(indexname) AS number_of_indexes,<br />
CASE WHEN x.is_unique = 1 THEN 'Y'<br />
ELSE 'N'<br />
END AS UNIQUE,<br />
SUM(case WHEN number_of_columns = 1 THEN 1<br />
ELSE 0<br />
END) AS single_column,<br />
SUM(case WHEN number_of_columns IS NULL THEN 0<br />
WHEN number_of_columns = 1 THEN 0<br />
ELSE 1<br />
END) AS multi_column<br />
FROM pg_namespace <br />
LEFT OUTER JOIN pg_class ON pg_namespace.oid = pg_class.relnamespace<br />
LEFT OUTER JOIN<br />
(SELECT indrelid,<br />
max(CAST(indisunique AS integer)) AS is_unique<br />
FROM pg_index<br />
GROUP BY indrelid) x<br />
ON pg_class.oid = x.indrelid<br />
LEFT OUTER JOIN<br />
( SELECT c.relname AS ctablename, ipg.relname AS indexname, x.indnatts AS number_of_columns FROM pg_index x<br />
JOIN pg_class c ON c.oid = x.indrelid<br />
JOIN pg_class ipg ON ipg.oid = x.indexrelid )<br />
AS foo<br />
ON pg_class.relname = foo.ctablename<br />
WHERE <br />
pg_namespace.nspname='public'<br />
AND pg_class.relkind = 'r'<br />
GROUP BY pg_class.relname, pg_class.reltuples, x.is_unique<br />
ORDER BY 2;<br />
</source><br />
<br />
== Index size/usage statistics ==<br />
<br />
Table and index sizes, along with which indexes are being scanned and how many tuples are fetched. See [[Disk Usage]] for another view that includes both table and index sizes.<br />
<br />
{{SnippetInfo|Index statistics|lang=SQL|version=>=8.1|category=Performance}}<br />
<source lang="sql"><br />
SELECT<br />
t.tablename,<br />
indexname,<br />
c.reltuples AS num_rows,<br />
pg_size_pretty(pg_relation_size(t.tablename::text)) AS table_size,<br />
pg_size_pretty(pg_relation_size(indexrelname::text)) AS index_size,<br />
CASE WHEN x.is_unique = 1 THEN 'Y'<br />
ELSE 'N'<br />
END AS unique,<br />
idx_scan AS number_of_scans,<br />
idx_tup_read AS tuples_read,<br />
idx_tup_fetch AS tuples_fetched<br />
FROM pg_tables t<br />
LEFT OUTER JOIN pg_class c ON t.tablename=c.relname<br />
LEFT OUTER JOIN<br />
(SELECT indrelid,<br />
max(CAST(indisunique AS integer)) AS is_unique<br />
FROM pg_index<br />
GROUP BY indrelid) x<br />
ON c.oid = x.indrelid<br />
LEFT OUTER JOIN<br />
( SELECT c.relname as ctablename, ipg.relname as indexname, x.indnatts as number_of_columns, idx_scan, idx_tup_read, idx_tup_fetch,indexrelname FROM pg_index x<br />
JOIN pg_class c ON c.oid = x.indrelid<br />
JOIN pg_class ipg ON ipg.oid = x.indexrelid<br />
JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid )<br />
as foo<br />
ON t.tablename = foo.ctablename<br />
WHERE t.schemaname='public'<br />
order by 1,2;<br />
</source><br />
<br />
== Index Bloat ==<br />
<br />
One of the common needs for a REINDEX is when indexes become bloated due to either sparse deletions or use of VACUUM FULL. An estimator for the amount of bloat in a table has been included in the [http://bucardo.org/wiki/Check_postgres check_postgres] script, which you can call directly or incorporate into a larger monitoring system. Scripts from other sources based on this code and/or its concepts include:<br />
* [http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html bloat view] (Dimitri Fontaine)<br />
* [http://www.pgcon.org/2009/schedule/events/153.en.html Visualizing Postgres] - index_byte_sizes view (Michael Glaesemann, myYearbook)<br />
* [http://labs.omniti.com/trac/pgtreats/browser/trunk/tools OmniTI Tasty Treats for PostgreSQL] - shell and Perl pg_bloat_report scripts<br />
<br />
== Unused Indexes ==<br />
<br />
Since indexes add significant overhead to any table change operation, they should be removed if they are not being used for either queries or constraint enforcement (such as making sure a value is unique). How to find such indexes:<br />
<br />
* [http://www.xzilla.net/blog/2008/Jul/Index-pruning-techniques.html Index pruning techniques]<br />
* [http://hype-free.blogspot.com/2008/09/finding-unused-indexes-in-postgresql.html Finding unused indexes]<br />
* [http://it.toolbox.com/blogs/database-soup/finding-useless-indexes-28796 Finding useless indexes]<br />
* [http://radek.cc/2009/09/05/psqlrc-tricks-indexes/ Missing and unused indexes]<br />
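<br />
As a quick starting point, the statistics collector tracks how often each index is scanned. A sketch of a query for candidate indexes (keep in mind that counters may have been reset recently, and that unique/constraint indexes must be kept even if never scanned):<br />
<br />
<source lang="sql"><br />
-- Indexes that have never been used in an index scan:<br />
SELECT schemaname, relname, indexrelname, idx_scan<br />
FROM pg_stat_user_indexes<br />
WHERE idx_scan = 0<br />
ORDER BY schemaname, relname;<br />
</source><br />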
<br />
== References ==<br />
<br />
* Index statistics queries from [http://www.baconandtech.com/2009/06/06/book-review-part-i-refactoring-sql-applications-with-bonus-queries/ "Refactoring SQL Applications" review]<br />
<br />
[[Category:Administration]][[Category:Performance]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Todo&diff=14508Todo2011-05-25T01:36:12Z<p>Schmiddy: /* psql */ The \dd command was missing more than just constraint comments, add links to relevant discussions</p>
<hr />
<div><div style="margin: 1ex 1em; float: right;"><br />
__TOC__<br />
</div><br />
<br />
This list contains '''all known PostgreSQL bugs and feature requests'''. If you would like to work on an item, please read the [[Developer FAQ]] first. There is also a [[Development_information|development information page]].<br />
<br />
* {{TodoPending}} - marks ordinary, incomplete items<br />
* {{TodoEasy}} - marks items that are easier to implement<br />
* {{TodoDone}} - marks changes that are done, and will appear in the PostgreSQL 9.1 release.<br />
<br />
For help on editing this list, please see [[Talk:Todo]]. <b>Please do not add items here without discussion on the mailing list.</b><br />
<br />
<div style="padding: 1ex 4em;"><br />
== Administration ==<br />
<br />
{{TodoItem<br />
|Allow administrators to cancel multi-statement idle transactions<br />
|This allows locks to be released, but it is complex to report the cancellation back to the client.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01340.php <nowiki>Cancelling idle in transaction state</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00441.php <nowiki>Re: Cancelling idle in transaction state</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Check for unreferenced table files created by transactions that were in-progress when the server terminated abruptly<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00096.php <nowiki>Removing unreferenced files</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Set proper permissions on non-system schemas during db creation<br />
|Currently all schemas are owned by the super-user because they are copied from the template1 database. However, since all objects are inherited from the template database, it is not clear that setting schemas to the db owner is correct.}}<br />
<br />
{{TodoItem<br />
|Allow log_min_messages to be specified on a per-module basis<br />
|This would allow administrators to see more detailed information from specific sections of the backend, e.g. checkpoints, autovacuum, etc. Another idea is to allow separate configuration files for each module, or allow arbitrary SET commands to be passed to them. See also [[Logging Brainstorm]].}}<br />
<br />
{{TodoItem<br />
|Simplify creation of partitioned tables<br />
|This would allow creation of partitioned tables without requiring creation of triggers or rules for INSERT/UPDATE/DELETE, and constraints for rapid partition selection. Options could include range and hash partition selection. See also [[Table partitioning]]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow auto-selection of partitioned tables for min/max() operations<br />
|There was a patch on -hackers from July 2009, but it has not been merged: [http://archives.postgresql.org/pgsql-hackers/2009-07/msg01115.php <nowiki>MIN/MAX optimization for partitioned table</nowiki>]}}<br />
<br />
{{TodoItem<br />
|Allow custom variables to appear in pg_settings()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00850.php <nowiki>Re: count(*) performance improvement ideas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have custom variables be transaction-safe<br />
* {{MessageLink|4B577E9F.8000505@dunslane.net|Custom GUCs still a bit broken}}<br />
}}<br />
<br />
{{TodoItem<br />
|Implement the SQL-standard mechanism whereby REVOKE ROLE revokes only the privilege granted by the invoking role, and not those granted by other roles<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-05/msg00010.php <nowiki>Re: Grantor name gets lost when grantor role dropped</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Improve server security options<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01875.php <nowiki>Re: [0/4] Proposal of SE-PostgreSQL patches</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00000.php <nowiki>Re: [0/4] Proposal of SE-PostgreSQL patches</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent query cancel packets from being replayed by an attacker, especially when using SSL<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg00345.php <nowiki>Replay attack of query cancel</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide a way to query the log collector subprocess to determine the name of the currently active log file<br />
* [http://archives.postgresql.org/pgsql-general/2008-11/msg00418.php <nowiki>Current log files when rotating?</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow the client to authenticate the server in a Unix-domain socket connection, e.g., using SO_PEERCRED<br />
* http://archives.postgresql.org/message-id/20090401173756.GB21229@svana.org<br />
}}<br />
<br />
{{TodoItem<br />
|Allow simpler reporting of the unix domain socket directory and allow easier configuration of its default location<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg01555.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow custom daemons to be automatically stopped/started along with the postmaster<br />
|This allows easier administration of daemons like user job schedulers or replication-related daemons.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01701.php <nowiki>Re: scheduler in core</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Increase maximum values for max_standby_streaming_delay and log_min_duration_statement<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg01517.php<br />
* Committed: http://archives.postgresql.org/pgsql-committers/2011-03/msg00210.php<br />
}}<br />
<br />
{{TodoItem<br />
|Improve logging of prepared transactions recovered during startup<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg00092.php <nowiki>&quot;recovering prepared transaction&quot; after server restart message</nowiki>]<br />
}}<br />
<br />
=== Configuration files ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemDone<br />
|Allow pg_hba.conf to specify host names along with IP addresses<br />
|Host name lookup could occur when the postmaster reads the pg_hba.conf file, or when the backend starts. Another solution would be to reverse lookup the connection IP and check that hostname against the host names in pg_hba.conf. We could also then check that the host name maps to the IP address. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00569.php <nowiki>TODO Item: Allow pg_hba.conf to specify host names along with IP addresses</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00613.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow postgresql.conf file values to be changed via an SQL API, perhaps using SET GLOBAL<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00764.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider normalizing fractions in postgresql.conf, perhaps using '%'<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00550.php <nowiki>Fractions in GUC variables</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow Kerberos to disable stripping of realms so we can check the username@realm against multiple realms<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00009.php <nowiki>krb_match_realm patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve LDAP authentication configuration options<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01745.php <nowiki>Proposed Patch - LDAPS support for servers on port 636 w/o TLS</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add external tool to auto-tune some postgresql.conf parameters<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00000.php <nowiki>Re: Overhauling GUCS</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00033.php <nowiki>Simple postgresql.conf wizard</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add 'hostgss' pg_hba.conf option to allow GSS link-level encryption<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg01454.php <nowiki>Re: Plans for 8.4</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Process pg_hba.conf keywords as case-insensitive<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00432.php <nowiki>More robust pg_hba.conf parsing/error logging</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Have pg_hba.conf consider "replication" special only in the database field<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00632.php<br />
}}<br />
<br />
{{TodoItemDone<br />
|Rename unix domain socket 'ident' connections to 'peer', to avoid confusion with TCP 'ident'<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg01053.php<br />
}}<br />
<br />
{{TodoItem<br />
|Create utility to compute accurate random_page_cost value}}<br />
<br />
{{TodoItem<br />
|Allow configuration files to be independently validated<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01831.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow postgresql.conf settings to be accepted by backends even if some settings are invalid for those backends<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00330.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00375.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow all backends to receive postgresql.conf setting changes at the same time<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00330.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00375.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Tablespaces ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow a database in tablespace t1 with tables created in tablespace t2 to be used as a template for a new database created with default tablespace t2<br />
|Currently all objects in the default database tablespace must have default tablespace specifications. This is because new databases are created by copying directories. If you mix default tablespace tables and tablespace-specified tables in the same directory, creating a new database from such a mixed directory would create a new database with tables that had incorrect explicit tablespaces. To fix this would require modifying pg_class in the newly copied database, which we don't currently do.}}<br />
<br />
{{TodoItem<br />
|Allow reporting of which objects are in which tablespaces<br />
|This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.}}<br />
<br />
{{TodoItem<br />
|Allow WAL replay of CREATE TABLESPACE to work when the directory structure on the recovery computer is different from the original}}<br />
<br />
{{TodoItem<br />
|Allow per-tablespace quotas}}<br />
<br />
{{TodoItem<br />
|Allow tablespaces on RAM-based partitions for unlogged tables<br />
* http://archives.postgresql.org/pgsql-advocacy/2011-05/msg00033.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow toast tables to be moved to a different tablespace<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00980.php<br />
}}<br />
<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Statistics Collector ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow statistics last vacuum/analyze execution times to be displayed without requiring track_counts to be enabled<br />
* [http://archives.postgresql.org/pgsql-docs/2007-04/msg00028.php <nowiki>row-level stats and last analyze time</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Clear table counters on TRUNCATE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00169.php <nowiki>Small TRUNCATE glitch</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
| Allow the clearing of cluster-level statistics<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg00917.php <nowiki>Resetting cluster-wide statistics</nowiki>]<br />
* ''pg_stat_reset_shared('bgwriter')'' (9.0) now handles the ''pg_stat_bgwriter'' subset of this<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== SSL ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow SSL authentication/encryption over unix domain sockets<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00924.php <nowiki>Re: Spoofing as the postmaster</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow SSL key file permission checks to be optionally disabled when sharing SSL keys with other applications<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00069.php <nowiki>BUG #3809: SSL &quot;unsafe&quot; private key permissions bug</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow SSL CRL files to be re-read during configuration file reload, rather than requiring a server restart<br />
|Unlike SSL CRT files, CRL (Certificate Revocation List) files are updated frequently<br />
* [http://archives.postgresql.org/pgsql-general/2008-12/msg00832.php <nowiki>Automatic CRL reload</nowiki>]<br />
Alternatively or additionally, supporting OCSP (Online Certificate Status Protocol) would provide real-time revocation discovery without reloading<br />
}}<br />
<br />
{{TodoItem<br />
| Allow automatic selection of SSL client certificates from a certificate store<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00406.php <nowiki>Allow multiple certificates or keys in the postgresql.crt/.key files</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Send the full certificate server chain to the client<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-12/msg00145.php BUG #5245: Full Server Certificate Chain Not Sent to client]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Point-In-Time Recovery (PITR) ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|Create dump tool for write-ahead logs for use in determining transaction id for point-in-time recovery<br />
|This is useful for checking PITR recovery.}}<br />
<br />
{{TodoItemDone<br />
|Allow recovery.conf to support the same syntax as postgresql.conf, including quoting<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00497.php <nowiki>recovery.conf parsing problems</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg00684.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow archive_mode to be changed without server restart?<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01655.php <nowiki>Enabling archive_mode without restart</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider avoiding WAL switching via archive_timeout if there has been no database activity<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01469.php <nowiki>archive_timeout behavior for no activity</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00395.php <nowiki>Re: archive_timeout behavior for no activity</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Expose pg_controldata via an SQL interface<br />
|Helpful for monitoring replicated databases<br />
* http://archives.postgresql.org/message-id/4B901D73.8030003@agliodbs.com<br />
* [http://archives.postgresql.org/message-id/4B959D7A.6010907@joeconway.com initial patch]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Standby server mode ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
| Allow pg_xlogfile_name() to be used in recovery mode<br />
* [http://archives.postgresql.org/message-id/3f0b79eb1001190135vd9f62f1sa7868abc1ea61d12@mail.gmail.com <nowiki>Streaming replication and pg_xlogfile_name()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Prevent variables inherited from the server environment from being used for making streaming replication connections.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01011.php <nowiki>Re: Parameter name standby_mode</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
| Add a new privilege for connecting for streaming replication<br />
* [http://archives.postgresql.org/message-id/3f0b79eb1003040247p6b092241of91784a505e9abd8@mail.gmail.com <nowiki>Streaming replication and privilege</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
| Add support for synchronous replication.<br />
}}<br />
<br />
{{TodoItemDone<br />
| Add capability to take and send a base backup over the streaming replication connection, making it possible to initialize a new standby server from a running primary server without a WAL archive or other access to the primary server. <br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00136.php<br />
}}<br />
<br />
{{TodoItem<br />
| Allow hot file system backups on standby servers<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01727.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01490.php<br />
}}<br />
<br />
{{TodoItemDone<br />
| Allow the automatic removal of old directories when streaming base backups<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00558.php<br />
}}<br />
<br />
{{TodoItem<br />
| Change walsender so that it applies per-role settings<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00642.php<br />
}}<br />
<br />
{{TodoItem<br />
| Add more control over waiting for synchronous commit<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01611.php<br />
}}<br />
<br />
{{TodoItem<br />
| Restructure configuration parameters for standby mode<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01820.php<br />
}}<br />
<br />
{{TodoItem<br />
| Allow time-delayed application of logs on the standby<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00992.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Data Types ==<br />
<br />
{{TodoItemDone<br />
|Reduce storage space for small NUMERICs<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg01331.php <nowiki>Saving space for common kinds of numeric values</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-02/msg00505.php <nowiki>Numeric patch to add special-case representations for &lt; 8 bytes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php <nowiki>Re: Reducing NUMERIC size for 8.3</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix data types where equality comparison is not intuitive, e.g. box}}<br />
<br />
{{TodoItem<br />
|Add support for public SYNONYMs<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00519.php <nowiki>Proposal for SYNONYMS</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg02043.php<br />
* http://archives.postgresql.org/pgsql-general/2010-12/msg00139.php<br />
}}<br />
<br />
{{TodoItem<br />
|Add support for SQL-standard GENERATED/IDENTITY columns<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-07/msg00543.php <nowiki>Re: Three weeks left until feature freeze</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg00038.php <nowiki>GENERATED ... AS IDENTITY, Was: Re: Feature Freeze</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00344.php <nowiki>Behavior of GENERATED columns per SQL2003</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00076.php <nowiki>Re: [HACKERS] Behavior of GENERATED columns per SQL2003</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00604.php <nowiki>IDENTITY/GENERATED patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider placing all sequences in a single table, or create a system view<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a special data type for regular expressions<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg01067.php <nowiki>Why is there a tsquery data type?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce BIT data type overhead using short varlena headers<br />
* [http://archives.postgresql.org/pgsql-general/2007-12/msg00273.php <nowiki>storage size of &quot;bit&quot; data type..</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow adding enumerated values to an existing enumerated data type<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01718.php <nowiki>Re: [COMMITTERS] pgsql: Update: &lt; * Allow adding enumerated values to an existing</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow renaming and deleting enumerated values from an existing enumerated data type<br />
}}<br />
<br />
{{TodoItem<br />
|Support scoped IPv6 addresses in the inet type<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-05/msg00111.php <nowiki>strange problem with ip6</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a JSON (JavaScript Object Notation) data type<br />
|This would behave similar to the XML data type, which is stored as text, but allows element lookup and conversion functions.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01494.php <nowiki>PATCH: Add hstore_to_json()</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg00001.php <nowiki>Re: PATCH: Add hstore_to_json()</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-03/msg01092.php <nowiki>Proposal: Add JSON support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00057.php <nowiki>Re: Proposal: Add JSON support</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00481.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01694.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider improving performance of computing CHAR() value lengths<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00900.php <nowiki>char() overhead on read-only workloads not so insignifcant as the docs claim it is...</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01787.php <nowiki>Re: [PATCH] backend: compare word-at-a-time in bcTruelen</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add overlaps geometric operators that ignore point overlaps<br />
* http://archives.postgresql.org/pgsql-hackers/2010-03/msg00861.php<br />
}}<br />
<br />
=== Domains ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow functions defined as casts to domains to be called during casting<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-05/msg00072.php <nowiki>bug? non working casts for domain</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-09/msg01681.php <nowiki>TODO: Fix CREATE CAST on DOMAINs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow values to be cast to domain types<br />
* [http://archives.postgresql.org/pgsql-hackers/2003-06/msg01206.php <nowiki>Domain casting still doesn't work right</nowiki>] <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00289.php <nowiki>domain casting?</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00812.php<br />
}}<br />
<br />
{{TodoItem<br />
|Make domains work better with polymorphic functions<br />
* [http://archives.postgresql.org/message-id/4887.1228700773@sss.pgh.pa.us Polymorphic types vs. domains]<br />
* [http://archives.postgresql.org/message-id/15535.1238774571@sss.pgh.pa.us some difficulties with fixing it]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Dates and Times ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow infinite intervals just like infinite timestamps}}<br />
<br />
{{TodoItem<br />
|Allow TIMESTAMP WITH TIME ZONE to store the original timezone information, either zone name or offset from UTC<br />
|If the TIMESTAMP value is stored with a time zone name, interval computations should adjust based on the time zone rules. <br />
* [http://archives.postgresql.org/pgsql-hackers/2004-10/msg00705.php <nowiki>timestamp with time zone a la sql99</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have timestamp subtraction not call justify_hours()?<br />
* [http://archives.postgresql.org/pgsql-sql/2006-10/msg00059.php <nowiki>timestamp subtraction (was Re: formatting intervals with to_char)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve TIMESTAMP WITH TIME ZONE subtraction to be DST-aware<br />
|Currently subtracting one date from another that crosses a daylight savings time adjustment can return '1 day 1 hour', but adding that back to the first date returns a time one hour in the future. This is caused by the adjustment of '25 hours' to '1 day 1 hour', and '1 day' is the same time the next day, even if daylight savings adjustments are involved.}}<br />
<br />
{{TodoItem<br />
|Fix interval display to support values exceeding 2^31 hours}}<br />
<br />
{{TodoItem<br />
|Add overflow checking to timestamp and interval arithmetic}}<br />
<br />
{{TodoItem<br />
|Add function to allow the creation of timestamps using parameters<br />
* http://archives.postgresql.org/pgsql-performance/2010-06/msg00232.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Arrays ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add support for arrays of domains<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00114.php <nowiki>Re: updated WIP: arrays of composites</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow single-byte header storage for array elements}}<br />
<br />
{{TodoItem<br />
|Add function to detect if an array is empty<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00475.php <nowiki>Re: array_length()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of empty arrays<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01033.php <nowiki>So what's an &quot;empty&quot; array anyway?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of NULLs in arrays<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-11/msg00009.php <nowiki>BUG #4509: array_cat's null behaviour is inconsistent</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg01040.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Binary Data ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Improve vacuum of large objects, like contrib/vacuumlo?}}<br />
<br />
{{TodoItem<br />
|Auto-delete large objects when referencing row is deleted<br />
|contrib/lo offers this functionality.}}<br />
<br />
{{TodoItem<br />
|Allow read/write into TOAST values like large objects<br />
|This requires the TOAST column to be stored EXTERNAL.}}<br />
<br />
{{TodoItem<br />
|Add API for 64-bit large object access<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00781.php <nowiki>64-bit API for large objects</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01790.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== MONEY Data Type ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add locale-aware MONEY type, and support multiple currencies<br />
* [http://archives.postgresql.org/pgsql-general/2005-08/msg01432.php <nowiki>A real currency type</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01181.php <nowiki>Money type todos?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|MONEY dumps in a locale-specific format, making it difficult to restore to a system with a different locale}}<br />
<br />
{{TodoItemDone<br />
|Allow MONEY to be easily cast to/from other numeric data types}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Text Search ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow dictionaries to change the token that is passed on to later dictionaries<br />
* [http://archives.postgresql.org/pgsql-patches/2007-11/msg00081.php <nowiki>a tsearch2 (8.2.4) dictionary that only filters out stopwords</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a function-based API for '@@' searches<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00511.php <nowiki>Simplifying Text Search</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve text search error messages<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00966.php <nowiki>Poorly designed tsearch NOTICEs</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg01146.php <nowiki>Re: Poorly designed tsearch NOTICEs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider changing error to warning for strings larger than one megabyte<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-02/msg00190.php <nowiki>BUG #3975: tsearch2 index should not bomb out of 1Mb limit</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-03/msg00062.php <nowiki>Re: [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|tsearch and tsdicts regression tests fail in Turkish locale on glibc<br />
* [http://archives.postgresql.org/message-id/49749645.5070801@gmx.net tsearch with Turkish locale]<br />
}}<br />
<br />
{{TodoItem<br />
|tsquery negator operator treated as part of lexeme<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-06/msg00346.php BUG #4887: inclusion operator (@>) on tsqeries behaves not conforming to documentation]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of plus signs in email address user names, and perhaps improve URL parsing<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00772.php<br />
}}<br />
<br />
{{TodoItem<br />
|Improve default parser, to more easily allow adding new tokens<br />
* http://archives.postgresql.org/message-id/23485.1297727826@sss.pgh.pa.us<br />
}}<br />
<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== XML ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow XML arrays to be cast to other data types<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00981.php <nowiki>proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00231.php <nowiki>Re: proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00471.php <nowiki>Re: proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add XML Schema validation and xmlvalidate functions (SQL:2008)}}<br />
<br />
{{TodoItem<br />
|Add xmlvalidatedtd variant to support validating against a DTD?}}<br />
<br />
{{TodoItem<br />
|Relax-NG validation; libxml2 supports this already}}<br />
<br />
{{TodoItem<br />
|Allow reliable XML operation with non-UTF8 server encodings (xpath(), in particular, is known to not work)<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-01/msg00135.php <nowiki>BUG #4622: xpath only work in utf-8 server encoding</nowiki>] <br />
* http://archives.postgresql.org/message-id/4110.1238973350@sss.pgh.pa.us}}<br />
<br />
{{TodoItem<br />
|Add functions from SQL:2006: XMLDOCUMENT, XMLCAST, XMLTEXT}}<br />
<br />
{{TodoItem<br />
|Add XMLNAMESPACES support in XMLELEMENT and elsewhere}}<br />
<br />
{{TodoItem<br />
|Move XSLT from contrib/xml2 to a more reasonable location<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00539.php<br />
}}<br />
<br />
{{TodoItem<br />
|Report errors returned by the XSLT library<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00562.php<br />
}}<br />
<br />
{{TodoItem<br />
|Improve the XSLT parameter passing API<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00416.php<br />
}}<br />
<br />
{{TodoItem<br />
|XML Canonical: Convert XML documents to canonical form to compare them. libxml2 has support for this.}}<br />
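As the item notes, canonicalization makes logically equivalent documents byte-comparable. Python's stdlib (3.8+) exposes the same libxml2-style C14N idea, sketched here as an illustration of the requested behavior:<br />

```python
import xml.etree.ElementTree as ET

# Two logically equivalent documents: attribute order and empty-tag
# syntax differ, so a plain string comparison fails.
a = '<doc b="2" a="1"/>'
b = '<doc a="1" b="2"></doc>'
print(a == b)  # False

# Canonical (C14N 2.0) form sorts attributes and normalizes tag
# syntax, so equivalent documents compare equal.
print(ET.canonicalize(a) == ET.canonicalize(b))  # True
```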
<br />
{{TodoItem<br />
|Add pretty-printed XML output option<br />
|Parse a document and serialize it back in some indented form. libxml2 might support this.}}<br />
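A pretty-printing option of the kind requested is sketched below with Python's stdlib minidom, which parses a document and re-serializes it indented (an analogue of what libxml2 could do server-side):<br />

```python
from xml.dom import minidom

compact = "<book><title>PostgreSQL</title><year>2011</year></book>"
# Parse the compact document and serialize it back, indented,
# one child element per line.
pretty = minidom.parseString(compact).toprettyxml(indent="  ")
print(pretty)
```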
<br />
{{TodoItem<br />
|Add XMLQUERY (from the SQL/XML standard)}}<br />
<br />
{{TodoItem<br />
|Allow XML shredding<br />
|In some cases shredding could be a better option, e.g. if there is no need to keep XML docs in their entirety, or if we have already developed tools that understand only relational data. This would be a separate module that implements the annotated schema decomposition technique, similar to DB2 and SQL Server functionality.}}<br />
<br />
{{TodoItem<br />
|Fix nested or repeated xpath() calls that apparently mess up namespaces [http://archives.postgresql.org/pgsql-bugs/2008-03/msg00097.php] [http://archives.postgresql.org/pgsql-bugs/2008-03/msg00144.php] [http://archives.postgresql.org/pgsql-general/2008-03/msg00295.php] [http://archives.postgresql.org/pgsql-bugs/2008-07/msg00054.php] [http://archives.postgresql.org/message-id/004f01c90e91$138e9d10$3aabd730$@anstett@iaas.uni-stuttgart.de]}}<br />
<br />
{{TodoItem<br />
|XPath: Adding an &lt;x&gt; element at the root causes problems [http://archives.postgresql.org/pgsql-bugs/2008-05/msg00184.php] [http://archives.postgresql.org/pgsql-bugs/2008-07/msg00054.php] [http://archives.postgresql.org/pgsql-general/2008-07/msg00613.php]}}<br />
<br />
{{TodoItem<br />
|xpath_table needs to be implemented/implementable to get rid of contrib/xml2 [http://archives.postgresql.org/pgsql-general/2008-05/msg00823.php]}}<br />
<br />
{{TodoItem<br />
|xpath_table is pretty broken anyway [http://archives.postgresql.org/pgsql-hackers/2010-02/msg02424.php]}}<br />
<br />
{{TodoItem<br />
|Better handling of XPath data types [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00616.php] [http://archives.postgresql.org/message-id/004a01c90e90$4b986d90$e2c948b0$@anstett@iaas.uni-stuttgart.de]}}<br />
<br />
{{TodoItemDone<br />
|xpath_exists() is needed<br />
|This checks whether or not the path specified exists in the XML value. Without this function we need to use the weird "array_dims(xpath(...)) IS NOT NULL" syntax.}}<br />
<br />
{{TodoItem<br />
|Improve handling of PIs and DTDs in xmlconcat() [http://archives.postgresql.org/message-id/200904211211.n3LCB09p008988@wwwmaster.postgresql.org]}}<br />
<br />
{{TodoItem<br />
|Restructure XML and /contrib/xml2 functionality<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02314.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg00017.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Functions ==<br />
<br />
{{TodoItem<br />
|Allow INET subnet comparisons using non-constants to be indexed}}<br />
<br />
{{TodoItem<br />
|Add an INET overlaps operator, for use by exclusion constraints <br />
* http://archives.postgresql.org/pgsql-hackers/2010-03/msg00845.php<br />
}}<br />
<br />
{{TodoItem<br />
|Enforce typmod for function inputs, function results and parameters for spi_prepare'd statements called from PLs<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg01403.php <nowiki>Re: BUG #2917: spi_prepare doesn't accept typename aliases</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg01160.php <nowiki>RFC for adding typmods to functions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix IS OF so it matches the ISO specification, and add documentation<br />
* [http://archives.postgresql.org/pgsql-patches/2003-08/msg00060.php <nowiki>Re: [HACKERS] IS OF</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00060.php <nowiki>ToDo: add documentation for operator IS OF</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Implement Boyer-Moore searching in LIKE queries<br />
* {{messageLink|27645.1220635769@sss.pgh.pa.us|TODO item: Implement Boyer-Moore searching (First time hacker)}}<br />
}}<br />
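The simplified Boyer-Moore-Horspool variant below (a sketch of the technique, not the patch discussed in the thread) shows the core idea: on a mismatch, skip ahead using a bad-character table instead of shifting by one:<br />

```python
def horspool_find(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text,
    or -1. On a mismatch, shift by the distance from the last
    occurrence of the window's final character to the pattern end."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Bad-character shift table; default shift is the full pattern length.
    shift = {ch: m - i - 1 for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        pos += shift.get(text[pos + m - 1], m)
    return -1

print(horspool_find("amanaplanacanalpanama", "canal"))  # 10
print(horspool_find("hello", "xyz"))  # -1
```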
<br />
{{TodoItem<br />
|Prevent malicious functions from being executed with the permissions of unsuspecting users<br />
|Index functions are safe, so VACUUM and ANALYZE are safe too. Triggers, CHECK and DEFAULT expressions, and rules are still vulnerable. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00268.php <nowiki>Some notes about the index-functions security vulnerability</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce memory usage of aggregates in set returning functions<br />
* [http://archives.postgresql.org/pgsql-performance/2008-01/msg00031.php <nowiki>Re: Performance of aggregates over set-returning functions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix /contrib/ltree operator<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-11/msg00044.php <nowiki>BUG #3720: wrong results at using ltree</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix /contrib/btree_gist's implementation of inet indexing<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-10/msg00099.php <nowiki>BUG #5705: btree_gist: Index on inet changes query result</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|<nowiki>Fix inconsistent precedence of =, &gt;, and &lt; compared to &lt;&gt;, &gt;=, and &lt;=</nowiki><br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00145.php <nowiki>BUG #3822: Nonstandard precedence for comparison operators</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix regular expression bug when using complex back-references<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00000.php <nowiki>BUG #3645: regular expression back references seem broken</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have /contrib/dblink reuse unnamed connections<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00895.php <nowiki>dblink un-named connection doesn't get re-used</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve formatting of pg_get_viewdef() output<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg01648.php <nowiki>pg_get_viewdef formattiing</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01885.php <nowiki>Re: pretty print viewdefs</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add printf()-like functionality<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00367.php <nowiki>RfD: more powerful &quot;any&quot; types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add function to dump pg_depend information cleanly<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00226.php <nowiki>Elementary dependency look-up</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve relation size functions such as pg_relation_size() to avoid producing an error when called against a no longer visible relation<br />
* [http://archives.postgresql.org/message-id/28488.1286461610@sss.pgh.pa.us pg_relation_size / could not open relation with OID #]<br />
}}<br />
<br />
=== Character Formatting ===<br />
<br />
{{TodoSubsection}}<br />
{{TodoItem<br />
|Allow to_date() and to_timestamp() to accept localized month names}}<br />
<br />
{{TodoItem<br />
|Add missing parameter handling in to_char()<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg00948.php <nowiki>Re: to_char and i18n</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Throw an error from to_char() instead of printing a string of "#" when a number doesn't fit in the desired output format.<br />
* discussed in [http://archives.postgresql.org/message-id/37ed240d0907290836w42187222n18664dfcbcb445b1@mail.gmail.com "to_char, support for EEEE format"]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow to_char() on interval values to accumulate the highest unit requested<br />
|2= Some special format flag would be required to request such accumulation. Such functionality could also be added to EXTRACT. Prevent accumulation that crosses the month/day boundary because of the uneven number of days in a month.<br />
* to_char(INTERVAL '1 hour 5 minutes', 'MI') =&gt; 65<br />
* to_char(INTERVAL '43 hours 20 minutes', 'MI' ) =&gt; 2600<br />
* to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') =&gt; 0:1:19:20<br />
* to_char(INTERVAL '3 years 5 months','MM') =&gt; 41<br />
}}<br />
<br />
{{TodoItem<br />
|Fix to_number() handling for values not matching the format string<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg01447.php <nowiki>Re: numeric_to_number() function skipping some digits</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Multi-Language Support ==<br />
<br />
{{TodoItem<br />
|Add NCHAR (as distinguished from ordinary varchar)}}<br />
<br />
{{TodoItemDone<br />
|Allow more fine-grained collation selection<br />
|Right now the collation is fixed at database creation time.<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-03/msg00932.php <nowiki>Re: Patch for collation using ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2005-08/msg00039.php <nowiki>FW: Win32 unicode vs ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2005-08/msg00309.php <nowiki>Re: FW: Win32 unicode vs ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00110.php <nowiki>Proof of concept COLLATE support with patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2005-09/msg00020.php <nowiki>For review: Initial support for COLLATE</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg01121.php <nowiki>Proposed COLLATE implementation</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-01/msg00767.php <nowiki>TODO item: locale per database patch (new iteration)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-03/msg00233.php <nowiki>Re: FW: Win32 unicode vs ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-09/msg00662.php <nowiki>Re: Fixed length data types issue</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00557.php <nowiki>[WIP] collation support revisited (phase 1)</nowiki>]<br />
* [[Todo:Collate]]<br />
* [[Todo:ICU]]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg01362.php <nowiki>WIP patch: Collation support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg00012.php <nowiki>Re: WIP patch: Collation support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00868.php <nowiki>PGDay.it collation discussion notes</nowiki>]<br />
* [http://www.unicode.org/unicode/reports/tr10/ Unicode collation algorithm]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a cares-about-collation column to pg_proc, so that unresolved-collation errors can be thrown at parse time<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01520.php <nowiki>Open issues for collations</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Integrate collations with text search configurations<br />
* [http://archives.postgresql.org/message-id/28887.1303579034@sss.pgh.pa.us <nowiki>Some TODO items for collations</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Integrate collations with to_char() and related functions<br />
* [http://archives.postgresql.org/message-id/28887.1303579034@sss.pgh.pa.us <nowiki>Some TODO items for collations</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a LOCALE option to CREATE DATABASE, as a shorthand<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00119.php <nowiki> Re: 8.4 open items list</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support multiple simultaneous character sets, per SQL:2008}}<br />
<br />
{{TodoItem<br />
|Improve UTF8 combined character handling?}}<br />
<br />
{{TodoItem<br />
|Add octet_length_server() and octet_length_client()}}<br />
<br />
{{TodoItem<br />
|Make octet_length_client() the same as octet_length()?}}<br />
<br />
{{TodoItem<br />
|Fix problems with wrong runtime encoding conversion for NLS message files}}<br />
<br />
{{TodoItem<br />
|Add URL to more complete multi-byte regression tests<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-07/msg00272.php <nowiki>Multi-byte and client side character encoding tests for copy command..</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix contrib/fuzzystrmatch to work with multibyte encodings<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-04/msg00047.php <nowiki> soundex function returns UTF-16 characters</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00138.php <nowiki> dmetaphone woes</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Set client encoding based on the client operating system encoding<br />
|Currently client_encoding is set in postgresql.conf, which defaults to the server encoding. <br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg01696.php <nowiki>Re: [GENERAL] invalid byte sequence ?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Change memory allocation for multi-byte functions so memory is allocated inside conversion functions<br />
|Currently we preallocate memory based on worst-case usage.}}<br />
<br />
{{TodoItem<br />
|Add ability to use case-insensitive regular expressions on multi-byte characters<br />
|Currently it works for UTF-8, but not other multi-byte encodings<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00433.php <nowiki>Regexps vs. locale</nowiki>]<br />
* {{MessageLink|20091201210024.B1393753FB7@cvs.postgresql.org|A partial solution for UTF-8}}<br />
}}<br />
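For comparison, case-insensitive matching over non-ASCII characters (the Cyrillic and Greek strings here are just illustrative data) works transparently in Python's Unicode-aware re module:<br />

```python
import re

# Case folding must understand the character set: the Cyrillic
# string below contains no ASCII letters at all.
pattern = re.compile("привет", re.IGNORECASE)
print(bool(pattern.fullmatch("ПРИВЕТ")))  # True

# Greek upper/lower case pairs fold correctly as well.
print(bool(re.fullmatch("ΣΟΦΙΑ", "σοφια", re.IGNORECASE)))  # True
```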
<br />
{{TodoItem<br />
|Improve encoding of connection startup messages sent to the client<br />
|Currently some authentication error messages are sent in the server encoding<br />
* [http://archives.postgresql.org/pgsql-general/2008-12/msg00801.php <nowiki>encoding of PostgreSQL messages</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-general/2009-01/msg00005.php <nowiki>Re: encoding of PostgreSQL messages</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have pg_stat_activity display query strings in the correct client encoding<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00131.php <nowiki>pg_stats queries versus per-database encodings</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|More sensible support for Unicode combining characters, normal forms<br />
* http://archives.postgresql.org/message-id/200904141532.44618.peter_e@gmx.net<br />
}}<br />
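The combining-character problem the item refers to can be sketched with the stdlib unicodedata module: a precomposed character and its base-plus-combining-accent sequence render identically but compare unequal until normalized.<br />

```python
import unicodedata

composed = "caf\u00e9"          # 'é' as a single precomposed code point
decomposed = "cafe\u0301"       # 'e' followed by a combining acute accent

print(composed == decomposed)   # False: different code point sequences
print(len(composed), len(decomposed))  # 4 5

# Normal form C recomposes; normal form D decomposes. After either,
# the two spellings compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
print(unicodedata.normalize("NFD", composed) == decomposed)  # True
```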
<br />
== Views / Rules ==<br />
<br />
{{TodoItem<br />
|Automatically create rules on views so they are updateable, per SQL:2008<br />
|We can only auto-create rules for simple views. For more complex cases users will still have to write rules manually.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00586.php <nowiki>Proposal for updatable views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-08/msg00255.php <nowiki>Updatable views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg01746.php <nowiki>Re: [COMMITTERS] pgsql: Automatic view update rules Bernd Helmle</nowiki>]<br />
* http://wiki.postgresql.org/wiki/Updatable_views<br />
}}<br />
<br />
{{TodoItem<br />
|Add the functionality of the WITH CHECK OPTION clause to CREATE VIEW}}<br />
<br />
{{TodoItem<br />
|Allow VIEW/RULE recompilation when the underlying tables change<br />
|This is both difficult and controversial.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01723.php Re: About "Allow VIEW/RULE recompilation when the underlying tables change"]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01724.php Re: About "Allow VIEW/RULE recompilation when the underlying tables change"]<br />
}}<br />
<br />
{{TodoItem<br />
|Make it possible to use RETURNING together with conditional DO INSTEAD rules, such as for partitioning setups<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00577.php <nowiki>RETURNING and DO INSTEAD ... Intentional or not?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add the ability to automatically create materialized views<br />
|Right now materialized views require the user to create triggers on the main table to keep the summary table current. SQL syntax should be able to manage the triggers and summary table automatically. A more sophisticated implementation would automatically retrieve from the summary table when the main table is referenced, if possible. See [[Materialized Views]] for implementation details<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00479.php <nowiki>GSoC - proposal - Materialized Views in PostgreSQL</nowiki>] <br />
}}<br />
<br />
{{TodoItem<br />
|Improve ability to modify views via ALTER TABLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00691.php <nowiki>Re: idea: storing view source in system catalogs</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg01410.php <nowiki>modifying views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg00300.php <nowiki>Re: patch: Add columns via CREATE OR REPLACE VIEW</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent low-cost functions from seeing unauthorized view rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01346.php <nowiki>Using views for row-level access control is leaky</nowiki>]<br />
}}<br />
<br />
== SQL Commands ==<br />
<br />
{{TodoItem<br />
|Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT}}<br />
<br />
{{TodoItem<br />
|Improve type determination of unknown (NULL or quoted literal) result columns for UNION/INTERSECT/EXCEPT<br />
* [http://archives.postgresql.org/message-id/9799.1302719551@sss.pgh.pa.us <nowiki>UNION construct type cast gives poor error message</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php <nowiki>WIP: grouping sets support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00466.php <nowiki>Implementation of GROUPING SETS (T431: Extended grouping capabilities)</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Fix TRUNCATE ... RESTART IDENTITY so its effect on sequences is rolled back on transaction abort<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00550.php <nowiki>Re: [PATCHES] TRUNCATE TABLE with IDENTITY</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow prepared transactions with temporary tables created and dropped in the same transaction, and when an ON COMMIT DELETE ROWS temporary table is accessed<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00047.php <nowiki>Re: &quot;could not open relation 1663/16384/16584: No such file or directory&quot; in a specific combination of transactions with temp tables</nowiki>]<br />
* [http://archives.postgresql.org/message-id/492543D5.9050904@enterprisedb.com A suggestion on how to implement this]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a GUC variable to warn about non-standard SQL usage in queries}}<br />
<br />
{{TodoItem<br />
|Add SQL-standard MERGE/REPLACE/UPSERT command<br />
|MERGE is typically used to merge two tables. A REPLACE or UPSERT command performs an UPDATE, or on failure, an INSERT. See [[SQL MERGE]] for notes on the implementation details.<br />
}}<br />
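The intended UPSERT behavior, update the matching row if the key exists and otherwise insert, can be sketched against a dict-keyed toy table. This is only an illustration of the semantics, not the server implementation:<br />

```python
def upsert(table: dict, key, values: dict) -> str:
    """UPDATE the row for `key` if present, else INSERT it.
    Returns which action was taken."""
    if key in table:
        table[key].update(values)   # UPDATE path
        return "updated"
    table[key] = dict(values)       # INSERT path
    return "inserted"

accounts = {1: {"name": "alice", "balance": 10}}
print(upsert(accounts, 1, {"balance": 25}))                # updated
print(upsert(accounts, 2, {"name": "bob", "balance": 5}))  # inserted
print(accounts[1]["balance"])                              # 25
```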
<br />
{{TodoItem<br />
|Add NOVICE output level for helpful messages<br />
|For example, have it warn about unjoined tables. This could also control automatic sequence/index creation messages.<br />
}}<br />
<br />
{{TodoItem<br />
|Allow NOTIFY in rules involving conditionals}}<br />
<br />
{{TodoItem<br />
|Allow EXPLAIN to identify tables that were skipped because of constraint_exclusion<br />
}}<br />
<br />
{{TodoItemDone<br />
|Enable standard_conforming_strings by default<br />
|When this is done, backslash-quote should be prohibited in non-E<nowiki>''</nowiki> strings because of possible confusion over how such strings treat backslashes. Basically, <nowiki>''</nowiki> is always safe for a literal single quote, while \' might or might not be based on the backslash handling rules.}}<br />
<br />
{{TodoItem<br />
|Simplify dropping roles that have objects in several databases}}<br />
<br />
{{TodoItem<br />
|Allow the count returned by SELECT, etc to be represented as an int64 to allow a higher range of values}}<br />
<br />
{{TodoItem<br />
|Add support for WITH RECURSIVE ... CYCLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00291.php <nowiki>WITH RECURSIVE ... CYCLE in vanilla SQL: issues with arrays of rows</nowiki>]}}<br />
<br />
{{TodoItem<br />
|Add DEFAULT .. AS OWNER so permission checks are done as the table owner<br />
|This would be useful for SERIAL nextval() calls and CHECK constraints.}}<br />
<br />
{{TodoItem<br />
|Allow DISTINCT to work in multiple-argument aggregate calls}}<br />
<br />
{{TodoItem<br />
|Add column to pg_stat_activity that shows the progress of long-running commands like CREATE INDEX and VACUUM<br />
* [http://archives.postgresql.org/pgsql-patches/2008-04/msg00203.php <nowiki>EXPLAIN progress info</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow INSERT/UPDATE/DELETE ... RETURNING in common table expressions<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00472.php <nowiki>Writeable CTEs and side effects</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add comments on system tables/columns using the information in catalogs.sgml<br />
|Ideally the information would be pulled from the SGML file automatically.}}<br />
<br />
{{TodoItem<br />
|Prevent the specification of conflicting transaction read/write options<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00684.php <nowiki>Re: SET TRANSACTION and SQL Standard</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support LATERAL subqueries<br />
|Lateral subqueries can reference columns of tables defined outside the subquery at the same level, i.e. ''laterally''.<br />
For example, a LATERAL subquery in a FROM clause could reference tables defined in the same FROM clause.<br />
Currently only the columns of tables defined ''above'' subqueries are recognized.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00292.php <nowiki>LATERAL</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00991.php <nowiki>Re: LATERAL</nowiki>]<br />
}}<br />
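A LATERAL subquery's defining trait, referencing a column of a sibling FROM item, maps naturally onto a nested loop. The sketch below (hypothetical data) computes each department's top earner the way a lateral join would:<br />

```python
# Each "subquery" over employees may reference dept, the row produced
# by the outer FROM item -- the lateral reference.
departments = [{"id": 1, "name": "eng"}, {"id": 2, "name": "sales"}]
employees = [
    {"dept_id": 1, "name": "ann", "salary": 120},
    {"dept_id": 1, "name": "bo", "salary": 90},
    {"dept_id": 2, "name": "cy", "salary": 80},
]

result = []
for dept in departments:  # outer FROM item
    # Correlated subquery: filters on the current dept row.
    top = max((e for e in employees if e["dept_id"] == dept["id"]),
              key=lambda e: e["salary"])
    result.append((dept["name"], top["name"]))

print(result)  # [('eng', 'ann'), ('sales', 'cy')]
```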
<br />
{{TodoItemDone<br />
|Add support for functional dependencies<br />
|This would allow omitting GROUP BY columns when grouping by the primary key.<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent temporary tables created with ON COMMIT DELETE ROWS from repeatedly truncating the table on every commit if the table is already empty<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg00842.php<br />
* http://archives.postgresql.org/pgsql-performance/2010-03/msg00392.php<br />
* http://archives.postgresql.org/pgsql-performance/2010-04/msg00046.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow DELETE and UPDATE to be used with LIMIT and ORDER BY<br />
* http://archives.postgresql.org/pgadmin-hackers/2010-04/msg00078.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg01997.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00021.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow finer control over the caching of prepared query plans<br />
|Currently anonymous (un-named) queries prepared via the wire protocol are replanned every time bind parameters are supplied --- allow SQL PREPARE to do the same. Also, allow control over replanning prepared queries either manually or automatically when statistics for execute parameters differ dramatically from those used during planning.<br />
* http://archives.postgresql.org/message-id/201002151911.o1FJBYh22763@momjian.us<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00597.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow PREPARE of cursors}}<br />
<br />
{{TodoItem<br />
|Have DISCARD PLANS discard plans cached by functions<br />
|DISCARD all should do the same.<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00431.php<br />
}}<br />
<br />
=== CREATE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow CREATE TABLE AS to determine column lengths for complex expressions like SELECT col1 || col2}}<br />
<br />
{{TodoItem<br />
|Have WITH CONSTRAINTS also create constraint indexes<br />
* [http://archives.postgresql.org/pgsql-patches/2007-04/msg00149.php <nowiki>Re: CREATE TABLE LIKE INCLUDING INDEXES support</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Move NOT NULL constraint information to pg_constraint<br />
|Currently NOT NULL constraints are stored in pg_attribute without any designation of their origins, e.g. primary keys. One manifest problem is that dropping a PRIMARY KEY constraint does not remove the NOT NULL constraint designation. Another issue is that we should probably force NOT NULL to be propagated from parent tables to children, just as CHECK constraints are. (But then does dropping PRIMARY KEY affect children?)<br />
* http://archives.postgresql.org/message-id/19768.1238680878@sss.pgh.pa.us<br />
* http://archives.postgresql.org/message-id/200909181005.n8IA5Ris061239@wwwmaster.postgresql.org<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent concurrent CREATE TABLE from sometimes returning a cryptic error message<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00169.php <nowiki>BUG #3692: Conflicting create table statements throw unexpected error</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow CREATE TABLE to optionally create a table if it does not already exist, without throwing an error<br />
|The fact that tables contain data makes this more complex than other CREATE OR REPLACE operations.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg01300.php <nowiki>Add column if not exists (CINE)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add CREATE SCHEMA ... LIKE that copies a schema}}<br />
<br />
{{TodoItem<br />
|Fix CREATE OR REPLACE FUNCTION to not leave objects depending on the function in inconsistent state<br />
* [http://archives.postgresql.org/pgsql-general/2008-08/msg00985.php indexes on functions and create or replace function]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow temporary tables to exist as empty by default in all sessions<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00006.php <nowiki>what is difference between LOCAL and GLOBAL TEMP TABLES in PostgreSQL</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg01329.php <nowiki>idea: global temp tables</nowiki>]<br />
* [http://archives.postgresql.org//pgsql-hackers/2009-05/msg00016.php <nowiki>Re: idea: global temp tables</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg01098.php <nowiki>global temporary tables</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow the creation of "distinct" types<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01647.php <nowiki>Distinct types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider analyzing temporary tables when they are first used in a query<br />
|Autovacuum cannot analyze or vacuum temporary tables.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00416.php <nowiki>autovacuum and temp tables support</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow an unlogged table to be changed to logged<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00315.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== UPDATE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|<nowiki>Allow UPDATE tab SET ROW (col, ...) = (SELECT...)</nowiki><br />
* [http://archives.postgresql.org/pgsql-hackers/2006-07/msg01308.php <nowiki>Re: [PATCHES] extension for sql update</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00865.php <nowiki>UPDATE using sub selects</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-04/msg00315.php <nowiki>UPDATE using sub selects</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-03/msg00237.php <nowiki>Re: UPDATE using sub selects</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Research self-referential UPDATEs that see inconsistent row versions in read-committed mode<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00507.php <nowiki>Concurrently updating an updatable view</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00016.php <nowiki>Re: Do we need a TODO? (was Re: Concurrently updating anupdatable view)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve performance of EvalPlanQual mechanism that rechecks already-updated rows<br />
|This is related to the previous item, which questions whether it even has the right semantics.<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-09/msg00045.php <nowiki>BUG #4401: concurrent updates to a table blocks one update indefinitely</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-07/msg00302.php <nowiki>BUG #4945: Parallel update(s) gone wild</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ALTER ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have ALTER TABLE RENAME of a SERIAL column rename the sequence<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have ALTER SEQUENCE RENAME rename the sequence name stored in the sequence table<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-09/msg00092.php <nowiki>BUG #3619: Renaming sequence does not update its 'sequence_name' field</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00007.php <nowiki>Re: BUG #3619: Renaming sequence does not update its 'sequence_name' field</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Allow ALTER TABLE ... ALTER CONSTRAINT ... RENAME or ALTER TABLE RENAME CONSTRAINT<br />
* [http://archives.postgresql.org/pgsql-patches/2006-02/msg00168.php <nowiki>ALTER CONSTRAINT RENAME patch reverted</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ALTER DOMAIN to modify the underlying data type}}<br />
<br />
{{TodoItemDone<br />
|Allow ALTER TABLE to change constraint deferrability}}<br />
<br />
{{TodoItemDone<br />
|Add missing object types for ALTER ... SET SCHEMA}}<br />
<br />
{{TodoItem<br />
|Allow ALTER TABLESPACE to move the tablespace to different directories}}<br />
<br />
{{TodoItem<br />
|Allow moving system tables to other tablespaces, where possible<br />
|Currently non-global system tables must be in the default database tablespace. Global system tables can never be moved.}}<br />
<br />
{{TodoItem<br />
|Have ALTER INDEX update the name of a constraint using that index}}<br />
<br />
{{TodoItem<br />
|Allow column display reordering by recording a display, storage, and permanent id for every column?<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00782.php <nowiki>Re: column ordering, was Re: [PATCHES] Enums patch v2</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg01029.php <nowiki>Column reordering in pg_dump</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow an existing index to be marked as a table's primary key<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00500.php <nowiki>Setting a pre-existing index as a primary key</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00642.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00265.php<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow ALTER TYPE on composite types to perform operations similar to ALTER TABLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00245.php <nowiki>ALTER composite type does not work, but ALTER TABLE which ROWTYPE is used as a type - works fine</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Don't require table rewrite on ALTER TABLE ... ALTER COLUMN TYPE, when the old and new data types are binary compatible<br />
* http://archives.postgresql.org/message-id/200903040137.n241bAUV035002@wwwmaster.postgresql.org<br />
* [http://archives.postgresql.org/pgsql-patches/2006-10/msg00154.php <nowiki>Eliminating phase 3 requirement for varlen increases via ALTER COLUMN</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg02360.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow deactivating (and reactivating) indexes via ALTER TABLE<br />
|{{messageLink|<87hbegz5ir.fsf@cbbrowne.afilias-int.info>|In discussion on FK activation/deactivation}} <br />
}}<br />
<br />
{{TodoItemDone<br />
|Reduce locking required for ALTER commands<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg00533.php <nowiki>ALTER TABLE SET STATISTICS requires AccessExclusiveLock</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01083.php <nowiki>Re: ALTER TABLE SET STATISTICS requires AccessExclusiveLock</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2008-10/msg01248.php<br />
* http://archives.postgresql.org/pgsql-hackers/2008-10/msg00242.php<br />
}}<br />
<br />
{{TodoItemDone<br />
|Fix removal of NULL constraints in inherited tables<br />
* http://archives.postgresql.org/pgsql-hackers/2010-06/msg00919.php <br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01773.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00329.php<br />
}}<br />
<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== CLUSTER ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Automatically maintain clustering on a table<br />
|This might require some background daemon to maintain clustering during periods of low usage. It might also require tables to be only partially filled for easier reorganization. Another idea would be to create a merged heap/index data file so an index lookup would automatically access the heap data too. A third idea would be to store heap rows in hashed groups, perhaps using a user-supplied hash function.<br />
* [http://archives.postgresql.org/pgsql-performance/2004-08/msg00350.php <nowiki>Equivalent praxis to CLUSTERED INDEX?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00155.php <nowiki>Re: Grouped Index Tuples</nowiki>]<br />
* http://community.enterprisedb.com/git/<br />
* [http://archives.postgresql.org/pgsql-performance/2009-10/msg00346.php <nowiki>Re: maintain_cluster_order_v5.patch</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Improve CLUSTER performance by sorting to reduce random I/O<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php <nowiki>Our CLUSTER implementation is pessimal</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Make CLUSTER VERBOSE more verbose.<br />
|It is also used by the new VACUUM FULL VERBOSE.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== COPY ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow COPY to report error lines and continue<br />
|This requires the use of a savepoint before each COPY line is processed, with ROLLBACK on COPY failure. <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00572.php <nowiki>Re: VLDB Features</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY on a newly-created table to skip WAL logging<br />
|On crash recovery, the table involved in the COPY would be removed or have its heap and index files truncated. One issue is that no other backend should be able to add to the table at the same time, which is something that is currently allowed. This currently is done if the table is created inside the same transaction block as the COPY because no other backends can see the table.}}<br />
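<br />
The existing same-transaction case looks like this (the file path is illustrative; the WAL skip also requires that WAL archiving not be enabled):<br />
<br />
```sql
BEGIN;
CREATE TABLE big_load (a integer, b text);
-- No other backend can see the new table yet, so this COPY can already
-- skip WAL logging; the item asks to extend that beyond this case.
COPY big_load FROM '/tmp/data.csv' WITH (FORMAT csv);
COMMIT;
```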
<br />
{{TodoItem<br />
|Allow COPY FROM to create index entries in bulk<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00811.php <nowiki>Batch update of indexes on data loading</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY in CSV mode to control whether a quoted zero-length string is treated as NULL<br />
|Currently this is always treated as a zero-length string, which generates an error when loading into an integer column <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00905.php <nowiki>Re: [PATCHES] allow CSV quote in NULL</nowiki>]<br />
}}<br />
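<br />
A minimal illustration of the current behavior:<br />
<br />
```sql
CREATE TABLE nums (n integer);
COPY nums FROM STDIN WITH (FORMAT csv);
-- Feeding the single data line  ""  (a quoted zero-length string) fails:
--   ERROR:  invalid input syntax for integer: ""
```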
<br />
{{TodoItem<br />
|Improve COPY performance<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00954.php <nowiki>Re: 8.3 / 8.2.6 restore comparison</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01882.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY to report errors sooner<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01169.php <nowiki>Timely reporting of COPY errors</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY to handle other number formats<br />
|For example, the German notation, which uses ',' as the decimal separator. The best approach would be something like WITH DECIMAL ','.<br />
}}<br />
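<br />
A sketch of the proposed syntax; this is not implemented, and the option name is only the one suggested above:<br />
<br />
```sql
-- Hypothetical option: '123,45' in the input file would be read as 123.45.
COPY prices FROM '/tmp/preise.csv' WITH (FORMAT csv, DECIMAL ',');
```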
<br />
{{TodoItem<br />
|Allow a stalled COPY to exit if the backend is terminated<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-04/msg00067.php <nowiki>Re: possible bug not in open items</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== GRANT/REVOKE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow SERIAL sequences to inherit permissions from the base table?}}<br />
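<br />
The inconvenience motivating this item, with illustrative role and table names:<br />
<br />
```sql
CREATE TABLE orders (id serial PRIMARY KEY, note text);
GRANT INSERT ON orders TO app_user;
-- INSERTs by app_user still fail with "permission denied for sequence"
-- until the implicitly-created sequence is granted separately:
GRANT USAGE ON SEQUENCE orders_id_seq TO app_user;
```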
<br />
{{TodoItem<br />
|Allow dropping of a role that has connection rights<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00736.php <nowiki>DROP ROLE dependency tracking ...</nowiki>]<br />
}}<br />
{{TodoEndSubsection}}<br />
<br />
=== DECLARE CURSOR ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Prevent DROP TABLE from dropping a table referenced by its own open cursor?}}<br />
<br />
{{TodoItem<br />
|Provide some guarantees about the behavior of cursors that invoke volatile functions<br />
* [http://archives.postgresql.org/message-id/20997.1244563664@sss.pgh.pa.us Re: Cursor with hold emits the same row more than once across commits in 8.3.7]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== INSERT ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow INSERT/UPDATE of the system-generated oid value for a row}}<br />
<br />
{{TodoItem<br />
|In rules, allow VALUES() to contain a mixture of 'old' and 'new' references}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== SHOW/SET ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM ANALYZE, and CLUSTER}}<br />
<br />
{{TodoItem<br />
|Rationalize the discrepancy between settings that accept values in bytes and SHOW, which returns the object count<br />
* [http://archives.postgresql.org/pgsql-docs/2008-07/msg00007.php <nowiki>Re: [ADMIN] shared_buffers and shmmax</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ANALYZE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage}}<br />
<br />
{{TodoItem<br />
|Have EXPLAIN ANALYZE report rows as floating-point numbers<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg01363.php <nowiki>explain analyze rows=%.0f</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00108.php <nowiki>Re: explain analyze rows=%.0f</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve how ANALYZE computes in-doubt tuples<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00771.php <nowiki>VACUUM/ANALYZE counting of in-doubt tuples</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Window Functions ===<br />
See {{messageLink|357.1230492361@sss.pgh.pa.us|TODO items for window functions}}.<br />
{{TodoSubsection}}<br />
{{TodoItem<br />
|Support creation of user-defined window functions<br />
|We have the ability to create new window functions written in C. Is it worth the effort to create an API that would let them be written in PL/pgSQL, etc.?}}<br />
<br />
{{TodoItem<br />
|Implement full support for window framing clauses<br />
|In addition to the already-implemented clauses described in the [http://developer.postgresql.org/pgdocs/postgres/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS latest doc], the following clauses are not implemented yet.<br />
* RANGE BETWEEN ... PRECEDING/FOLLOWING<br />
* EXCLUDE<br />
}}<br />
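<br />
For example, a value-based frame of the kind covered by the first bullet:<br />
<br />
```sql
-- RANGE with an explicit offset is among the unimplemented framing clauses:
SELECT x,
       sum(x) OVER (ORDER BY x
                    RANGE BETWEEN 5 PRECEDING AND 5 FOLLOWING)
FROM t;
```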
<br />
{{TodoItem<br />
|Investigate tuplestore performance issues<br />
|The tuplestore_in_memory() thing is just a band-aid; we ought to try to solve it properly. tuplestore_advance seems like a weak spot as well.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00152.php <nowiki>tuplestore potential performance problem</nowiki>]<br />
}}<br />
<br />
{{TodoItem|Do we really need so much duplicated code between Agg and WindowAgg?}}<br />
<br />
{{TodoItem<br />
|Teach planner to evaluate multiple windows in the optimal order<br />
|Currently windows are always evaluated in the query-specified order.<br />
* http://archives.postgresql.org/message-id/3CDAD71E9D70417290FCF66F0178D1E1@amd64<br />
}}<br />
<br />
{{TodoItem<br />
|Implement DISTINCT clause in window aggregates<br />
|Some proprietary RDBMSs have implemented it already, so it helps with porting from those.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Integrity Constraints ==<br />
=== Keys ===<br />
<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Improve deferrable unique constraints for cases with many conflicts<br />
|The current implementation fires a trigger for each potentially conflicting row. This might not scale well for an update that changes many key values at once.<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Referential Integrity ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add MATCH PARTIAL referential integrity}}<br />
<br />
{{TodoItem<br />
|Change foreign key constraint for array -&gt; element to mean element in array?<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-10/msg01814.php <nowiki>foreign keys for array/period contains relationships</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix problem when cascading referential triggers make changes on cascaded tables, seeing the tables in an intermediate state<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00174.php <nowiki>Re: [PATCHES] Work-in-progress referential action trigger timing</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Optimize referential integrity checks<br />
* [http://archives.postgresql.org/pgsql-performance/2005-10/msg00458.php <nowiki>Re: Effects of cascading references in foreign keys</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00744.php <nowiki>Can't ri_KeysEqual() consider two nulls as equal?</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Server-Side Languages ==<br />
<br />
{{TodoItem<br />
|Add support for polymorphic arguments and return types to languages other than PL/PgSQL}}<br />
<br />
{{TodoItem<br />
|Add support for OUT and INOUT parameters to languages other than PL/PgSQL}}<br />
<br />
{{TodoItem<br />
|Add more fine-grained specification of functions taking arbitrary data types<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00367.php <nowiki>RfD: more powerful &quot;any&quot; types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Implement stored procedures<br />
|This might involve the control of transaction state and the return of multiple result sets<br />
* [http://archives.postgresql.org/pgsql-general/2008-10/msg00454.php <nowiki>PL/pgSQL stored procedure returning multiple result sets (SELECTs)?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01375.php <nowiki>Proposal: real procedures again (8.4)</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00542.php<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg01149.php <nowiki>Gathering specs and discussion on feature (post 9.1)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow holdable cursors in SPI}}<br />
<br />
{{TodoItemEasy<br />
|Add SPI_gettypmod() to return a field's typemod from a TupleDesc<br />
* http://archives.postgresql.org/pgsql-hackers/2005-11/msg00250.php<br />
}}<br />
<br />
=== SQL-Language Functions ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow SQL-language functions to reference parameters by parameter name<br />
|Currently SQL-language functions can only refer to dollar parameters, e.g. $1<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01479.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01519.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00221.php<br />
}}<br />
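<br />
A sketch of the limitation; the second form is the wished-for syntax:<br />
<br />
```sql
-- Works today: the body must use positional references.
CREATE FUNCTION add_one(x integer) RETURNS integer
    AS 'SELECT $1 + 1' LANGUAGE sql;

-- Wished for (not accepted today): referring to the parameter by name.
-- CREATE FUNCTION add_one(x integer) RETURNS integer
--     AS 'SELECT x + 1' LANGUAGE sql;
```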
<br />
{{TodoItem<br />
|Rethink query plan caching and timing of parse analysis within SQL-language functions<br />
|They should work more like plpgsql functions do ...<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-05/msg00078.php <nowiki>Re: BUG #6019: invalid cached plan on inherited table</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/pgSQL ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]}}<br />
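<br />
A PL/pgSQL sketch; tab and col are the placeholder names from the item:<br />
<br />
```sql
CREATE FUNCTION f() RETURNS void AS $$
DECLARE
    one_val  tab.col%TYPE;    -- supported today
    many_val tab.col%TYPE[];  -- requested: an array of the column's type
BEGIN
    -- ...
END;
$$ LANGUAGE plpgsql;
```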
<br />
{{TodoItem<br />
|<nowiki>Allow listing of record column names, and access to record columns via variables, e.g. columns := r.(*), tval2 := r.(colname)</nowiki><br />
* [http://archives.postgresql.org/pgsql-patches/2005-07/msg00458.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-05/msg00302.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00031.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow row and record variables to be set to NULL constants, and allow NULL tests on such variables<br />
|Because a row is not scalar, do not allow assignment from NULL-valued scalars.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00070.php <nowiki>NULL and plpgsql rows</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider keeping separate cached copies when search_path changes<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg01009.php <nowiki>pl/pgsql Plan Invalidation and search_path</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of NULL row values vs. NULL rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg01758.php <nowiki>Null row vs. row of nulls in plpgsql</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg01973.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Perl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow regex operations in plperl using UTF8 characters in non-UTF8 encoded databases}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Python ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemDone<br />
|Add table function support}}<br />
<br />
{{TodoItemDone<br />
|Add tracebacks<br />
* [http://archives.postgresql.org/pgsql-patches/2006-02/msg00288.php <nowiki>Re: plpython tracebacks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Develop a trusted variant of PL/Python.}}<br />
<br />
{{TodoItem<br />
|Create a new restricted execution class that will allow passing function arguments in as locals. Passing them as globals means functions cannot be called recursively.<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-02/msg01468.php <nowiki>Re: pl/python do not delete function arguments</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve documentation}}<br />
<br />
{{TodoItem<br />
|Add a DB-API compliant interface on top of the SPI interface}}<br />
<br />
{{TodoItem<br />
|Improve behaviour of exception functions and types<br />
* http://archives.postgresql.org/pgsql-docs/2010-11/msg00022.php<br />
* http://archives.postgresql.org/pgsql-docs/2010-11/msg00031.php<br />
}}<br />
<br />
{{TodoItem<br />
|For functions returning a setof record with a composite type, cache the I/O functions for the composite type<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg02007.php<br />
}}<br />
<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Tcl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add table function support}}<br />
<br />
{{TodoItem<br />
|Check encoding validity of values passed back to Postgres in function returns, trigger tuple changes, and SPI calls.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Clients ==<br />
<br />
{{TodoItem<br />
|Add a function like pg_get_indexdef() that reports more detailed index information<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00166.php <nowiki>BUG #3829: Wrong index reporting from pgAdmin III (v1.8.0 rev 6766-6767)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Split out pg_resetxlog output into pre- and post-sections<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg02040.php<br />
}}<br />
<br />
=== pg_ctl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow pg_ctl to work properly with configuration files located outside the PGDATA directory<br />
|pg_ctl cannot read the pid file because it is located in the PGDATA directory rather than the configuration directory. The solution is to allow pg_ctl to read and understand postgresql.conf to find the data_directory value.<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-10/msg00024.php <nowiki>BUG #5103: &quot;pg_ctl -w (re)start&quot; fails with custom unix_socket_directory</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Modify pg_ctl behavior and exit codes to make it easier to write an LSB conforming init script<br />
|It may be desirable to condition some of the changes on a command-line switch, to avoid breaking existing scripts. A Linux shell (sh) script is referenced which has been tested and seems to provide a high degree of conformance in multiple environments. Study of this script might suggest areas where pg_ctl could be modified to make writing an LSB conforming script easier; however, some aspects of that script would be unnecessary with other suggested changes to pg_ctl, and discussion on the lists did not reach consensus on support for all aspects of this script. Further discussion of particular changes is needed before beginning any work.<br />
* [[Lsb_conforming_init_script|LSB conforming init script]]<br />
These threads should be studied for other ideas on improvements:<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01390.php <nowiki>We should Axe /contrib/start-scripts</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01843.php <nowiki>Linux LSB init script</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00008.php <nowiki>Re: Linux LSB init script</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== psql ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have psql \ds show all sequences and their settings<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00916.php <nowiki>Re: TODO item: Have psql show current values for a sequence</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00401.php <nowiki>Quick patch: Display sequence owner</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have \d on a sequence indicate if the sequence is owned by a table}}<br />
<br />
{{TodoItem<br />
|Move psql backslash database information into the backend, use mnemonic commands?<br />
|This would allow non-psql clients to pull the same information out of the database as psql. <br />
* [http://archives.postgresql.org/pgsql-hackers/2004-01/msg00191.php <nowiki>Re: psql \d option list overloaded</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Make psql's \d commands more consistent in their handling of schemas<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00014.php <nowiki>Re: psql and schemas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consistently display privilege information for all objects in psql}}<br />
<br />
{{TodoItem<br />
|Add &quot;auto&quot; expanded mode that outputs in expanded format if &quot;wrapped&quot; mode can't wrap the output to the screen width<br />
|Consider using auto-expanded mode for backslash commands like \df+.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00417.php <nowiki>Re: psql wrapped format default for backslash-d commands</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg01638.php<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent tab completion of SET TRANSACTION from querying the database, which keeps the transaction isolation level from being set.<br />
|Currently SET &lt;tab&gt; causes a database lookup to check all supported session variables. This query causes problems because setting the transaction isolation level must be the first statement of a transaction.}}<br />
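<br />
The failure mode, sketched:<br />
<br />
```sql
BEGIN;
-- Pressing tab after "SET TRA" makes psql run a catalog query, which
-- becomes the transaction's first statement, so this then fails:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ERROR:  SET TRANSACTION ISOLATION LEVEL must be called before any query
```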
<br />
{{TodoItem<br />
|Add a \set variable to control whether \s displays line numbers<br />
|Another option is to add \# which lists line numbers, and allows command execution.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00255.php <nowiki>Re: psql possible TODO</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have \d show child tables that inherit from the specified parent}}<br />
<br />
{{TodoItem<br />
|Include the symbolic SQLSTATE name in verbose error reports<br />
* [http://archives.postgresql.org/pgsql-general/2007-09/msg00438.php <nowiki>Re: Checking is TSearch2 query is valid</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add prompt escape to display the client and server versions<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00310.php <nowiki>WIP patch for TODO Item: Add prompt escape to display the client and server versions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add option to wrap column values at whitespace boundaries, rather than chopping them at a fixed width.<br />
|Currently, &quot;wrapped&quot; format chops values into fixed widths. Perhaps the word wrapping could use the same algorithm documented in the W3C specification. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00404.php <nowiki>Re: psql wrapped format default for backslash-d commands</nowiki>]<br />
* http://www.w3.org/TR/CSS21/tables.html#auto-table-layout}}<br />
<br />
{{TodoItem<br />
|Support the ReST table output format<br />
|Details about the ReST format: http://docutils.sourceforge.net/rst.html#reference-documentation<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg01007.php <nowiki>Proposal: new border setting in psql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00518.php <nowiki>Re: Proposal: new border setting in psql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00609.php <nowiki>Re: Proposal: new border setting in psql</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add option to print advice for people familiar with other databases<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01845.php <nowiki>MySQL-ism help patch for psql</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Consider showing TOAST and index sizes in \dt+<br />
* [http://archives.postgresql.org/pgsql-general/2010-01/msg00912.php <nowiki>\dt+ sizes don't include TOAST data</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00485.php <nowiki>Re: psql \dt and table size</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|\dd is missing comments for several types of objects<br />
|Comments are not handled at all for some object types, and are handled by both \dd and the individual backslash command for others. Consider a system view like pg_comments to manage this mess.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00436.php <nowiki>Re: More robust pg_hba.conf parsing/error logging</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-general/2009-09/msg00199.php <nowiki>comment on constraint</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-09/msg01080.php <nowiki>pg_comments</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-05/msg00885.php <nowiki>patch: Allow \dd to show constraint comments</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ability to edit views with \ev<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00023.php <nowiki>Adding \ev view editor?</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add \dL to show languages<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-07/msg00915.php <nowiki>Re: [PATCH] Psql List Languages</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Distinguish between unique indexes and unique constraints in \d+<br />
* http://archives.postgresql.org/message-id/8780.1271187360@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Fix FETCH_COUNT to handle SELECT ... INTO and WITH queries<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01565.php<br />
* http://archives.postgresql.org/pgsql-bugs/2010-05/msg00192.php<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent psql from sending remaining single-line multi-statement queries after reconnecting<br />
* http://archives.postgresql.org/pgsql-bugs/2010-05/msg00159.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01283.php<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Add \i option to bring in the specified file as a quoted literal. This would be useful for creating functions and other areas. Details still need to be worked out.<br />
* http://archives.postgresql.org/pgsql-bugs/2011-02/msg00016.php<br />
* http://archives.postgresql.org/pgsql-bugs/2011-02/msg00020.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having psql -c read .psqlrc, for consistency<br />
|psql -f already reads .psqlrc<br />
}}<br />
<br />
{{TodoItem<br />
|Allow processing of multiple -f (file) options<br />
}}<br />
<br />
{{TodoItem<br />
|Improve line drawing characters<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00386.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== pg_dump / pg_restore ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|<nowiki>Add full object name to the tag field. eg. for operators we need '=(integer, integer)', instead of just '='.</nowiki>}}<br />
<br />
{{TodoItem<br />
|Add pg_dumpall custom format dumps?<br />
* [http://archives.postgresql.org/pgsql-general/2010-05/msg00509.php pg_dumpall custom format]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid using platform-dependent locale names in pg_dumpall output<br />
|Using native locale names puts roadblocks in the way of porting a dump to another platform. One possible solution is to get<br />
CREATE DATABASE to accept some agreed-on set of locale names and fix them up to meet the platform's requirements.<br />
* http://archives.postgresql.org/message-id/21396.1241716688@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Allow selection of individual object(s) of all types, not just tables}}<br />
<br />
{{TodoItem<br />
|In a selective dump, allow dumping of an object and all its dependencies}}<br />
<br />
{{TodoItem<br />
|Add options like pg_restore -l and -L to pg_dump}}<br />
<br />
{{TodoItem<br />
|Add support for multiple pg_restore -t options, like pg_dump<br />
|pg_restore's -t switch is less useful than pg_dump's in quite a few ways: no multiple switches, no pattern matching, no ability to pick up indexes and other dependent items for a selected table. It should be made to handle this switch just like pg_dump does.}}<br />
<br />
{{TodoItem<br />
|Stop dumping CASCADE on DROP TYPE commands in clean mode}}<br />
<br />
{{TodoItem<br />
|Allow pg_dump --clean to drop roles that own objects or have privileges<br />
|tgl says: if this is about pg_dumpall, it's done as of 8.4. If it's really about pg_dump, what does it mean? pg_dump has no business dropping roles.}}<br />
<br />
{{TodoItem<br />
|Remove unnecessary function pointer abstractions in pg_dump source code}}<br />
<br />
{{TodoItem<br />
|Allow pg_dump to utilize multiple CPUs and I/O channels by dumping multiple objects simultaneously<br />
|The difficulty with this is getting multiple dump processes to produce a single dump output file. It also would require several sessions to share the same snapshot. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php <nowiki>pg_dump additional options for performance</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00135.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00040.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg02454.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow pg_restore to load different parts of the COPY data for a single table simultaneously}}<br />
<br />
{{TodoItem<br />
|Remove support for dumping from pre-7.3 servers<br />
|In 7.3 and later, we can get accurate dependency information from the server. pg_dump still contains a lot of crufty code<br />
to try to deal with the lack of dependency info in older servers, but the usefulness of maintaining that code is diminishing.}}<br />
<br />
{{TodoItem<br />
|Allow pre/data/post files when schema and data are dumped separately, for performance reasons<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php <nowiki>pg_dump additional options for performance</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00185.php <nowiki>Re: pg_dump additional options for performance</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00821.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00135.php<br />
}}<br />
<br />
{{TodoItem<br />
|Refactor handling of database attributes between pg_dump and pg_dumpall<br />
|Currently only pg_dumpall emits database attributes, such as ALTER DATABASE SET commands and database-level GRANTs.<br />
Many people wish that pg_dump would do that. One proposal is to let pg_dump issue such commands if the -C switch was used,<br />
but it's unclear whether that will satisfy the demand.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg01031.php <nowiki>ALTER DATABASE vs pg_dump</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-05/msg00010.php summary of the issues]<br />
}}<br />
<br />
{{TodoItem<br />
|Change pg_dump so that a comment on the dumped database is applied to the loaded database, even if the database has a different name.<br />
|This will require new backend syntax, perhaps COMMENT ON CURRENT DATABASE. This is related to the previous item.}}<br />
<br />
{{TodoItem<br />
|Allow parallel restore of tar dumps<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-02/msg01154.php <nowiki>Re: parallel restore</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow pg_dumpall to output restorable ALTER USER/DATABASE SET settings<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00916.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00394.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02359.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ecpg ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Docs<br />
|Document differences between ecpg and the SQL standard and information about the Informix-compatibility module.}}<br />
<br />
{{TodoItem<br />
|Solve cardinality &gt; 1 for input descriptors / variables?}}<br />
<br />
{{TodoItem<br />
|Add a semantic check level, e.g. check if a table really exists}}<br />
<br />
{{TodoItem<br />
|Fix handling of DB attributes that are arrays}}<br />
<br />
{{TodoItem<br />
|Fix nested C comments}}<br />
<br />
{{TodoItemEasy<br />
|sqlwarn[6] should be 'W' if the PRECISION or SCALE value is specified}}<br />
<br />
{{TodoItem<br />
|Make SET CONNECTION thread-aware, non-standard?}}<br />
<br />
{{TodoItem<br />
|Allow multidimensional arrays}}<br />
<br />
{{TodoItem<br />
|Implement COPY FROM STDIN}} <br />
<br />
{{TodoItem<br />
|Provide a way to specify size of a bytea parameter<br />
* [http://archives.postgresql.org/message-id/200906192131.n5JLVoMo044178@wwwmaster.postgresql.org <nowiki>BUG #4866: ECPG and BYTEA</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Fix small memory leaks in ecpg<br />
|Memory leaks in a short-running application like ecpg are not really a problem, but they make debugging more complicated}} <br />
<br />
{{TodoItem<br />
|Allow reuse of cursor name variables<br />
* [http://archives.postgresql.org/message-id/20100329113435.GA3430@feivel.credativ.lan <nowiki>Problems with variable cursorname in ecpg</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== libpq ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Prevent PQfnumber() from lowercasing unquoted column names<br />
|PQfnumber() should never have been doing lowercasing, but historically it has, so we need a way to prevent it}}<br />
<br />
{{TodoItem<br />
|Allow statement results to be automatically batched to the client<br />
|Currently all statement results are transferred to the libpq client before libpq makes the results available to the application. This feature would allow the application to make use of the first result rows while the rest are transferred, or held on the server waiting for them to be requested by libpq. One complexity is that a statement like SELECT 1/col could error out mid-way through the result set.}}<br />
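<br />
To make the tradeoff concrete, here is a minimal Python sketch (hypothetical names, not libpq code) of batched versus fully-materialized result delivery, including the mid-stream error hazard noted above:<br />
<br />
```python
def fetch_all(rows):
    # Current behavior: materialize the entire result set before
    # the application sees any row.
    return list(rows)

def fetch_batched(rows, batch_size):
    # Proposed behavior: hand rows to the application in batches
    # while the remainder is still in transit or on the server.
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

def risky_rows():
    # Stand-in for a query like SELECT 1/col that errors mid-way
    # through the result set.
    for col in (1, 2, 0, 4):
        yield 1 // col  # raises ZeroDivisionError on the third row
```
<br />
The point of risky_rows is that by the time the error surfaces, the application has already consumed the first batch; that is exactly the complication a batching API would have to define behavior for.<br />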
<br />
{{TodoItem<br />
|Consider disallowing multiple queries in PQexec() as an additional barrier to SQL injection attacks<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg00184.php <nowiki>Re: InitPostgres and flatfiles question</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add PQexecf() that allows complex parameter substitution<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01803.php <nowiki>Last minute mini-proposal (I know, know) for PQexecf()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add SQLSTATE and severity to errors generated within libpq itself<br />
* [http://archives.postgresql.org/pgsql-interfaces/2007-11/msg00015.php <nowiki>v8.1: Error severity on libpq PGconn*</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01425.php<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add code to detect client encoding and locale from the operating system environment<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg01040.php <nowiki>Determining client_encoding from client locale</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add support for interface/ipaddress binding to libpq<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01811.php <nowiki>SR/libpq - outbound interface/ipaddress binding</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Triggers ==<br />
<br />
{{TodoItem<br />
|Improve storage of deferred trigger queue<br />
|Right now all deferred trigger information is stored in backend memory. This could exhaust memory for very large trigger queues. This item involves dumping large queues into files, or processing the triggers with some kind of join, a bulk operation, or a bitmap. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00876.php <nowiki>Re: BUG #4204: COPY to table with FK has memory leak</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00464.php <nowiki>Scaling up deferred unique checks and the after trigger queue</nowiki>]<br />
}}<br />
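<br />
As an illustration of the "dump large queues into files" option, here is a minimal Python sketch (hypothetical, not backend code) of a queue that spills its oldest entries to a temporary file once a memory cap is exceeded:<br />
<br />
```python
import pickle
import tempfile
from collections import deque

class SpillQueue:
    """Keep at most max_in_memory events in RAM; spill older
    entries to a temporary file (a sketch of the 'dump large
    queues into files' idea, not the actual trigger queue)."""
    def __init__(self, max_in_memory=1000):
        self.max_in_memory = max_in_memory
        self.mem = deque()
        self.spill = None  # file of pickled events, oldest first

    def push(self, event):
        self.mem.append(event)
        if len(self.mem) > self.max_in_memory:
            if self.spill is None:
                self.spill = tempfile.TemporaryFile()
            pickle.dump(self.mem.popleft(), self.spill)

    def drain(self):
        # Fire spilled (oldest) events first, then in-memory ones,
        # preserving the original queue order.
        if self.spill is not None:
            self.spill.seek(0)
            while True:
                try:
                    yield pickle.load(self.spill)
                except EOFError:
                    break
            self.spill.close()
            self.spill = None
        while self.mem:
            yield self.mem.popleft()
```
<br />
The real problem is harder, since the queued events must still honor deferred-constraint semantics, but the memory-cap structure is the core of this proposal.<br />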
<br />
{{TodoItem<br />
|Allow triggers to be disabled in only the current session.<br />
|This is currently possible by starting a multi-statement transaction, modifying the system tables, performing the desired SQL, restoring the system tables, and committing the transaction. ALTER TABLE ... TRIGGER requires a table lock, so it is not ideal for this usage.}}<br />
<br />
{{TodoItem<br />
|With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY<br />
|If the dump is known to be valid, allow foreign keys to be added without revalidating the data.}}<br />
<br />
{{TodoItem<br />
|Allow statement-level triggers to access modified rows}}<br />
<br />
{{TodoItem<br />
|When statement-level triggers are defined on a parent table, have them fire only on the parent table, and fire child table triggers only where appropriate<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg01883.php <nowiki>Statement-level triggers and inheritance</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow AFTER triggers on system tables<br />
|System tables are modified in many places in the backend without going through the executor and therefore not causing triggers to fire. To complete this item, the functions that modify system tables will have to fire triggers.<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01665.php<br />
* http://wiki.postgresql.org/wiki/DDL_Triggers<br />
}}<br />
<br />
{{TodoItem<br />
|Tighten trigger permission checks<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00564.php <nowiki>Security leak with trigger functions?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow BEFORE INSERT triggers on views<br />
* [http://archives.postgresql.org/pgsql-general/2007-02/msg01466.php <nowiki>Re: Why can't I put a BEFORE EACH ROW trigger on a view?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add database and transaction-level triggers<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00451.php <nowiki>Proposal for db level triggers</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00620.php <nowiki>triggers on prepare, commit, rollback... ?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce locking requirements for creating a trigger<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00635.php <nowiki>Re: Change lock requirements for adding a trigger</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid requirement for "AFTER" trigger functions to return a value<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02384.php<br />
}}<br />
<br />
== Inheritance ==<br />
<br />
{{TodoItem<br />
|Allow inherited tables to inherit indexes, UNIQUE constraints, and primary/foreign keys<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-05/msg00285.php <nowiki>Partitioning/inherited tables vs FKs</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00039.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00305.php<br />
}}<br />
<br />
{{TodoItem<br />
|Honor UNIQUE INDEX on base column in INSERTs/UPDATEs on inherited table, e.g. INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail<br />
|The main difficulty with this item is the problem of creating an index that can span multiple tables.}}<br />
<br />
{{TodoItem<br />
|Determine whether ALTER TABLE / SET SCHEMA should work on inheritance hierarchies (and thus support ONLY). If yes, implement it.}}<br />
<br />
{{TodoItem<br />
|ALTER TABLE variants sometimes support recursion and sometimes not, but this is poorly/not documented, and the ONLY marker would then be silently ignored. Clarify the documentation, and reject ONLY if it is not supported.}}<br />
<br />
== Indexes ==<br />
<br />
{{TodoItem<br />
|Prevent index uniqueness checks when UPDATE does not modify the column<br />
|Uniqueness (index) checks are done when updating a column even if the column is not modified by the UPDATE.<br />
However, HOT already short-circuits this in common cases, so more work might not be helpful.}}<br />
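<br />
The short-circuit amounts to comparing the indexed columns of the old and new row versions; roughly (illustrative Python, hypothetical names):<br />
<br />
```python
def needs_unique_check(indexed_cols, old_row, new_row):
    # Skip the uniqueness probe when the UPDATE leaves every
    # indexed column unchanged -- the same condition HOT already
    # exploits in common cases.
    return any(old_row[c] != new_row[c] for c in indexed_cols)
```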
<br />
{{TodoItem<br />
|Allow the creation of on-disk bitmap indexes which can be quickly combined with other bitmap indexes<br />
|Such indexes could be more compact if there are only a few distinct values. Such indexes can also be compressed. Keeping such indexes updated can be costly.<br />
* [http://archives.postgresql.org/pgsql-patches/2005-07/msg00512.php <nowiki>Re: Bitmap index AM</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg01107.php <nowiki>Bitmap index thoughts</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00265.php <nowiki>Stream bitmaps</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01214.php <nowiki>Re: Bitmapscan changes - Requesting further feedback</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00013.php <nowiki>Updated bitmap index patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00741.php <nowiki>Reviewing new index types (was Re: [PATCHES] Updated bitmap indexpatch)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01023.php <nowiki>Bitmap Indexes: request for feedback</nowiki>]<br />
* http://archives.postgresql.org/message-id/800923.27831.qm@web29010.mail.ird.yahoo.com <br />
}}<br />
<br />
{{TodoItem<br />
|Allow accurate statistics to be collected on indexes with more than one column or expression indexes, perhaps using per-index statistics<br />
* [http://archives.postgresql.org/pgsql-performance/2006-10/msg00222.php <nowiki>Re: Simple join optimized badly?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01131.php <nowiki>Stats for multi-column indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00741.php <nowiki>Cross-column statistics revisited</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg01431.php <nowiki>Multi-Dimensional Histograms</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00913.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg02179.php <br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00459.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02054.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg01731.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having a larger statistics target for indexed columns and expression indexes. <br />
}}<br />
<br />
{{TodoItem<br />
|Consider smaller indexes that record a range of values per heap page, rather than having one index entry for every heap row<br />
|This is useful if the heap is clustered by the indexed values. <br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00341.php <nowiki>Grouped Index Tuples</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg01264.php <nowiki>Grouped Index Tuples</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00465.php <nowiki>Grouped Index Tuples / Clustered Indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-03/msg00163.php <nowiki>Bitmapscan changes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00014.php <nowiki>Re: GIT patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00487.php <nowiki>Re: Index Tuple Compression Approach?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01589.php <nowiki>Re: Index AM change proposals, redux</nowiki>]<br />
}}<br />
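<br />
The range-per-page idea can be sketched as one (min, max) summary per heap page, which is compact when the heap is clustered on the key (illustrative Python, hypothetical names):<br />
<br />
```python
def build_range_index(heap_pages):
    # One (min, max) summary per heap page instead of one index
    # entry per row.
    return [(min(rows), max(rows)) for rows in heap_pages]

def candidate_pages(range_index, value):
    # Pages whose summary range could contain the value; only
    # these need to be scanned for matches.
    return [p for p, (lo, hi) in enumerate(range_index)
            if lo <= value <= hi]
```
<br />
If the heap is not clustered, every page's range tends to cover the whole key space and the index degenerates to a full scan, which is the main caveat of this approach.<br />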
<br />
{{TodoItem<br />
|Add REINDEX CONCURRENTLY, like CREATE INDEX CONCURRENTLY<br />
|This is difficult because you must upgrade to an exclusive table lock to replace the existing index file. CREATE INDEX CONCURRENTLY does not have this complication. This would allow index compaction without downtime. <br />
* [http://archives.postgresql.org/pgsql-performance/2007-08/msg00289.php <nowiki>Re: When/if to Reindex</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow multiple indexes to be created concurrently, ideally via a single heap scan<br />
|pg_restore allows parallel index builds, but it is done via subprocesses, and there is no SQL interface for this.<br />
}}<br />
<br />
{{TodoItem<br />
|Consider sorting entries before inserting into btree index<br />
* [http://archives.postgresql.org/pgsql-general/2008-01/msg01010.php <nowiki>Re: ATTN: Clodaldo was Performance problem. Could it be related to 8.3-beta4?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow index scans to return matching index keys, not just the matching heap locations<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01657.php <nowiki>Re: Is this TODO item done?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01477.php <nowiki>Index-only quals</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow creation of an index that can do comparisons to test if a value is between two column values<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00757.php <nowiki>Proposal: temporal extension &quot;period&quot; data type</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider using "effective_io_concurrency" for index scans<br />
* Currently only bitmap scans use this, which might be fine because most multi-row index scans use bitmap scans.<br />
}}<br />
<br />
=== GIST ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add more GIST index support for geometric data types}}<br />
<br />
{{TodoItem<br />
|Allow GIST indexes to create certain complex index types, like digital trees (see Aoki)}}<br />
<br />
{{TodoItem<br />
|Fix performance issues in contrib/seg and contrib/cube GiST support<br />
* [http://archives.postgresql.org/message-id/alpine.DEB.2.00.0904161633160.4053@aragorn.flymine.org GiST index performance]<br />
* [http://archives.postgresql.org/message-id/alpine.DEB.2.00.0904221704470.22330@aragorn.flymine.org draft patch]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-05/msg00069.php <nowiki>Re: GiST index performance</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-06/msg00068.php <nowiki>GiST index performance</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== GIN ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemDone<br />
|Support empty indexed values (such as zero-element arrays) properly<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00237.php contrib/intarray vs empty arrays]<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-05/msg00118.php BUG #4806: Bug with GiST index and empty integer array]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Behave correctly for cases where some elements of an indexed value are NULL<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg01003.php <nowiki>GIN versus zero-key queries</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Support queries that require a full scan<br />
* [http://archives.postgresql.org/pgsql-general/2009-05/msg00402.php Issue report]<br />
* [http://archives.postgresql.org/pgsql-general/2007-06/msg01132.php Older issue report]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-10/msg00521.php Still another complaint]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg01581.php Previous partial fix]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Improve GIN's handling of NULL array values<br />
* http://archives.postgresql.org/pgsql-bugs/2010-12/msg00032.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Hash ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add UNIQUE capability to hash indexes}}<br />
<br />
{{TodoItem<br />
|Add hash WAL logging for crash recovery}}<br />
<br />
{{TodoItem<br />
|Allow multi-column hash indexes}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Sorting ==<br />
<br />
{{TodoItem<br />
|Consider whether duplicate keys should be sorted by block/offset<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00558.php <nowiki>Remove hacks for old bad qsort() implementations?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider being smarter about memory and external files used during sorts<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg01101.php <nowiki>Sorting Improvements for 8.4</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00045.php <nowiki>Re: Sorting Improvements for 8.4</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider detoasting keys before sorting}}<br />
<br />
{{TodoItem<br />
|Allow sorts to use more available memory<br />
* http://archives.postgresql.org/pgsql-hackers/2007-11/msg01026.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01123.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg01957.php<br />
}}<br />
<br />
== Fsync ==<br />
<br />
{{TodoItem<br />
|Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options and whether fsync does anything<br />
|Ideally this requires a separate test program like /contrib/pg_test_fsync that can be run at initdb time or optionally later.}}<br />
<br />
{{TodoItem<br />
|Consider sorting writes during checkpoint<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00541.php <nowiki>Sorted writes in checkpoint</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00050.php <nowiki>Re: Sorting writes during checkpoint</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg02012.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg00278.php<br />
}}<br />
<br />
== Cache Usage ==<br />
<br />
{{TodoItem<br />
|Speed up COUNT(*)<br />
|We could use a fixed row count and a +/- count to follow MVCC visibility rules, or a single cached value could be used and invalidated if anyone modifies the table. Another idea is to get a count directly from a unique index, but for this to be faster than a sequential scan it must avoid access to the heap to obtain tuple visibility information.}}<br />
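<br />
The "fixed row count and a +/- count" idea can be sketched as follows (an illustrative Python model with hypothetical names; it ignores the real MVCC visibility rules):<br />
<br />
```python
class CountEstimate:
    """A committed base count plus per-transaction deltas that
    are folded into the base on commit."""
    def __init__(self):
        self.base = 0      # count visible to all transactions
        self.deltas = {}   # xid -> net inserts minus deletes

    def insert(self, xid, n=1):
        self.deltas[xid] = self.deltas.get(xid, 0) + n

    def delete(self, xid, n=1):
        self.deltas[xid] = self.deltas.get(xid, 0) - n

    def commit(self, xid):
        self.base += self.deltas.pop(xid, 0)

    def abort(self, xid):
        self.deltas.pop(xid, None)

    def count_seen_by(self, xid):
        # A transaction sees the committed base plus its own
        # uncommitted delta, but not other transactions' deltas.
        return self.base + self.deltas.get(xid, 0)
```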
<br />
{{TodoItem<br />
|Provide a way to calculate an &quot;estimated COUNT(*)&quot;<br />
|Perhaps by using the optimizer's cardinality estimates or random sampling.<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php <nowiki>Re: Improving count(*)</nowiki>]<br />
}}<br />
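<br />
A minimal sketch of the random-sampling approach, assuming per-page row counts are available (hypothetical Python, not an actual planner API):<br />
<br />
```python
import random

def estimated_count(pages, sample_size, seed=0):
    """Estimate the total row count by sampling pages and scaling:
    mean rows per sampled page times total pages. 'pages' is a
    list of per-page row counts."""
    rng = random.Random(seed)
    sample = rng.sample(pages, min(sample_size, len(pages)))
    return round(sum(sample) / len(sample) * len(pages))
```
<br />
The estimate is exact for uniformly filled pages and degrades gracefully with skew, which is the usual tradeoff for a sampled count.<br />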
<br />
{{TodoItem<br />
|Allow data to be pulled directly from indexes<br />
|Currently indexes do not have enough tuple visibility information to allow data to be pulled from the index without also accessing the heap. The idea is to use the visibility map used for vacuum to avoid heap lookups on pages where all tuples are visible.<br />
* [http://wiki.postgresql.org/wiki/Index-only_scans Index-Only Scans wiki]<br />
}}<br />
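<br />
The visibility-map shortcut can be modeled roughly like this (illustrative Python; all names are hypothetical):<br />
<br />
```python
def index_only_scan(index_entries, all_visible, heap_visible):
    """index_entries: (key, page) pairs from the index.
    all_visible: set of pages where every tuple is known visible
    (the visibility map). heap_visible: per-(key, page) fallback
    check simulating a heap fetch. Returns the visible keys and
    how many heap fetches were needed."""
    result, heap_fetches = [], 0
    for key, page in index_entries:
        if page in all_visible:
            result.append(key)   # answered from the index alone
        else:
            heap_fetches += 1    # must consult the heap
            if heap_visible(key, page):
                result.append(key)
    return result, heap_fetches
```
<br />
The win is proportional to the fraction of heap pages the visibility map marks all-visible, which is why vacuum-maintained maps make this feasible.<br />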
<br />
{{TodoItem<br />
|Consider automatic caching of statements at various levels:<br />
* Parsed query tree<br />
* Query execute plan<br />
* Query results <br />
<br />
:<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00823.php <nowiki>Cached Query Plans (was: global prepared statements)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider increasing internal areas (NUM_CLOG_BUFFERS) when shared buffers is increased<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-10/msg01419.php <nowiki>Re: slru.c race condition (was Re: TRAP: FailedAssertion(&quot;!((itemid)-&gt;lp_flags &amp; 0x01)&quot;,)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00030.php <nowiki>clog_buffers to 64 in 8.3?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2007-08/msg00024.php <nowiki>CLOG Patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider decreasing the amount of memory used by PrivateRefCount<br />
|<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg00797.php <nowiki>PrivateRefCount (for 8.3)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg00752.php <nowiki>Re: PrivateRefCount (for 8.3)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider allowing higher priority queries to have referenced buffer cache pages stay in memory longer<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00562.php <nowiki>Re: How to keep a table in memory?</nowiki>]<br />
}}<br />
<br />
== Vacuum ==<br />
<br />
{{TodoItem<br />
|Auto-fill the free space map by scanning the buffer cache or by checking pages written by the background writer<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-02/msg01125.php <nowiki>Dead Space Map</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00011.php <nowiki>Re: Automatic free space map filling</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow concurrent inserts to use recently created pages rather than creating new ones<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg00853.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having single-page pruning update the visibility map<br />
* <nowiki>https://commitfest.postgresql.org/action/patch_view?id=75</nowiki><br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg02344.php <nowiki>Re: visibility maps and heap_prune</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve tracking of total relation tuple counts now that vacuum doesn't always scan the whole heap<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00531.php Partial vacuum versus pg_class.reltuples]<br />
}}<br />
<br />
{{TodoItem<br />
|Bias FSM towards returning free space near the beginning of the heap file, in hopes that empty pages at the end can be truncated by VACUUM<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg01124.php <nowiki>FSM search modes</nowiki>]<br />
}}<br />
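<br />
A sketch of the biased search (illustrative Python, not the actual FSM code):<br />
<br />
```python
def pick_page(free_space, needed):
    """free_space: free bytes per heap page, indexed by page
    number. Return the lowest-numbered page with enough room, so
    space at the end of the file stays unused and VACUUM can
    truncate the trailing pages."""
    for page, free in enumerate(free_space):
        if free >= needed:
            return page
    return None  # no fit: extend the relation instead
```
<br />
A pure lowest-first policy can create contention on the early pages, which is why the proposal frames this as a bias or search mode rather than a strict rule.<br />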
<br />
{{TodoItem<br />
|Consider a more compact data representation for dead tuple locations within VACUUM<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00143.php <nowiki>Re: Have vacuum emit a warning when it runs out of maintenance_work_mem</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide more information in order to improve user-side estimates of dead space bloat in relations<br />
* [http://archives.postgresql.org/pgsql-general/2009-05/msg01039.php <nowiki>Re: Bloated Table</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve locking behaviour of vacuum during trailing page truncation<br />
* http://archives.postgresql.org/pgsql-bugs/2011-03/msg00319.php<br />
* http://archives.postgresql.org/message-id/4D8DF88E.7080205@Yahoo.com<br />
}}<br />
<br />
=== Auto-vacuum ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|Issue log message to suggest VACUUM FULL if a table is nearly empty?}}<br />
<br />
{{TodoItem<br />
|Prevent long-lived temporary tables from causing frozen-xid advancement starvation<br />
|The problem is that autovacuum cannot vacuum them to set frozen xids; only the session that created them can do that. <br />
* [http://archives.postgresql.org/pgsql-general/2007-06/msg01645.php <nowiki>Re: AutoVacuum Behaviour Question</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent autovacuum from running if an old transaction is still running from the last vacuum<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00899.php <nowiki>Re: Autovacuum and OldestXmin</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have autoanalyze of parent tables occur when child tables are modified<br />
* http://archives.postgresql.org/pgsql-performance/2010-06/msg00137.php<br />
* http://archives.postgresql.org/pgsql-performance/2010-10/msg00271.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Locking ==<br />
<br />
{{TodoItem<br />
|Fix priority ordering of read and write light-weight locks<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00893.php <nowiki>lwlocks and starvation</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00905.php <nowiki>Re: lwlocks and starvation</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix problem when multiple subtransactions of the same outer transaction hold different types of locks, and one subtransaction aborts<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg01011.php <nowiki>FOR SHARE vs FOR UPDATE locks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00001.php <nowiki>Re: FOR SHARE vs FOR UPDATE locks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00435.php <nowiki>Re: [PATCHES] [pgsql-patches] Phantom Command IDs, updated patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00773.php <nowiki>Re: savepoints and upgrading locks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow UPDATEs on only non-referential integrity columns not to conflict with referential integrity locks<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00073.php <nowiki>Referential Integrity and SHARE locks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add idle_in_transaction_timeout GUC so locks are not held for long periods of time}}<br />
<br />
{{TodoItem<br />
|Improve deadlock detection when a page cleaning lock conflicts with a shared buffer that is pinned<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-01/msg00138.php <nowiki>BUG #3883: Autovacuum deadlock with truncate?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00873.php <nowiki>Thoughts about bug #3883</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-committers/2008-01/msg00365.php <nowiki>Re: pgsql: Add checks to TRUNCATE, CLUSTER, and REINDEX to prevent</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Detect deadlocks involving LockBufferForCleanup()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00873.php <nowiki>Thoughts about bug #3883</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow finer control over who is cancelled in a deadlock<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01727.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a lock timeout parameter<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00485.php <nowiki>SELECT ... FOR UPDATE [WAIT integer | NOWAIT] for 8.5</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Consider improving serialized transaction behavior to avoid anomalies<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00217.php <nowiki>Serializable Isolation without blocking</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg01136.php <nowiki>User-facing aspects of serializable transactions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00035.php <nowiki>Re: User-facing aspects of serializable transactions</nowiki>]<br />
}}<br />
<br />
== Startup Time Improvements ==<br />
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for backend creation<br />
|This would prevent the overhead associated with process creation. Most operating systems have trivial process creation time compared to database startup overhead, but a few operating systems (Win32, Solaris) might benefit from threading. Also explore the idea of a single session using multiple threads to execute a statement faster.}}<br />
<br />
{{TodoItem<br />
|Allow backends to change their database without restart<br />
|This allows for faster server startup.<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00843.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00336.php<br />
}}<br />
<br />
== Write-Ahead Log ==<br />
<br />
{{TodoItem<br />
|Eliminate need to write full pages to WAL before page modification<br />
|Currently, to protect against partial disk page writes, we write full page images to WAL before they are modified so we can correct any partial page writes during recovery. These pages can also be eliminated from point-in-time archive files. <br />
* [http://archives.postgresql.org/pgsql-hackers/2002-06/msg00655.php <nowiki>Re: Index Scans become Seq Scans after VACUUM ANALYSE</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|When full page writes are off, write CRC to WAL and check file system blocks on recovery<br />
|If CRC check fails during recovery, remember the page in case a later CRC for that page properly matches.}}<br />
<br />
{{TodoItem<br />
|Write full pages during file system write and not when the page is modified in the buffer cache<br />
|This would allow most full-page writes to happen in the background writer. It might cause problems when applying WAL during recovery to a partially-written page, but the full page will later be replaced from WAL.}}<br />
<br />
{{TodoItem<br />
|Reduce WAL traffic so only modified values are written rather than entire rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01589.php <nowiki>Reduction in WAL for UPDATEs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow WAL information to recover corrupted pg_controldata<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php <nowiki>Re: [HACKERS] pg_resetxlog -r flag</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Find a way to reduce rotational delay when repeatedly writing last WAL page<br />
|Currently, an fsync of WAL requires the disk platter to complete a full rotation before the next fsync can succeed. One idea is to write the WAL to different offsets, which might reduce the rotational delay. <br />
* [http://archives.postgresql.org/pgsql-hackers/2002-11/msg00483.php <nowiki>500 tpsQL + WAL log implementation</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow WAL logging to be turned off for a table, but the table might be dropped or truncated during crash recovery<br />
|Allow tables to bypass WAL writes and just fsync() dirty pages on commit. This should be implemented using ALTER TABLE, e.g. <nowiki>ALTER TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ]</nowiki>. Tables using non-default logging should not use referential integrity with default-logging tables. A table without dirty buffers during a crash could perhaps avoid the drop/truncate. <br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg01016.php <nowiki>Re: [Bizgres-general] WAL bypass for INSERT, UPDATE and</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Speed WAL recovery by allowing more than one page to be prefetched<br />
|This should be done utilizing the same infrastructure used for prefetching in general to avoid introducing complex error-prone code in WAL replay. <br />
* [http://archives.postgresql.org/pgsql-general/2007-12/msg00683.php <nowiki>Slow PITR restore</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00497.php <nowiki>Re: [GENERAL] Slow PITR restore</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg01279.php <nowiki>Read-ahead and parallelism in redo recovery</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve WAL concurrency by increasing lock granularity<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00556.php <nowiki>Reworking WAL locking</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Be more aggressive about creating WAL files<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01325.php <nowiki>Re: PANIC caused by open_sync on Linux</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-07/msg01075.php <nowiki>PreallocXlogFiles</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-04/msg00556.php <nowiki>WAL/PITR additional items</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have resource managers report the duration of their status changes<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01468.php <nowiki>Recovery of Multi-stage WAL actions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Move pgfoundry's xlogdump to /contrib and have it rely more closely on the WAL backend code<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00035.php <nowiki>xlogdump</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Close deleted WAL files held open in *nix by long-lived read-only backends<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg01754.php <nowiki>Deleted WAL files held open by backends in Linux</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00060.php <nowiki>Re: Deleted WAL files held open by backends in Linux</nowiki>]<br />
}}<br />
<br />
== Optimizer / Executor ==<br />
<br />
{{TodoItem<br />
|Improve selectivity functions for geometric operators}}<br />
<br />
{{TodoItem<br />
|Consider increasing the default values of from_collapse_limit, join_collapse_limit, and/or geqo_threshold<br />
* [http://archives.postgresql.org/message-id/4136ffa0905210551u22eeb31bn5655dbe7c9a3aed5@mail.gmail.com from_collapse_limit vs. geqo_threshold]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve ability to display optimizer analysis using OPTIMIZER_DEBUG}}<br />
<br />
{{TodoItem<br />
|Log statements where the optimizer row estimates were dramatically different from the number of rows actually found?}}<br />
<br />
{{TodoItem<br />
|Consider compressed annealing to search for query plans<br />
|This might replace GEQO.<br />
* http://archives.postgresql.org/message-id/15658.1241278636%40sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Improve use of expression indexes for ORDER BY <br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01553.php <nowiki>Resjunk sort columns, Heikki's index-only quals patch, and bug #5000</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Modify the planner to better estimate caching effects<br />
* http://archives.postgresql.org/pgsql-performance/2010-11/msg00117.php<br />
}}<br />
<br />
=== Hashing ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Consider using a hash for joining to a large IN (VALUES ...) list<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00450.php <nowiki>Planning large IN lists</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow single batch hash joins to preserve outer pathkeys<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg00806.php Re: Potential Join Performance Issue]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|"lazy" hash tables - look up only the tuples that are actually requested<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid building the same hash table more than once during the same query<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid hashing for distinct and then re-hashing for hash join<br />
* [http://archives.postgresql.org/message-id/4136ffa0902191346g62081081v8607f0b92c206f0a@mail.gmail.com Re: Fixing Grittner's planner issues]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow hashing to be used on arrays, if the element type is hashable<br />
* http://archives.postgresql.org/message-id/11087.1244905821@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Background Writer ==<br />
<br />
{{TodoItem<br />
|Consider having the background writer update the transaction status hint bits before writing out the page<br />
|Implementing this requires the background writer to have access to system catalogs and the transaction status log.}}<br />
<br />
{{TodoItem<br />
|Consider adding buffers the background writer finds reusable to the free list <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00781.php <nowiki>Background LRU Writer/free list</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Automatically tune bgwriter_delay based on activity rather than using a fixed interval<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00781.php <nowiki>Background LRU Writer/free list</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider whether increasing BM_MAX_USAGE_COUNT improves performance<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg01007.php <nowiki>Bgwriter LRU cleaning: we've been going at this all wrong</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Test to see if calling PreallocXlogFiles() from the background writer will help with WAL segment creation latency<br />
* [http://archives.postgresql.org/pgsql-patches/2007-06/msg00340.php <nowiki>Re: Load Distributed Checkpoints, final patch</nowiki>]<br />
}}<br />
<br />
== Concurrent Use of Resources ==<br />
<br />
{{TodoItem<br />
|Do async I/O for faster random read-ahead of data<br />
|Async I/O allows multiple I/O requests to be sent to the disk with results coming back asynchronously.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00820.php <nowiki>Asynchronous I/O Support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2007-09/msg00255.php <nowiki>Re: random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00027.php <nowiki>There's random access and then there's random access</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-01/msg00170.php <nowiki>Bitmap index scan preread using posix_fadvise (Was: There's random access and then there's random access)</nowiki>]<br />
The above patch is already applied as of 8.4, but it still remains to figure out how to handle plain indexscans effectively.<br />
* [http://archives.postgresql.org//pgsql-hackers/2009-01/msg00806.php Problems with the patch submitted for posix_fadvise in index scans]<br />
}}<br />
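The readahead hint discussed above can be sketched in a few lines. This is an illustration, not PostgreSQL code: it uses Python's os.posix_fadvise wrapper (available on Linux) to issue the same POSIX_FADV_WILLNEED hint the bitmap index scan preread patch uses, against a scratch file of four hypothetical 8kB pages.<br />

```python
import os
import tempfile

PAGE = 8192

# Create a scratch file holding four 8kB "pages".
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * PAGE * 4)
    path = f.name

fd = os.open(path, os.O_RDONLY)
try:
    # Hint that pages 2-3 will be needed soon; the call returns
    # immediately and the kernel starts read-ahead asynchronously.
    os.posix_fadvise(fd, PAGE * 2, PAGE * 2, os.POSIX_FADV_WILLNEED)
    os.lseek(fd, PAGE * 2, os.SEEK_SET)
    data = os.read(fd, PAGE)
finally:
    os.close(fd)
    os.unlink(path)
```

The point is that the advise call does not block: the backend can keep working while the kernel fetches the blocks it will read shortly.<br />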
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for better I/O utilization<br />
|This would allow a single query to make use of multiple I/O channels simultaneously. One idea is to create a background reader that can pre-fetch sequential and index scan pages needed by other backends. This could be expanded to allow concurrent reads from multiple devices in a partitioned table.<br />
* http://archives.postgresql.org/pgsql-performance/2011-02/msg00123.php<br />
}}<br />
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for better CPU utilization<br />
|This would allow several CPUs to be used for a single query, such as for sorting or query execution.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00945.php <nowiki>Multi CPU Queries - Feedback and/or suggestions wanted!</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|SMP scalability improvements<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00439.php <nowiki>Straightforward changes for increased SMP scalability</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00206.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00361.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
}}<br />
<br />
== TOAST ==<br />
<br />
{{TodoItem<br />
|Allow user configuration of TOAST thresholds<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00213.php <nowiki>Re: Proposed adjustments in MaxTupleSize and toastthresholds</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00082.php <nowiki>pg_lzcompress strategy parameters</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce unnecessary cases of deTOASTing<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00895.php <nowiki>Re: [PATCHES] Eliminate more detoast copies for packed varlenas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce costs of repeat de-TOASTing of values<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg01096.php <nowiki>WIP patch: reducing overhead for repeat de-TOASTing</nowiki>]<br />
}}<br />
<br />
== Miscellaneous Performance ==<br />
<br />
{{TodoItem<br />
|Use mmap() rather than SYSV for shared buffers?<br />
|This would remove the requirement for SYSV SHM but would introduce portability issues. Anonymous mmap (or mmap to /dev/zero) is required to prevent I/O overhead. We could also consider mmap() for writing WAL.<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00750.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00756.php<br />
}}<br />
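The anonymous-mmap variant the item mentions can be shown conceptually with Python's mmap module, which maps anonymous memory when given a file descriptor of -1 (the moral equivalent of C's mmap with MAP_ANONYMOUS). The 16-page size is an arbitrary illustration.<br />

```python
import mmap

PAGE = 8192

# Anonymous mapping (fd == -1): memory not backed by any file,
# so no disk I/O stands behind it, unlike mapping a real file.
buf = mmap.mmap(-1, PAGE * 16)

# The region behaves like a small shared-buffer arena: bytes are
# read and written in place, and fork()ed children would share it.
buf[:8] = b"PG_PAGE0"
header = bytes(buf[:8])
buf.close()
```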
<br />
{{TodoItem<br />
|Rather than mmap()-ing individual 8k pages, consider mmap()-ing entire files into a backend?<br />
|Doing I/O to large tables would consume a lot of address space or require frequent mapping/unmapping. Extending the file also causes mapping problems that might require mapping only individual pages, leading to thousands of mappings. Another problem is that there is no way to _prevent_ I/O to disk from the dirty shared buffers so changes could hit disk before WAL is written.<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01239.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider ways of storing rows more compactly on disk:<br />
* Reduce the row header size?<br />
* Consider reducing on-disk varlena length from four bytes to two because a heap row cannot be more than 64k in length}}<br />
<br />
{{TodoItem<br />
|Consider transaction start/end performance improvements<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00948.php <nowiki>Reducing Transaction Start/End Contention</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00361.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow configuration of backend priorities via the operating system<br />
|Though backend priorities make priority inversion during lock waits possible, research shows that this is not a huge problem.<br />
* [http://archives.postgresql.org/pgsql-general/2007-02/msg00493.php <nowiki>Priorities for users or queries?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider increasing the minimum allowed number of shared buffers<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-02/msg00157.php <nowiki>Re: [PATCH] Don't bail with legitimate -N/-B options</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider if CommandCounterIncrement() can avoid its AcceptInvalidationMessages() call<br />
* [http://archives.postgresql.org/pgsql-committers/2007-11/msg00585.php <nowiki>pgsql: Avoid incrementing the CommandCounter when</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider Cartesian joins when both relations are needed to form an indexscan qualification for a third relation<br />
* [http://archives.postgresql.org/pgsql-performance/2007-12/msg00090.php <nowiki>Re: TB-sized databases</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider not storing a NULL bitmap on disk if all the NULLs are trailing<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00624.php <nowiki>Proposal for Null Bitmap Optimization(for Trailing NULLs)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-12/msg00109.php <nowiki>Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Sort large UPDATEs/DELETEs so they are done in heap order<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg01119.php <nowiki>Possible future performance improvement: sort updates/deletes by ctid</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow one transaction to see tuples using the snapshot of another transaction<br />
|This would assist multiple backends in working together. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00400.php <nowiki>Transaction Snapshot Cloning</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00135.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00260.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00466.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider decreasing the I/O caused by updating tuple hint bits<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00847.php <nowiki>Hint Bits and Write I/O</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00199.php <nowiki>Re: [HACKERS] Hint Bits and Write I/O</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00695.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00792.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg01063.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01408.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01453.php<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid the requirement of freezing pages that are infrequently modified <br />
|If all rows on a page are visible, it is possible to set a bit in the visibility map (once the visibility map is 100% reliable) and avoid freezing the page, preventing a page rewrite<br />
* http://archives.postgresql.org/message-id/4BF701CF.2090205@agliodbs.com<br />
* http://archives.postgresql.org/pgsql-hackers/2010-06/msg00082.php<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid reading in b-tree pages when replaying vacuum records in hot standby mode<br />
* [http://archives.postgresql.org/message-id/1272571938.4161.14739.camel@ebony <nowiki>Hot Standby tuning for btree_xlog_vacuum()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Restructure truncation logic to be more resistant to failure<br />
|This also involves not writing dirty buffers for a truncated or dropped relation<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01032.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider adding logic to increase large tables by more than 8k<br />
|This would reduce file system fragmentation<br />
* http://archives.postgresql.org/pgsql-bugs/2011-03/msg00337.php<br />
}}<br />
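The idea of extending a relation file in larger chunks can be sketched with os.posix_fallocate (Linux), which asks the filesystem to reserve a contiguous run up front instead of appending one 8kB page at a time. The 16-page chunk size here is purely illustrative.<br />

```python
import os
import tempfile

PAGE = 8192
EXTEND_PAGES = 16  # grow 16 pages (128kB) at a time; illustrative only

with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name

fd = os.open(path, os.O_WRONLY)
try:
    # Reserve the whole chunk up front so the filesystem can allocate
    # it contiguously, rather than one 8kB page per extension.
    os.posix_fallocate(fd, 0, PAGE * EXTEND_PAGES)
    size = os.fstat(fd).st_size
finally:
    os.close(fd)
    os.unlink(path)
```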
<br />
== Miscellaneous Other ==<br />
<br />
{{TodoItem<br />
|Deal with encoding issues for filenames in the server filesystem<br />
* {{MessageLink|20090413184335.39BE.52131E4D@oss.ntt.co.jp|a proposed patch here}}<br />
* {{MessageLink|8484.1244655656@sss.pgh.pa.us|some issues about it here}}<br />
* {{MessageLink|20100107103740.97A5.52131E4D@oss.ntt.co.jp|Windows-specific patch here}}<br />
}}<br />
<br />
{{TodoItem<br />
|Deal with encoding issues in the output of localeconv()<br />
* [http://archives.postgresql.org/message-id/40c6d9160904210658y590377cfw6dbbecb53d2b8be0@mail.gmail.com bug report]<br />
* [http://archives.postgresql.org/message-id/49EF8DA0.90008@tpf.co.jp draft patch]<br />
* [http://archives.postgresql.org/message-id/21710.1243620986@sss.pgh.pa.us review of patch]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide schema name and other fields available from SQL GET DIAGNOSTICS in error reports<br />
* [http://archives.postgresql.org/message-id/dcc563d10810211907n3c59a920ia9eb7cd2a6d5ea58@mail.gmail.com <nowiki>How to get schema name which violates fk constraint</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg00846.php <nowiki>patch - Report the schema along table name in a referential failure error message</nowiki>]<br />
* {{MessageLink|3191.1263306359@sss.pgh.pa.us|Re: NOT NULL violation and error-message}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg00213.php <nowiki>the case for machine-readable error fields</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
| Provide [http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html#LIBPQ-CONNECT-FALLBACK-APPLICATION-NAME fallback_application_name] in contrib/pgbench, oid2name, and dblink.<br />
* {{MessageLink|w2g9837222c1004070216u3bc46b3ahbddfdffdbfb46212@mail.gmail.com|fallback_application_name and pgbench}}<br />
}}<br />
<br />
{{TodoItem<br />
|Add 64-bit support to /contrib/pgbench<br />
* http://archives.postgresql.org/pgsql-hackers/2010-07/msg00153.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg00705.php<br />
}}<br />
<br />
== Source Code ==<br />
<br />
{{TodoItem<br />
|Add use of 'const' for variables in source tree}}<br />
<br />
{{TodoItemEasy<br />
|Remove warnings created by -Wcast-align}}<br />
<br />
{{TodoItem<br />
|Move platform-specific ps status display info from ps_status.c to ports}}<br />
<br />
{{TodoItem<br />
|Add optional CRC checksum to heap and index pages<br />
|One difficulty is how to prevent hint bit changes from affecting the computed CRC checksum.<br />
* http://archives.postgresql.org/message-id/19934.1226601952%40sss.pgh.pa.us<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00002.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01028.php <nowiki>double-buffering page writes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00524.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01101.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00011.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00249.php<br />
}}<br />
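The difficulty stated above, that a hint-bit update invalidates a stored page checksum, can be demonstrated with a toy page. The header offsets and the use of zlib's CRC-32 are assumptions for illustration, not PostgreSQL's actual page format or algorithm.<br />

```python
import zlib

PAGE_SIZE = 8192

def page_crc(page: bytes) -> int:
    # Checksum everything except a hypothetical 4-byte slot
    # (bytes 8-11) where the CRC itself would be stored.
    return zlib.crc32(bytes(page[:8]) + bytes(page[12:])) & 0xFFFFFFFF

page = bytearray(PAGE_SIZE)
page[100] = 0x01          # pretend this byte holds a tuple's infomask
stored = page_crc(page)   # CRC written when the page was last flushed

page[100] |= 0x02         # a hint bit set in place, with no WAL record
changed = page_crc(page)  # the page no longer matches its stored CRC
```

Because hint bits are set without WAL logging, any page carrying a stored CRC would fail verification after such an in-place change, which is exactly the design problem the threads above wrestle with.<br />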
<br />
{{TodoItem<br />
|Consider a faster CRC32 algorithm<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01112.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow cross-compiling by generating the zic database on the target system}}<br />
<br />
{{TodoItem<br />
|Improve NLS maintenance of libpgport messages linked onto applications}}<br />
<br />
{{TodoItemDone<br />
|Improve the module installation experience (/contrib, etc)<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00132.php <nowiki>modules</nowiki>]<br />
* {{messageLink|ca33c0a30807231640n6fb4035dod8121a18aa1fa29c@mail.gmail.com|Re: PostgreSQL extensions packaging}}<br />
* {{messageLink|ca33c0a30804061349s41b4d8fcsa9c579454b27ecd2@mail.gmail.com|Database owner installable modules patch}}<br />
* [http://archives.postgresql.org//pgsql-hackers/2009-03/msg00855.php <nowiki>Re: contrib function naming, and upgrade issues</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00912.php <nowiki>search_path vs extensions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Use UTF8 encoding for NLS messages so all server encodings can read them properly}}<br />
<br />
{{TodoItem<br />
|Allow creation of universal binaries for Darwin<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00884.php <nowiki>Getting to universal binaries for Darwin</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider GnuTLS if OpenSSL license becomes a problem<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg00892.php<br />
* [http://archives.postgresql.org/pgsql-patches/2006-05/msg00040.php <nowiki>[PATCH] Add support for GnuTLS</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg01213.php <nowiki>TODO: GNU TLS</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider making NAMEDATALEN more configurable in future releases}}<br />
<br />
{{TodoItem<br />
|Research use of signals and sleep wake ups<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00003.php <nowiki>Restartable signals 'n all that</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow C++ code to more easily access backend code<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00302.php <nowiki>Mostly Harmless: Welcoming our C++ friends</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider simplifying how memory context resets handle child contexts<br />
* [http://archives.postgresql.org/pgsql-patches/2007-08/msg00067.php <nowiki>Re: Memory leak in nodeAgg</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Create three versions of libpgport to simplify client code<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00154.php <nowiki>8.4 TODO item: make src/port support libpq and ecpg directly</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve detection of shared memory segments being used by others by checking the SysV shared memory field 'nattch'<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00656.php <nowiki>postgresql in FreeBSD jails: proposal</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00673.php <nowiki>Re: postgresql in FreeBSD jails: proposal</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider using POSIX shared memory to avoid System V shared memory kernel limits<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00558.php<br />
}}<br />
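The create/attach-by-name pattern such a switch would use can be sketched with Python's multiprocessing.shared_memory, which sits on top of POSIX shm_open() on Unix; unlike SysV segments, these are not constrained by SHMMAX-style kernel limits. Segment size and contents are illustrative.<br />

```python
from multiprocessing import shared_memory

# Postmaster side: create a named POSIX shared memory segment
# (shm_open with O_CREAT under the hood on Unix).
shm = shared_memory.SharedMemory(create=True, size=8192)
shm.buf[:4] = b"PGSM"

# Backend side: attach to the same segment by name, analogous to
# shm_open() without O_CREAT.
peer = shared_memory.SharedMemory(name=shm.name)
seen = bytes(peer.buf[:4])

peer.close()
shm.close()
shm.unlink()  # remove the name once the last user is done
```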
<br />
{{TodoItem<br />
|Implement the non-threaded Avahi service discovery protocol<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00939.php <nowiki>Re: [PATCHES] Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-02/msg00097.php <nowiki>Re: Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg01211.php <nowiki>Re: [PATCHES] Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-04/msg00001.php <nowiki>Re: [HACKERS] Avahi support for Postgresql</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce data row alignment requirements on some 64-bit systems<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00369.php <nowiki>[WIP] Reduce alignment requirements on 64-bit systems.</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Restructure TOAST internal storage format for greater flexibility<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00049.php <nowiki>Re: PG_PAGE_LAYOUT_VERSION 5 - time for change</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Add regression tests for pg_dump/restore<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01967.php <nowiki>"make install-check-pg_dump" target in src/regress</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Research different memory allocation methods for lists<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg01467.php <br />
}}<br />
<br />
=== /contrib/pg_upgrade ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemDone<br />
|Remove copy_dir() code, or use it<br />
}}<br />
<br />
{{TodoItem<br />
|Handle large object comments<br />
|This is difficult to do because the large object doesn't exist when --schema-only is loaded.<br />
}}<br />
<br />
{{TodoItem<br />
|Consider using pg_depend for checking object usage in version.c<br />
}}<br />
<br />
{{TodoItem<br />
|If reindex is necessary, allow it to be done in parallel with pg_dump custom format<br />
}}<br />
<br />
{{TodoItem<br />
|Migrate pg_statistic by dumping it out as a flat file, so analyze is not necessary<br />
|pg_class.oid is not preserved so schema.tablename must be used.<br />
}}<br />
<br />
{{TodoItem<br />
|Improve testing, perhaps using the buildfarm<br />
|The buildfarm has access to multiple versions of PostgreSQL.<br />
}}<br />
<br />
{{TodoItem<br />
|Create machine-readable output of pg_controldata<br />
|This would avoid parsing its output. The problem is we need pg_controldata output from both the old and new clusters so we would need to support both formats.<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Windows ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Remove configure.in check for link failure when cause is found}}<br />
<br />
{{TodoItem<br />
|Remove readdir() errno patch when runtime/mingwex/dirent.c rev 1.4 is released}}<br />
<br />
{{TodoItem<br />
|Allow psql to use readline once non-US code pages work with backslashes}}<br />
<br />
{{TodoItem<br />
|Fix problem with shared memory on the Win32 Terminal Server}}<br />
<br />
{{TodoItem<br />
|Improve signal handling<br />
* [http://archives.postgresql.org/pgsql-patches/2005-06/msg00027.php <nowiki>Simplify Win32 Signaling code</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Convert MSVC build system to remove most batch files<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00961.php <nowiki>MSVC build system</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support pgxs when using MSVC}}<br />
<br />
{{TodoItem<br />
|Fix MSVC NLS support, like for to_char()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00485.php <nowiki>NLS on MSVC strikes back!</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-02/msg00038.php <nowiki>Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Find a correct rint() substitute on Windows<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00808.php <nowiki>Minor bug in src/port/rint.c</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix global namespace issues when using multiple terminal server sessions<br />
* [http://archives.postgresql.org/message-id/48F3BFCC.8030107@dunslane.net problems with Windows global namespace]}}<br />
<br />
{{TodoItem<br />
|Change from the current autoconf/gmake build system to cmake<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01869.php <nowiki>About CMake (was Re: [COMMITTERS] pgsql: Append major version number and for libraries soname major)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve consistency of path separator usage<br />
* http://archives.postgresql.org/message-id/49C0BDC5.4010002@hagander.net<br />
}}<br />
<br />
{{TodoItem<br />
|Fix cross-compiling on Windows<br />
* http://archives.postgresql.org/pgsql-bugs/2010-10/msg00110.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow multiple Postgres clusters running on the same machine to distinguish themselves in the event log<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01297.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00574.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Wire Protocol Changes ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow dynamic character set handling}}<br />
<br />
{{TodoItem<br />
|Add decoded type, length, precision}}<br />
<br />
{{TodoItem<br />
|Mark result columns as known-not-null when possible<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01029.php <nowiki>Adding nullable indicator to Describe</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide more control over planner treatment of statements being prepared}}<br />
<br />
{{TodoItem<br />
|Use compression?}}<br />
<br />
{{TodoItem<br />
|Update clients to use data types, typmod, schema.table.column names of result sets using new statement protocol}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Documentation ==<br />
<br />
{{TodoItem<br />
|Convert single quotes to apostrophes in the PDF documentation<br />
* [http://archives.postgresql.org/pgsql-docs/2007-12/msg00059.php <nowiki>SGML docs and pdf single-quotes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide a manpage for postgresql.conf<br />
* {{messageLink|20080819194311.GH4428@alvh.no-ip.org|A smaller default postgresql.conf}}<br />
* {{messageLink|200808211910.37524.peter_e@gmx.net|A smaller default postgresql.conf}}<br />
}}<br />
<br />
{{TodoItem<br />
|Change the manpage-generating toolchain to use the new XML-based docbook2x tools<br />
* {{messageLink|200808211910.37524.peter_e@gmx.net|A smaller default postgresql.conf}}<br />
}}<br />
<br />
{{TodoItem<br />
|Consider changing documentation format from SGML to XML<br />
* [http://archives.postgresql.org/pgsql-docs/2006-12/msg00152.php <nowiki>Re: Authoring Tools WAS: Switching to XML</nowiki>]<br />
* http://archives.postgresql.org/pgsql-docs/2011-04/msg00020.php<br />
* http://wiki.postgresql.org/wiki/Switching_PostgreSQL_documentation_from_SGML_to_XML<br />
}}<br />
<br />
{{TodoItem<br />
|Document support for N<nowiki>' '</nowiki> national character string literals, if it matches the SQL standard<br />
* http://archives.postgresql.org/message-id/1275895438.1849.1.camel@fsopti579.F-Secure.com<br />
}}<br />
<br />
{{TodoItem<br />
|Add diagrams to the documentation<br />
* http://archives.postgresql.org/pgsql-docs/2010-07/msg00001.php<br />
}}<br />
<br />
== Exotic Features ==<br />
<br />
{{TodoItem<br />
|Add pre-parsing phase that converts non-ISO syntax to supported syntax<br />
|This could allow SQL written for other databases to run without modification.}}<br />
<br />
{{TodoItem<br />
|Allow plug-in modules to emulate features from other databases}}<br />
<br />
{{TodoItem<br />
|Add features of Oracle-style packages<br />
|A package would be a schema with session-local variables, public/private functions, and initialization functions. It is also possible to implement these capabilities in any schema and not use a separate &quot;packages&quot; syntax at all.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg00384.php <nowiki>proposal for PL packages for 8.3.</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider allowing control of upper/lower case folding of unquoted identifiers<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php <nowiki>Bringing PostgreSQL torwards the standard regarding case folding</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg01527.php <nowiki>Re: [SQL] Case Preservation disregarding case sensitivity?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00849.php <nowiki>TODO Item: Consider allowing control of upper/lower case folding of unquoted, identifiers</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00415.php <nowiki>Identifier case folding notes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add autonomous transactions<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php <nowiki>autonomous transactions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Give query progress indication<br />
* [[Query progress indication]]<br />
}}<br />
<br />
{{TodoItem<br />
|Rethink our type system<br />
* [[Rethinking datatypes]]<br />
}}<br />
<br />
== Features We Do ''Not'' Want ==<br />
<br />
The following features have been discussed ad nauseam on the PostgreSQL mailing lists, and the consensus has been that the project is not interested in them. If you are going to bring them up as potential features, you will want to be familiar with all of the arguments previously made against them over the years. If you decide to work on such a feature anyway, be aware that you face a higher-than-normal barrier to getting the project to accept it.<br />
<br />
{{TodoItem<br />
|All backends running as threads in a single process (not wanted)<br />
|This eliminates the process protection we get from the current setup. Thread creation is usually the same overhead as process creation on modern systems, so it seems unwise to use a pure threaded model, and MySQL and DB2 have demonstrated that threads introduce as many issues as they solve. Threading specific operations such as I/O, seq scans, and connection management has been discussed and will probably be implemented to enable specific performance features. Moving to a threaded engine would also require halting all other work on PostgreSQL for one to two years.}}<br />
<br />
{{TodoItem<br />
|"Oracle-style" optimizer hints (not wanted)<br />
|Optimizer hints, as implemented in Oracle and other RDBMSes, are used to work around problems in the optimizer and introduce upgrade and maintenance issues. We would rather have such problems reported and fixed. We have discussed a more sophisticated system of per-class cost adjustment instead, but a specification remains to be developed. See [[OptimizerHintsDiscussion|Optimizer Hints Discussion]] for further information.}}<br />
<br />
{{TodoItem<br />
|Embedded server (not wanted)<br />
|While PostgreSQL clients run fine in limited-resource environments, the server requires multiple processes and a stable pool of resources to run reliably and efficiently. Stripping down the PostgreSQL server to run in the same process address space as the client application would add too much complexity and too many failure cases. Besides, several very mature embedded SQL databases are already available.}}<br />
<br />
{{TodoItem<br />
|Obfuscated function source code (not wanted)<br />
|Obfuscating function source code has minimal protective benefits because anyone with super-user access can find a way to view the code. At the same time, it would greatly complicate backups and other administrative tasks. To prevent non-super-users from viewing function source code, remove SELECT permission on pg_proc.<br />
* [http://archives.postgresql.org/pgsql-general/2008-09/msg00668.php <nowiki>Obfuscated stored procedures (was Re: Oracle and Postgresql)</nowiki>]<br />
}}<br />
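<br />
As a sketch of the pg_proc approach mentioned above (note that this also breaks clients that read pg_proc, such as psql's \df):<br />
<pre><br />
REVOKE SELECT ON pg_catalog.pg_proc FROM PUBLIC;<br />
</pre><br />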
<br />
{{TodoItem<br />
|Indeterminate behavior for the GROUP BY clause (not wanted)<br />
|At least one other database product allows a GROUP BY clause that references only a subset of the columns needed to provide predictable results; the server is then free to return any value from each group. This is not viewed as a desirable feature. PostgreSQL 9.1 will allow result columns that are not referenced by GROUP BY if a primary key of the same table is referenced in GROUP BY.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-03/msg00297.php <nowiki>Re: SQL compatibility reminder: MySQL vs PostgreSQL</nowiki>]<br />
}}<br />
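<br />
To illustrate the 9.1 behavior mentioned above with a hypothetical schema (table and column names invented):<br />
<pre><br />
CREATE TABLE products (id int PRIMARY KEY, name text);<br />
CREATE TABLE orders (product_id int REFERENCES products);<br />
-- Allowed in 9.1: name need not appear in GROUP BY because<br />
-- products.id is the primary key, so name is functionally dependent on it.<br />
SELECT p.id, p.name, count(o.product_id)<br />
  FROM products p LEFT JOIN orders o ON o.product_id = p.id<br />
 GROUP BY p.id;<br />
</pre><br />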
<br />
</div><br />
<br />
[[Category:Todo]]</div>
<hr />
<div><div style="margin: 1ex 1em; float: right;"><br />
__TOC__<br />
</div><br />
<br />
This list contains '''all known PostgreSQL bugs and feature requests'''. If you would like to work on an item, please read the [[Developer FAQ]] first. There is also a [[Development_information|development information page]].<br />
<br />
* {{TodoPending}} - marks ordinary, incomplete items<br />
* {{TodoEasy}} - marks items that are easier to implement<br />
* {{TodoDone}} - marks changes that are done, and will appear in the PostgreSQL 9.1 release.<br />
<br />
For help on editing this list, please see [[Talk:Todo]]. <b>Please do not add items here without discussion on the mailing list.</b><br />
<br />
<div style="padding: 1ex 4em;"><br />
== Administration ==<br />
<br />
{{TodoItem<br />
|Allow administrators to cancel multi-statement idle transactions<br />
|This allows locks to be released, but it is complex to report the cancellation back to the client.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01340.php <nowiki>Cancelling idle in transaction state</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00441.php <nowiki>Re: Cancelling idle in transaction state</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Check for unreferenced table files created by transactions that were in-progress when the server terminated abruptly<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00096.php <nowiki>Removing unreferenced files</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Set proper permissions on non-system schemas during db creation<br />
|Currently all schemas are owned by the super-user because they are copied from the template1 database. However, since all objects are inherited from the template database, it is not clear that setting schemas to the db owner is correct.}}<br />
<br />
{{TodoItem<br />
|Allow log_min_messages to be specified on a per-module basis<br />
|This would allow administrators to see more detailed information from specific sections of the backend, e.g. checkpoints, autovacuum, etc. Another idea is to allow separate configuration files for each module, or allow arbitrary SET commands to be passed to them. See also [[Logging Brainstorm]].}}<br />
<br />
{{TodoItem<br />
|Simplify creation of partitioned tables<br />
|This would allow creation of partitioned tables without requiring creation of triggers or rules for INSERT/UPDATE/DELETE, and constraints for rapid partition selection. Options could include range and hash partition selection. See also [[Table partitioning]]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow auto-selection of partitioned tables for min/max() operations<br />
|There was a patch on -hackers from July 2009, but it has not been merged: [http://archives.postgresql.org/pgsql-hackers/2009-07/msg01115.php <nowiki>MIN/MAX optimization for partitioned table</nowiki>]}}<br />
<br />
{{TodoItem<br />
|Allow custom variables to appear in pg_settings()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00850.php <nowiki>Re: count(*) performance improvement ideas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have custom variables be transaction-safe<br />
* {{MessageLink|4B577E9F.8000505@dunslane.net|Custom GUCs still a bit broken}}<br />
}}<br />
<br />
{{TodoItem<br />
|Implement the SQL-standard mechanism whereby REVOKE ROLE revokes only the privilege granted by the invoking role, and not those granted by other roles<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-05/msg00010.php <nowiki>Re: Grantor name gets lost when grantor role dropped</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Improve server security options<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01875.php <nowiki>Re: [0/4] Proposal of SE-PostgreSQL patches</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00000.php <nowiki>Re: [0/4] Proposal of SE-PostgreSQL patches</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent query cancel packets from being replayed by an attacker, especially when using SSL<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg00345.php <nowiki>Replay attack of query cancel</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide a way to query the log collector subprocess to determine the name of the currently active log file<br />
* [http://archives.postgresql.org/pgsql-general/2008-11/msg00418.php <nowiki>Current log files when rotating?</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow the client to authenticate the server in a Unix-domain socket connection, e.g., using SO_PEERCRED<br />
* http://archives.postgresql.org/message-id/20090401173756.GB21229@svana.org<br />
}}<br />
<br />
{{TodoItem<br />
|Allow simpler reporting of the unix domain socket directory and allow easier configuration of its default location<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg01555.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow custom daemons to be automatically stopped/started along with the postmaster<br />
|This allows easier administration of daemons like user job schedulers or replication-related daemons.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01701.php <nowiki>Re: scheduler in core</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Increase maximum values for max_standby_streaming_delay and log_min_duration_statement<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg01517.php<br />
* Committed: http://archives.postgresql.org/pgsql-committers/2011-03/msg00210.php<br />
}}<br />
<br />
{{TodoItem<br />
|Improve logging of prepared transactions recovered during startup<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg00092.php <nowiki>&quot;recovering prepared transaction&quot; after server restart message</nowiki>]<br />
}}<br />
<br />
=== Configuration files ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemDone<br />
|Allow pg_hba.conf to specify host names along with IP addresses<br />
|Host name lookup could occur when the postmaster reads the pg_hba.conf file, or when the backend starts. Another solution would be to reverse lookup the connection IP and check that hostname against the host names in pg_hba.conf. We could also then check that the host name maps to the IP address. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00569.php <nowiki>TODO Item: Allow pg_hba.conf to specify host names along with IP addresses</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00613.php<br />
}}<br />
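<br />
With this change, a pg_hba.conf entry can use a host name where a CIDR address was previously required; a hypothetical example:<br />
<pre><br />
# TYPE  DATABASE  USER  ADDRESS             METHOD<br />
host    all       all   client.example.com  md5<br />
host    all       all   192.168.1.0/24      md5<br />
</pre><br />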
<br />
{{TodoItem<br />
|Allow postgresql.conf file values to be changed via an SQL API, perhaps using SET GLOBAL<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00764.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider normalizing fractions in postgresql.conf, perhaps using '%'<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00550.php <nowiki>Fractions in GUC variables</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow Kerberos to disable stripping of realms so we can check the username@realm against multiple realms<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00009.php <nowiki>krb_match_realm patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve LDAP authentication configuration options<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01745.php <nowiki>Proposed Patch - LDAPS support for servers on port 636 w/o TLS</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add external tool to auto-tune some postgresql.conf parameters<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00000.php <nowiki>Re: Overhauling GUCS</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00033.php <nowiki>Simple postgresql.conf wizard</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add 'hostgss' pg_hba.conf option to allow GSS link-level encryption<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg01454.php <nowiki>Re: Plans for 8.4</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Process pg_hba.conf keywords as case-insensitive<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00432.php <nowiki>More robust pg_hba.conf parsing/error logging</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Have pg_hba.conf consider "replication" special only in the database field<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00632.php<br />
}}<br />
<br />
{{TodoItemDone<br />
|Rename unix domain socket 'ident' connections to 'peer', to avoid confusion with TCP 'ident'<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg01053.php<br />
}}<br />
<br />
{{TodoItem<br />
|Create utility to compute accurate random_page_cost value}}<br />
<br />
{{TodoItem<br />
|Allow configuration files to be independently validated<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01831.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow postgresql.conf settings to be accepted by backends even if some settings are invalid for those backends<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00330.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00375.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow all backends to receive postgresql.conf setting changes at the same time<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00330.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00375.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Tablespaces ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow a database in tablespace t1 with tables created in tablespace t2 to be used as a template for a new database created with default tablespace t2<br />
|Currently all objects in the default database tablespace must have default tablespace specifications. This is because new databases are created by copying directories. If you mix default tablespace tables and tablespace-specified tables in the same directory, creating a new database from such a mixed directory would create a new database with tables that had incorrect explicit tablespaces. To fix this would require modifying pg_class in the newly copied database, which we don't currently do.}}<br />
<br />
{{TodoItem<br />
|Allow reporting of which objects are in which tablespaces<br />
|This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.}}<br />
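<br />
A sketch of the two-step lookup described above, using the existing server-side function (tablespace name 'ts1' assumed):<br />
<pre><br />
-- Step 1: find the databases that have objects in the tablespace.<br />
SELECT datname FROM pg_database WHERE oid IN<br />
  (SELECT pg_tablespace_databases(ts.oid)<br />
     FROM pg_tablespace ts WHERE ts.spcname = 'ts1');<br />
-- Step 2: connect to each such database and list its objects there.<br />
SELECT relname FROM pg_class<br />
 WHERE reltablespace = (SELECT oid FROM pg_tablespace WHERE spcname = 'ts1');<br />
</pre><br />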
<br />
{{TodoItem<br />
|Allow WAL replay of CREATE TABLESPACE to work when the directory structure on the recovery computer is different from the original}}<br />
<br />
{{TodoItem<br />
|Allow per-tablespace quotas}}<br />
<br />
{{TodoItem<br />
|Allow tablespaces on RAM-based partitions for unlogged tables<br />
* http://archives.postgresql.org/pgsql-advocacy/2011-05/msg00033.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Statistics Collector ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow statistics last vacuum/analyze execution times to be displayed without requiring track_counts to be enabled<br />
* [http://archives.postgresql.org/pgsql-docs/2007-04/msg00028.php <nowiki>row-level stats and last analyze time</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Clear table counters on TRUNCATE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00169.php <nowiki>Small TRUNCATE glitch</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
| Allow the clearing of cluster-level statistics<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg00917.php <nowiki>Resetting cluster-wide statistics</nowiki>]<br />
* ''pg_stat_reset_shared('bgwriter')'' (9.0) now handles the ''pg_stat_bgwriter'' subset of this<br />
}}<br />
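<br />
For reference, the partial capabilities mentioned above are invoked as:<br />
<pre><br />
SELECT pg_stat_reset();                   -- resets counters for the current database only<br />
SELECT pg_stat_reset_shared('bgwriter');  -- 9.0: resets the pg_stat_bgwriter counters<br />
</pre><br />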
<br />
{{TodoEndSubsection}}<br />
<br />
=== SSL ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow SSL authentication/encryption over unix domain sockets<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00924.php <nowiki>Re: Spoofing as the postmaster</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow SSL key file permission checks to be optionally disabled when sharing SSL keys with other applications<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00069.php <nowiki>BUG #3809: SSL &quot;unsafe&quot; private key permissions bug</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow SSL CRL files to be re-read during configuration file reload, rather than requiring a server restart<br />
|Unlike SSL CRT files, CRL (Certificate Revocation List) files are updated frequently<br />
* [http://archives.postgresql.org/pgsql-general/2008-12/msg00832.php <nowiki>Automatic CRL reload</nowiki>]<br />
Alternatively or additionally, supporting OCSP (Online Certificate Status Protocol) would provide real-time revocation discovery without reloading.<br />
}}<br />
<br />
{{TodoItem<br />
| Allow automatic selection of SSL client certificates from a certificate store<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00406.php <nowiki>Allow multiple certificates or keys in the postgresql.crt/.key files</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Send the full certificate server chain to the client<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-12/msg00145.php BUG #5245: Full Server Certificate Chain Not Sent to client]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Point-In-Time Recovery (PITR) ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|Create dump tool for write-ahead logs for use in determining transaction id for point-in-time recovery<br />
|This is useful for checking PITR recovery.}}<br />
<br />
{{TodoItemDone<br />
|Allow recovery.conf to support the same syntax as postgresql.conf, including quoting<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00497.php <nowiki>recovery.conf parsing problems</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg00684.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow archive_mode to be changed without server restart?<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01655.php <nowiki>Enabling archive_mode without restart</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider avoiding WAL switching via archive_timeout if there has been no database activity<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01469.php <nowiki>archive_timeout behavior for no activity</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00395.php <nowiki>Re: archive_timeout behavior for no activity</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Expose pg_controldata via an SQL interface<br />
|Helpful for monitoring replicated databases<br />
* http://archives.postgresql.org/message-id/4B901D73.8030003@agliodbs.com<br />
* [http://archives.postgresql.org/message-id/4B959D7A.6010907@joeconway.com initial patch]<br />
}}<br />
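<br />
Until such an interface exists, this information is only available from the command-line tool on the server host, e.g.:<br />
<pre><br />
pg_controldata /path/to/data/directory<br />
</pre><br />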
<br />
{{TodoEndSubsection}}<br />
<br />
=== Standby server mode ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
| Allow pg_xlogfile_name() to be used in recovery mode<br />
* [http://archives.postgresql.org/message-id/3f0b79eb1001190135vd9f62f1sa7868abc1ea61d12@mail.gmail.com <nowiki>Streaming replication and pg_xlogfile_name()</nowiki>]<br />
}}<br />
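<br />
For context, the function works on a primary but currently raises an error during recovery, which is exactly where it would be useful:<br />
<pre><br />
SELECT pg_xlogfile_name(pg_current_xlog_location());      -- works on a primary<br />
SELECT pg_xlogfile_name(pg_last_xlog_replay_location());  -- desired on a standby, currently fails<br />
</pre><br />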
<br />
{{TodoItem<br />
| Prevent variables inherited from the server environment from being used for making streaming replication connections.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01011.php <nowiki>Re: Parameter name standby_mode</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
| Add a new privilege for connecting for streaming replication<br />
* [http://archives.postgresql.org/message-id/3f0b79eb1003040247p6b092241of91784a505e9abd8@mail.gmail.com <nowiki>Streaming replication and privilege</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
| Add support for synchronous replication.<br />
}}<br />
<br />
{{TodoItemDone<br />
| Add capability to take and send a base backup over the streaming replication connection, making it possible to initialize a new standby server from a running primary server without a WAL archive or other access to the primary server. <br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00136.php<br />
}}<br />
<br />
{{TodoItem<br />
| Allow hot file system backups on standby servers<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01727.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01490.php<br />
}}<br />
<br />
{{TodoItemDone<br />
| Allow the automatic removal of old directories when streaming base backups<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00558.php<br />
}}<br />
<br />
{{TodoItem<br />
| Change walsender so that it applies per-role settings<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00642.php<br />
}}<br />
<br />
{{TodoItem<br />
| Add more control over waiting for synchronous commit<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01611.php<br />
}}<br />
<br />
{{TodoItem<br />
| Restructure configuration parameters for standby mode<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01820.php<br />
}}<br />
<br />
{{TodoItem<br />
| Allow time-delayed application of logs on the standby<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00992.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Data Types ==<br />
<br />
{{TodoItemDone<br />
|Reduce storage space for small NUMERICs<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg01331.php <nowiki>Saving space for common kinds of numeric values</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-02/msg00505.php <nowiki>Numeric patch to add special-case representations for &lt; 8 bytes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php <nowiki>Re: Reducing NUMERIC size for 8.3</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix data types where equality comparison is not intuitive, e.g. box}}<br />
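<br />
For example, the box type's '=' operator compares by area rather than by position, so two disjoint boxes can compare equal (the '~=' "same as" operator compares position):<br />
<pre><br />
SELECT box '((0,0),(1,1))' = box '((5,5),(6,6))';   -- true: equal areas<br />
SELECT box '((0,0),(1,1))' ~= box '((5,5),(6,6))';  -- false: not the same box<br />
</pre><br />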
<br />
{{TodoItem<br />
|Add support for public SYNONYMs<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00519.php <nowiki>Proposal for SYNONYMS</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg02043.php<br />
* http://archives.postgresql.org/pgsql-general/2010-12/msg00139.php<br />
}}<br />
<br />
{{TodoItem<br />
|Add support for SQL-standard GENERATED/IDENTITY columns<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-07/msg00543.php <nowiki>Re: Three weeks left until feature freeze</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg00038.php <nowiki>GENERATED ... AS IDENTITY, Was: Re: Feature Freeze</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00344.php <nowiki>Behavior of GENERATED columns per SQL2003</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00076.php <nowiki>Re: [HACKERS] Behavior of GENERATED columns per SQL2003</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00604.php <nowiki>IDENTITY/GENERATED patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider placing all sequences in a single table, or create a system view<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a special data type for regular expressions<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg01067.php <nowiki>Why is there a tsquery data type?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce BIT data type overhead using short varlena headers<br />
* [http://archives.postgresql.org/pgsql-general/2007-12/msg00273.php <nowiki>storage size of &quot;bit&quot; data type..</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow adding enumerated values to an existing enumerated data type<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01718.php <nowiki>Re: [COMMITTERS] pgsql: Update: &lt; * Allow adding enumerated values to an existing</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow renaming and deleting enumerated values from an existing enumerated data type<br />
}}<br />
<br />
{{TodoItem<br />
|Support scoped IPv6 addresses in the inet type<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-05/msg00111.php <nowiki>strange problem with ip6</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a JSON (JavaScript Object Notation) data type<br />
|This would behave similarly to the XML data type, which is stored as text but allows element lookup and conversion functions.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01494.php <nowiki>PATCH: Add hstore_to_json()</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg00001.php <nowiki>Re: PATCH: Add hstore_to_json()</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-03/msg01092.php <nowiki>Proposal: Add JSON support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00057.php <nowiki>Re: Proposal: Add JSON support</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00481.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01694.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider improving performance of computing CHAR() value lengths<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00900.php <nowiki>char() overhead on read-only workloads not so insignifcant as the docs claim it is...</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01787.php <nowiki>Re: [PATCH] backend: compare word-at-a-time in bcTruelen</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add overlaps geometric operators that ignore point overlaps<br />
* http://archives.postgresql.org/pgsql-hackers/2010-03/msg00861.php<br />
}}<br />
<br />
=== Domains ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow functions defined as casts to domains to be called during casting<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-05/msg00072.php <nowiki>bug? non working casts for domain</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-09/msg01681.php <nowiki>TODO: Fix CREATE CAST on DOMAINs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow values to be cast to domain types<br />
* [http://archives.postgresql.org/pgsql-hackers/2003-06/msg01206.php <nowiki>Domain casting still doesn't work right</nowiki>] <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00289.php <nowiki>domain casting?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Make domains work better with polymorphic functions<br />
* [http://archives.postgresql.org/message-id/4887.1228700773@sss.pgh.pa.us Polymorphic types vs. domains]<br />
* [http://archives.postgresql.org/message-id/15535.1238774571@sss.pgh.pa.us some difficulties with fixing it]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Dates and Times ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow infinite intervals just like infinite timestamps}}<br />
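<br />
A sketch of the current asymmetry:<br />
<pre><br />
SELECT 'infinity'::timestamptz;  -- works<br />
SELECT 'infinity'::interval;     -- currently fails with an input syntax error<br />
</pre><br />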
<br />
{{TodoItem<br />
|Allow TIMESTAMP WITH TIME ZONE to store the original timezone information, either zone name or offset from UTC<br />
|If the TIMESTAMP value is stored with a time zone name, interval computations should adjust based on the time zone rules. <br />
* [http://archives.postgresql.org/pgsql-hackers/2004-10/msg00705.php <nowiki>timestamp with time zone a la sql99</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have timestamp subtraction not call justify_hours()?<br />
* [http://archives.postgresql.org/pgsql-sql/2006-10/msg00059.php <nowiki>timestamp subtraction (was Re: formatting intervals with to_char)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve TIMESTAMP WITH TIME ZONE subtraction to be DST-aware<br />
|Currently subtracting one date from another that crosses a daylight saving time adjustment can return '1 day 1 hour', but adding that back to the first date returns a time one hour in the future. This is caused by the adjustment of '25 hours' to '1 day 1 hour', and '1 day' is the same time the next day, even if daylight saving adjustments are involved.}}<br />
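<br />
A hypothetical illustration, with timestamps chosen to straddle a US fall-back transition (exact output depends on the timezone setting):<br />
<pre><br />
SET timezone = 'America/New_York';<br />
-- 26 elapsed hours across the fall-back come out as '1 day 02:00:00' ...<br />
SELECT timestamptz '2011-11-07 00:00' - timestamptz '2011-11-05 23:00';<br />
-- ... but adding that interval back lands one hour past the original endpoint,<br />
-- because the '1 day' part is applied as "same clock time the next day".<br />
SELECT timestamptz '2011-11-05 23:00' + interval '1 day 2 hours';<br />
</pre><br />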
<br />
{{TodoItem<br />
|Fix interval display to support values exceeding 2^31 hours}}<br />
<br />
{{TodoItem<br />
|Add overflow checking to timestamp and interval arithmetic}}<br />
<br />
{{TodoItem<br />
|Add function to allow the creation of timestamps using parameters<br />
* http://archives.postgresql.org/pgsql-performance/2010-06/msg00232.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Arrays ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add support for arrays of domains<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00114.php <nowiki>Re: updated WIP: arrays of composites</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow single-byte header storage for array elements}}<br />
<br />
{{TodoItem<br />
|Add function to detect if an array is empty<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00475.php <nowiki>Re: array_length()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of empty arrays<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01033.php <nowiki>So what's an &quot;empty&quot; array anyway?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of NULLs in arrays<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-11/msg00009.php <nowiki>BUG #4509: array_cat's null behaviour is inconsistent</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg01040.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Binary Data ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Improve vacuum of large objects, like contrib/vacuumlo?}}<br />
<br />
{{TodoItem<br />
|Auto-delete large objects when referencing row is deleted<br />
|contrib/lo offers this functionality.}}<br />
<br />
{{TodoItem<br />
|Allow read/write into TOAST values like large objects<br />
|This requires the TOAST column to be stored EXTERNAL.}}<br />
<br />
{{TodoItem<br />
|Add API for 64-bit large object access<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00781.php <nowiki>64-bit API for large objects</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01790.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== MONEY Data Type ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add locale-aware MONEY type, and support multiple currencies<br />
* [http://archives.postgresql.org/pgsql-general/2005-08/msg01432.php <nowiki>A real currency type</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01181.php <nowiki>Money type todos?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|MONEY dumps in a locale-specific format making it difficult to restore to a system with a different locale}}<br />
<br />
{{TodoItemDone<br />
|Allow MONEY to be easily cast to/from other numeric data types}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Text Search ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow dictionaries to change the token that is passed on to later dictionaries<br />
* [http://archives.postgresql.org/pgsql-patches/2007-11/msg00081.php <nowiki>a tsearch2 (8.2.4) dictionary that only filters out stopwords</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a function-based API for '@@' searches<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00511.php <nowiki>Simplifying Text Search</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve text search error messages<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00966.php <nowiki>Poorly designed tsearch NOTICEs</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg01146.php <nowiki>Re: Poorly designed tsearch NOTICEs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider changing error to warning for strings larger than one megabyte<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-02/msg00190.php <nowiki>BUG #3975: tsearch2 index should not bomb out of 1Mb limit</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-03/msg00062.php <nowiki>Re: [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|tsearch and tsdicts regression tests fail in Turkish locale on glibc<br />
* [http://archives.postgresql.org/message-id/49749645.5070801@gmx.net tsearch with Turkish locale]<br />
}}<br />
<br />
{{TodoItem<br />
|tsquery negator operator treated as part of lexeme<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-06/msg00346.php BUG #4887: inclusion operator (@>) on tsqeries behaves not conforming to documentation]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of plus signs in email address user names, and perhaps improve URL parsing<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00772.php<br />
}}<br />
<br />
{{TodoItem<br />
|Improve default parser, to more easily allow adding new tokens<br />
* http://archives.postgresql.org/message-id/23485.1297727826@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== XML ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow XML arrays to be cast to other data types<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00981.php <nowiki>proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00231.php <nowiki>Re: proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00471.php <nowiki>Re: proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add XML Schema validation and xmlvalidate functions (SQL:2008)}}<br />
<br />
{{TodoItem<br />
|Add xmlvalidatedtd variant to support validating against a DTD?}}<br />
<br />
{{TodoItem<br />
|Relax-NG validation; libxml2 supports this already}}<br />
<br />
{{TodoItem<br />
|Allow reliable XML operation in non-UTF8 server encodings (xpath(), in particular, is known not to work)<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-01/msg00135.php <nowiki>BUG #4622: xpath only work in utf-8 server encoding</nowiki>] <br />
* http://archives.postgresql.org/message-id/4110.1238973350@sss.pgh.pa.us}}<br />
<br />
{{TodoItem<br />
|Add functions from SQL:2006: XMLDOCUMENT, XMLCAST, XMLTEXT}}<br />
<br />
{{TodoItem<br />
|Add XMLNAMESPACES support in XMLELEMENT and elsewhere}}<br />
<br />
{{TodoItem<br />
|Move XSLT from contrib/xml2 to a more reasonable location<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00539.php<br />
}}<br />
<br />
{{TodoItem<br />
|Report errors returned by the XSLT library<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00562.php<br />
}}<br />
<br />
{{TodoItem<br />
|Improve the XSLT parameter passing API<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00416.php<br />
}}<br />
<br />
{{TodoItem<br />
|XML Canonical: Convert XML documents to canonical form to compare them. libxml2 has support for this.}}<br />
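The comparison idea can be sketched with Python's standard-library C14N support (xml.etree.ElementTree.canonicalize, Python 3.8+); the sample documents are illustrative assumptions:<br />

```python
from xml.etree.ElementTree import canonicalize

# Two documents that differ only in attribute order, quoting style,
# and empty-element syntax.
doc_a = "<root><item b='1' a='2'/></root>"
doc_b = '<root><item a="2" b="1"></item></root>'

# Canonical form sorts attributes and expands empty elements, so logically
# equal documents compare equal as plain strings.
print(canonicalize(doc_a) == canonicalize(doc_b))  # True
```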
<br />
{{TodoItem<br />
|Add pretty-printed XML output option<br />
|Parse a document and serialize it back in some indented form. libxml2 might support this.}}<br />
<br />
{{TodoItem<br />
|Add XMLQUERY (from the SQL/XML standard)}}<br />
<br />
{{TodoItem<br />
|Allow XML shredding<br />
|In some cases shredding could be a better option (if there is no need to keep XML docs in their entirety, e.g. if we already have tools that understand only relational data). This would be a separate module that implements the annotated schema decomposition technique, similar to DB2 and SQL Server functionality.}}<br />
<br />
{{TodoItem<br />
|Fix nested or repeated xpath() calls that apparently mess up namespaces [http://archives.postgresql.org/pgsql-bugs/2008-03/msg00097.php] [http://archives.postgresql.org/pgsql-bugs/2008-03/msg00144.php] [http://archives.postgresql.org/pgsql-general/2008-03/msg00295.php] [http://archives.postgresql.org/pgsql-bugs/2008-07/msg00054.php] [http://archives.postgresql.org/message-id/004f01c90e91$138e9d10$3aabd730$@anstett@iaas.uni-stuttgart.de]}}<br />
<br />
{{TodoItem<br />
|XPath: Adding the <x> at the root causes problems [http://archives.postgresql.org/pgsql-bugs/2008-05/msg00184.php] [http://archives.postgresql.org/pgsql-bugs/2008-07/msg00054.php] [http://archives.postgresql.org/pgsql-general/2008-07/msg00613.php]}}<br />
<br />
{{TodoItem<br />
|xpath_table needs to be implemented/implementable to get rid of contrib/xml2 [http://archives.postgresql.org/pgsql-general/2008-05/msg00823.php]}}<br />
<br />
{{TodoItem<br />
|xpath_table is pretty broken anyway [http://archives.postgresql.org/pgsql-hackers/2010-02/msg02424.php]}}<br />
<br />
{{TodoItem<br />
|better handling of XPath data types [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00616.php] [http://archives.postgresql.org/message-id/004a01c90e90$4b986d90$e2c948b0$@anstett@iaas.uni-stuttgart.de]}}<br />
<br />
{{TodoItemDone<br />
|xpath_exists() is needed<br />
|This checks whether or not the path specified exists in the XML value. Without this function we need to use the weird "array_dims(xpath(...)) IS NOT NULL" syntax.}}<br />
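The workaround mentioned above is just an existence test on the node set that xpath() returns; the semantics can be sketched in Python (the document and paths are illustrative, and ElementTree supports only a limited XPath subset):<br />

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<doc><a><b/></a></doc>")

# xpath_exists() semantics: does the path match at least one node?
print(bool(root.findall(".//b")))  # True
print(bool(root.findall(".//c")))  # False
```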
<br />
{{TodoItem<br />
|Improve handling of PIs and DTDs in xmlconcat() [http://archives.postgresql.org/message-id/200904211211.n3LCB09p008988@wwwmaster.postgresql.org]}}<br />
<br />
{{TodoItem<br />
|Restructure XML and /contrib/xml2 functionality<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02314.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg00017.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Functions ==<br />
<br />
{{TodoItem<br />
|Allow INET subnet comparisons using non-constants to be indexed}}<br />
<br />
{{TodoItem<br />
|Add an INET overlaps operator, for use by exclusion constraints <br />
* http://archives.postgresql.org/pgsql-hackers/2010-03/msg00845.php<br />
}}<br />
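The intended operator semantics can be sketched with Python's ipaddress module (the networks below are illustrative assumptions):<br />

```python
import ipaddress

# An exclusion constraint on inet would reject rows whose networks overlap;
# this is the predicate such an operator would have to compute.
a = ipaddress.ip_network("10.0.0.0/24")
b = ipaddress.ip_network("10.0.0.128/25")   # contained in a
c = ipaddress.ip_network("10.0.1.0/24")     # disjoint from a

print(a.overlaps(b))  # True
print(a.overlaps(c))  # False
```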
<br />
{{TodoItem<br />
|Enforce typmod for function inputs, function results and parameters for spi_prepare'd statements called from PLs<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg01403.php <nowiki>Re: BUG #2917: spi_prepare doesn't accept typename aliases</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg01160.php <nowiki>RFC for adding typmods to functions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix IS OF so it matches the ISO specification, and add documentation<br />
* [http://archives.postgresql.org/pgsql-patches/2003-08/msg00060.php <nowiki>Re: [HACKERS] IS OF</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00060.php <nowiki>ToDo: add documentation for operator IS OF</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Implement Boyer-Moore searching in LIKE queries<br />
* {{messageLink|27645.1220635769@sss.pgh.pa.us|TODO item: Implement Boyer-Moore searching (First time hacker)}}<br />
}}<br />
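A minimal sketch of the Boyer-Moore-Horspool variant, the simplified form such patches typically start from (the function name is illustrative):<br />

```python
def horspool_find(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Bad-character shift table: how far the pattern may slide when the
    # text character aligned with its last position mismatches.
    shift = {pattern[i]: m - 1 - i for i in range(m - 1)}
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            return i
        # Skip ahead based on the text character under the pattern's last slot,
        # instead of advancing one position at a time as naive LIKE matching does.
        i += shift.get(text[i + m - 1], m)
    return -1

print(horspool_find("hello world", "world"))  # 6
```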
<br />
{{TodoItem<br />
|Prevent malicious functions from being executed with the permissions of unsuspecting users<br />
|Index functions are safe, so VACUUM and ANALYZE are safe too. Triggers, CHECK and DEFAULT expressions, and rules are still vulnerable. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00268.php <nowiki>Some notes about the index-functions security vulnerability</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce memory usage of aggregates in set returning functions<br />
* [http://archives.postgresql.org/pgsql-performance/2008-01/msg00031.php <nowiki>Re: Performance of aggregates over set-returning functions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix /contrib/ltree operator<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-11/msg00044.php <nowiki>BUG #3720: wrong results at using ltree</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix /contrib/btree_gist's implementation of inet indexing<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-10/msg00099.php <nowiki>BUG #5705: btree_gist: Index on inet changes query result</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|<nowiki>Fix inconsistent precedence of =, &gt;, and &lt; compared to &lt;&gt;, &gt;=, and &lt;=</nowiki><br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00145.php <nowiki>BUG #3822: Nonstandard precedence for comparison operators</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix regular expression bug when using complex back-references<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00000.php <nowiki>BUG #3645: regular expression back references seem broken</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have /contrib/dblink reuse unnamed connections<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00895.php <nowiki>dblink un-named connection doesn't get re-used</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve formatting of pg_get_viewdef() output<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg01648.php <nowiki>pg_get_viewdef formattiing</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01885.php <nowiki>Re: pretty print viewdefs</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add printf()-like functionality<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00367.php <nowiki>RfD: more powerful &quot;any&quot; types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add function to dump pg_depend information cleanly<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00226.php <nowiki>Elementary dependency look-up</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve relation size functions such as pg_relation_size() to avoid producing an error when called against a no longer visible relation<br />
* [http://archives.postgresql.org/message-id/28488.1286461610@sss.pgh.pa.us pg_relation_size / could not open relation with OID #]<br />
}}<br />
<br />
=== Character Formatting ===<br />
<br />
{{TodoSubsection}}<br />
{{TodoItem<br />
|Allow to_date() and to_timestamp() to accept localized month names}}<br />
<br />
{{TodoItem<br />
|Add missing parameter handling in to_char()<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg00948.php <nowiki>Re: to_char and i18n</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Throw an error from to_char() instead of printing a string of "#" when a number doesn't fit in the desired output format.<br />
* discussed in [http://archives.postgresql.org/message-id/37ed240d0907290836w42187222n18664dfcbcb445b1@mail.gmail.com "to_char, support for EEEE format"]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow to_char() on interval values to accumulate the highest unit requested<br />
|2= Some special format flag would be required to request such accumulation. Such functionality could also be added to EXTRACT. Prevent accumulation that crosses the month/day boundary because of the uneven number of days in a month.<br />
* to_char(INTERVAL '1 hour 5 minutes', 'MI') =&gt; 65<br />
* to_char(INTERVAL '43 hours 20 minutes', 'MI' ) =&gt; 2600<br />
* to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') =&gt; 0:1:19:20<br />
* to_char(INTERVAL '3 years 5 months','MM') =&gt; 41<br />
}}<br />
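The accumulation requested above is plain unit arithmetic; a sketch of the hour/minute case (function name is an illustrative assumption; month/day crossing is excluded per the item):<br />

```python
# Accumulate an interval into the single highest unit requested, as the
# to_char() examples above suggest.  Only hour/minute fields are handled,
# since accumulation across the month/day boundary is explicitly disallowed.
def accumulate_minutes(hours: int, minutes: int) -> int:
    return hours * 60 + minutes

print(accumulate_minutes(1, 5))    # 65   -- '1 hour 5 minutes' as 'MI'
print(accumulate_minutes(43, 20))  # 2600 -- '43 hours 20 minutes' as 'MI'
```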
<br />
{{TodoItem<br />
|Fix to_number() handling for values not matching the format string<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg01447.php <nowiki>Re: numeric_to_number() function skipping some digits</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Multi-Language Support ==<br />
<br />
{{TodoItem<br />
|Add NCHAR (as distinguished from ordinary varchar)}}<br />
<br />
{{TodoItemDone<br />
|Allow more fine-grained collation selection<br />
|Right now the collation is fixed at database creation time.<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-03/msg00932.php <nowiki>Re: Patch for collation using ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2005-08/msg00039.php <nowiki>FW: Win32 unicode vs ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2005-08/msg00309.php <nowiki>Re: FW: Win32 unicode vs ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00110.php <nowiki>Proof of concept COLLATE support with patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2005-09/msg00020.php <nowiki>For review: Initial support for COLLATE</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg01121.php <nowiki>Proposed COLLATE implementation</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-01/msg00767.php <nowiki>TODO item: locale per database patch (new iteration)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-03/msg00233.php <nowiki>Re: FW: Win32 unicode vs ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-09/msg00662.php <nowiki>Re: Fixed length data types issue</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00557.php <nowiki>[WIP] collation support revisited (phase 1)</nowiki>]<br />
* [[Todo:Collate]]<br />
* [[Todo:ICU]]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg01362.php <nowiki>WIP patch: Collation support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg00012.php <nowiki>Re: WIP patch: Collation support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00868.php <nowiki>PGDay.it collation discussion notes</nowiki>]<br />
* [http://www.unicode.org/unicode/reports/tr10/ Unicode collation algorithm]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a cares-about-collation column to pg_proc, so that unresolved-collation errors can be thrown at parse time<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01520.php <nowiki>Open issues for collations</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Integrate collations with text search configurations<br />
* [http://archives.postgresql.org/message-id/28887.1303579034@sss.pgh.pa.us <nowiki>Some TODO items for collations</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Integrate collations with to_char() and related functions<br />
* [http://archives.postgresql.org/message-id/28887.1303579034@sss.pgh.pa.us <nowiki>Some TODO items for collations</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a LOCALE option to CREATE DATABASE, as a shorthand<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00119.php <nowiki> Re: 8.4 open items list</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support multiple simultaneous character sets, per SQL:2008}}<br />
<br />
{{TodoItem<br />
|Improve UTF8 combined character handling?}}<br />
<br />
{{TodoItem<br />
|Add octet_length_server() and octet_length_client()}}<br />
<br />
{{TodoItem<br />
|Make octet_length_client() the same as octet_length()?}}<br />
<br />
{{TodoItem<br />
|Fix problems with wrong runtime encoding conversion for NLS message files}}<br />
<br />
{{TodoItem<br />
|Add URL to more complete multi-byte regression tests<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-07/msg00272.php <nowiki>Multi-byte and client side character encoding tests for copy command..</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix contrib/fuzzystrmatch to work with multibyte encodings<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-04/msg00047.php <nowiki> soundex function returns UTF-16 characters</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00138.php <nowiki> dmetaphone woes</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Set client encoding based on the client operating system encoding<br />
|Currently client_encoding is set in postgresql.conf, which defaults to the server encoding. <br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg01696.php <nowiki>Re: [GENERAL] invalid byte sequence ?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Change memory allocation for multi-byte functions so memory is allocated inside conversion functions<br />
|Currently we preallocate memory based on worst-case usage.}}<br />
<br />
{{TodoItem<br />
|Add ability to use case-insensitive regular expressions on multi-byte characters<br />
|Currently it works for UTF-8, but not other multi-byte encodings<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00433.php <nowiki>Regexps vs. locale</nowiki>]<br />
* {{MessageLink|20091201210024.B1393753FB7@cvs.postgresql.org|A partial solution for UTF-8}}<br />
}}<br />
<br />
{{TodoItem<br />
|Improve encoding of connection startup messages sent to the client<br />
|Currently some authentication error messages are sent in the server encoding<br />
* [http://archives.postgresql.org/pgsql-general/2008-12/msg00801.php <nowiki>encoding of PostgreSQL messages</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-general/2009-01/msg00005.php <nowiki>Re: encoding of PostgreSQL messages</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have pg_stat_activity display query strings in the correct client encoding<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00131.php <nowiki>pg_stats queries versus per-database encodings</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|More sensible support for Unicode combining characters, normal forms<br />
* http://archives.postgresql.org/message-id/200904141532.44618.peter_e@gmx.net<br />
}}<br />
<br />
== Views / Rules ==<br />
<br />
{{TodoItem<br />
|Automatically create rules on views so they are updatable, per SQL:2008<br />
|We can only auto-create rules for simple views. For more complex cases users will still have to write rules manually.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00586.php <nowiki>Proposal for updatable views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-08/msg00255.php <nowiki>Updatable views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg01746.php <nowiki>Re: [COMMITTERS] pgsql: Automatic view update rules Bernd Helmle</nowiki>]<br />
* http://wiki.postgresql.org/wiki/Updatable_views<br />
}}<br />
<br />
{{TodoItem<br />
|Add the functionality of the WITH CHECK OPTION clause to CREATE VIEW}}<br />
<br />
{{TodoItem<br />
|Allow VIEW/RULE recompilation when the underlying tables change<br />
|This is both difficult and controversial.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01723.php Re: About "Allow VIEW/RULE recompilation when the underlying tables change"]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01724.php Re: About "Allow VIEW/RULE recompilation when the underlying tables change"]<br />
}}<br />
<br />
{{TodoItem<br />
|Make it possible to use RETURNING together with conditional DO INSTEAD rules, such as for partitioning setups<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00577.php <nowiki>RETURNING and DO INSTEAD ... Intentional or not?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add the ability to automatically create materialized views<br />
|Right now materialized views require the user to create triggers on the main table to keep the summary table current. SQL syntax should be able to manage the triggers and summary table automatically. A more sophisticated implementation would automatically retrieve from the summary table when the main table is referenced, if possible. See [[Materialized Views]] for implementation details.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00479.php <nowiki>GSoC - proposal - Materialized Views in PostgreSQL</nowiki>] <br />
}}<br />
<br />
{{TodoItem<br />
|Improve ability to modify views via ALTER TABLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00691.php <nowiki>Re: idea: storing view source in system catalogs</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg01410.php <nowiki>modifying views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg00300.php <nowiki>Re: patch: Add columns via CREATE OR REPLACE VIEW</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent low-cost functions from seeing unauthorized view rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01346.php <nowiki>Using views for row-level access control is leaky</nowiki>]<br />
}}<br />
<br />
== SQL Commands ==<br />
<br />
{{TodoItem<br />
|Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT}}<br />
<br />
{{TodoItem<br />
|Improve type determination of unknown (NULL or quoted literal) result columns for UNION/INTERSECT/EXCEPT<br />
* [http://archives.postgresql.org/message-id/9799.1302719551@sss.pgh.pa.us <nowiki>UNION construct type cast gives poor error message</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php <nowiki>WIP: grouping sets support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00466.php <nowiki>Implementation of GROUPING SETS (T431: Extended grouping capabilities)</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Fix TRUNCATE ... RESTART IDENTITY so its effect on sequences is rolled back on transaction abort<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00550.php <nowiki>Re: [PATCHES] TRUNCATE TABLE with IDENTITY</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow prepared transactions with temporary tables created and dropped in the same transaction, and when an ON COMMIT DELETE ROWS temporary table is accessed<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00047.php <nowiki>Re: &quot;could not open relation 1663/16384/16584: No such file or directory&quot; in a specific combination of transactions with temp tables</nowiki>]<br />
* [http://archives.postgresql.org/message-id/492543D5.9050904@enterprisedb.com A suggestion on how to implement this]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a GUC variable to warn about non-standard SQL usage in queries}}<br />
<br />
{{TodoItem<br />
|Add SQL-standard MERGE/REPLACE/UPSERT command<br />
|MERGE is typically used to merge two tables. A REPLACE or UPSERT command does an UPDATE, or on failure, an INSERT. See [[SQL MERGE]] for notes on the implementation details.<br />
}}<br />
<br />
{{TodoItem<br />
|Add NOVICE output level for helpful messages<br />
|For example, have it warn about unjoined tables. This could also control automatic sequence/index creation messages.<br />
}}<br />
<br />
{{TodoItem<br />
|Allow NOTIFY in rules involving conditionals}}<br />
<br />
{{TodoItem<br />
|Allow EXPLAIN to identify tables that were skipped because of constraint_exclusion<br />
}}<br />
<br />
{{TodoItemDone<br />
|Enable standard_conforming_strings by default<br />
|When this is done, backslash-quote should be prohibited in non-E<nowiki>''</nowiki> strings because of possible confusion over how such strings treat backslashes. Basically, <nowiki>''</nowiki> is always safe for a literal single quote, while \' might or might not be based on the backslash handling rules.}}<br />
<br />
{{TodoItem<br />
|Simplify dropping roles that have objects in several databases}}<br />
<br />
{{TodoItem<br />
|Allow the count returned by SELECT, etc. to be represented as an int64 to allow a higher range of values}}<br />
<br />
{{TodoItem<br />
|Add support for WITH RECURSIVE ... CYCLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00291.php <nowiki>WITH RECURSIVE ... CYCLE in vanilla SQL: issues with arrays of rows</nowiki>]}}<br />
<br />
{{TodoItem<br />
|Add DEFAULT .. AS OWNER so permission checks are done as the table owner<br />
|This would be useful for SERIAL nextval() calls and CHECK constraints.}}<br />
<br />
{{TodoItem<br />
|Allow DISTINCT to work in multiple-argument aggregate calls}}<br />
<br />
{{TodoItem<br />
|Add column to pg_stat_activity that shows the progress of long-running commands like CREATE INDEX and VACUUM<br />
* [http://archives.postgresql.org/pgsql-patches/2008-04/msg00203.php <nowiki>EXPLAIN progress info</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow INSERT/UPDATE/DELETE ... RETURNING in common table expressions<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00472.php <nowiki>Writeable CTEs and side effects</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add comments on system tables/columns using the information in catalogs.sgml<br />
|Ideally the information would be pulled from the SGML file automatically.}}<br />
<br />
{{TodoItem<br />
|Prevent the specification of conflicting transaction read/write options<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00684.php <nowiki>Re: SET TRANSACTION and SQL Standard</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support LATERAL subqueries<br />
|Lateral subqueries can reference columns of tables defined outside the subquery at the same level, i.e. ''laterally''.<br />
For example, a LATERAL subquery in a FROM clause could reference tables defined in the same FROM clause.<br />
Currently only the columns of tables defined ''above'' subqueries are recognized.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00292.php <nowiki>LATERAL</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00991.php <nowiki>Re: LATERAL</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add support for functional dependencies<br />
|This would allow omitting GROUP BY columns when grouping by the primary key.<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent temporary tables created with ON COMMIT DELETE ROWS from repeatedly truncating the table on every commit if the table is already empty<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg00842.php<br />
* http://archives.postgresql.org/pgsql-performance/2010-03/msg00392.php<br />
* http://archives.postgresql.org/pgsql-performance/2010-04/msg00046.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow DELETE and UPDATE to be used with LIMIT and ORDER BY<br />
* http://archives.postgresql.org/pgadmin-hackers/2010-04/msg00078.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg01997.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00021.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow finer control over the caching of prepared query plans<br />
|Currently anonymous (unnamed) queries prepared via the wire protocol are replanned every time bind parameters are supplied; allow SQL PREPARE to do the same. Also, allow control over replanning prepared queries, either manually or automatically, when statistics for execute parameters differ dramatically from those used during planning.<br />
* http://archives.postgresql.org/message-id/201002151911.o1FJBYh22763@momjian.us<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00597.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow PREPARE of cursors}}<br />
<br />
{{TodoItem<br />
|Have DISCARD PLANS discard plans cached by functions<br />
|DISCARD all should do the same.<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00431.php<br />
}}<br />
<br />
=== CREATE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow CREATE TABLE AS to determine column lengths for complex expressions like SELECT col1 || col2}}<br />
<br />
{{TodoItem<br />
|Have WITH CONSTRAINTS also create constraint indexes<br />
* [http://archives.postgresql.org/pgsql-patches/2007-04/msg00149.php <nowiki>Re: CREATE TABLE LIKE INCLUDING INDEXES support</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Move NOT NULL constraint information to pg_constraint<br />
|Currently NOT NULL constraints are stored in pg_attribute without any designation of their origins, e.g. primary keys. One manifest problem is that dropping a PRIMARY KEY constraint does not remove the NOT NULL constraint designation. Another issue is that we should probably force NOT NULL to be propagated from parent tables to children, just as CHECK constraints are. (But then does dropping PRIMARY KEY affect children?)<br />
* http://archives.postgresql.org/message-id/19768.1238680878@sss.pgh.pa.us<br />
* http://archives.postgresql.org/message-id/200909181005.n8IA5Ris061239@wwwmaster.postgresql.org<br />
}}<br />
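The dropped-PRIMARY-KEY problem can be reproduced with a minimal sketch (the table name is illustrative):<br />

```sql
CREATE TABLE t (id int PRIMARY KEY);
ALTER TABLE t DROP CONSTRAINT t_pkey;

-- The column remains marked NOT NULL even though the primary key is
-- gone, because pg_attribute.attnotnull records no origin information.
SELECT attnotnull
FROM pg_attribute
WHERE attrelid = 't'::regclass AND attname = 'id';
```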
<br />
{{TodoItem<br />
|Prevent concurrent CREATE TABLE from sometimes returning a cryptic error message<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00169.php <nowiki>BUG #3692: Conflicting create table statements throw unexpected error</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow CREATE TABLE to optionally create a table if it does not already exist, without throwing an error<br />
|The fact that tables contain data makes this more complex than other CREATE OR REPLACE operations.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg01300.php <nowiki>Add column if not exists (CINE)</nowiki>]<br />
}}<br />
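The implemented syntax looks like the following (the table definition is illustrative):<br />

```sql
-- Creates the table on first execution; if it already exists,
-- emits only a NOTICE rather than an error.
CREATE TABLE IF NOT EXISTS log_entries (
    id      serial,
    message text
);
```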
<br />
{{TodoItem<br />
|Add CREATE SCHEMA ... LIKE that copies a schema}}<br />
<br />
{{TodoItem<br />
|Fix CREATE OR REPLACE FUNCTION to not leave objects depending on the function in inconsistent state<br />
* [http://archives.postgresql.org/pgsql-general/2008-08/msg00985.php indexes on functions and create or replace function]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow temporary tables to exist as empty by default in all sessions<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00006.php <nowiki>what is difference between LOCAL and GLOBAL TEMP TABLES in PostgreSQL</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg01329.php <nowiki>idea: global temp tables</nowiki>]<br />
* [http://archives.postgresql.org//pgsql-hackers/2009-05/msg00016.php <nowiki>Re: idea: global temp tables</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg01098.php <nowiki>global temporary tables</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow the creation of "distinct" types<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01647.php <nowiki>Distinct types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider analyzing temporary tables when they are first used in a query<br />
|Autovacuum cannot analyze or vacuum temporary tables.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00416.php <nowiki>autovacuum and temp tables support</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow an unlogged table to be changed to logged<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00315.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== UPDATE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|<nowiki>Allow UPDATE tab SET ROW (col, ...) = (SELECT...)</nowiki><br />
* [http://archives.postgresql.org/pgsql-hackers/2006-07/msg01308.php <nowiki>Re: [PATCHES] extension for sql update</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00865.php <nowiki>UPDATE using sub selects</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-04/msg00315.php <nowiki>UPDATE using sub selects</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-03/msg00237.php <nowiki>Re: UPDATE using sub selects</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Research self-referential UPDATEs that see inconsistent row versions in read-committed mode<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00507.php <nowiki>Concurrently updating an updatable view</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00016.php <nowiki>Re: Do we need a TODO? (was Re: Concurrently updating anupdatable view)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve performance of EvalPlanQual mechanism that rechecks already-updated rows<br />
|This is related to the previous item, which questions whether the mechanism even has the right semantics.<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-09/msg00045.php <nowiki>BUG #4401: concurrent updates to a table blocks one update indefinitely</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-07/msg00302.php <nowiki>BUG #4945: Parallel update(s) gone wild</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ALTER ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have ALTER TABLE RENAME of a SERIAL column rename the sequence<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
}}<br />
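The current behavior can be seen with a minimal sketch (names are illustrative):<br />

```sql
CREATE TABLE items (item_id serial);

-- Renaming the column does not rename the backing sequence:
ALTER TABLE items RENAME COLUMN item_id TO product_id;

-- Still reports the old sequence name (items_item_id_seq),
-- derived from the original column name.
SELECT pg_get_serial_sequence('items', 'product_id');
```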
<br />
{{TodoItem<br />
|Have ALTER SEQUENCE RENAME rename the sequence name stored in the sequence table<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-09/msg00092.php <nowiki>BUG #3619: Renaming sequence does not update its 'sequence_name' field</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00007.php <nowiki>Re: BUG #3619: Renaming sequence does not update its 'sequence_name' field</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Allow ALTER TABLE ... ALTER CONSTRAINT ... RENAME or ALTER TABLE RENAME CONSTRAINT<br />
* [http://archives.postgresql.org/pgsql-patches/2006-02/msg00168.php <nowiki>ALTER CONSTRAINT RENAME patch reverted</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ALTER DOMAIN to modify the underlying data type}}<br />
<br />
{{TodoItemDone<br />
|Allow ALTER TABLE to change constraint deferrability}}<br />
<br />
{{TodoItemDone<br />
|Add missing object types for ALTER ... SET SCHEMA}}<br />
<br />
{{TodoItem<br />
|Allow ALTER TABLESPACE to move the tablespace to different directories}}<br />
<br />
{{TodoItem<br />
|Allow moving system tables to other tablespaces, where possible<br />
|Currently non-global system tables must be in the default database tablespace. Global system tables can never be moved.}}<br />
<br />
{{TodoItem<br />
|Have ALTER INDEX update the name of a constraint using that index}}<br />
<br />
{{TodoItem<br />
|Allow column display reordering by recording a display, storage, and permanent id for every column?<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00782.php <nowiki>Re: column ordering, was Re: [PATCHES] Enums patch v2</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg01029.php <nowiki>Column reordering in pg_dump</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow an existing index to be marked as a table's primary key<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00500.php <nowiki>Setting a pre-existing index as a primary key</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00642.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00265.php<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow ALTER TYPE on composite types to perform operations similar to ALTER TABLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00245.php <nowiki>ALTER composite type does not work, but ALTER TABLE which ROWTYPE is used as a type - works fine</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Don't require table rewrite on ALTER TABLE ... ALTER COLUMN TYPE, when the old and new data types are binary compatible<br />
* http://archives.postgresql.org/message-id/200903040137.n241bAUV035002@wwwmaster.postgresql.org<br />
* [http://archives.postgresql.org/pgsql-patches/2006-10/msg00154.php <nowiki>Eliminating phase 3 requirement for varlen increases via ALTER COLUMN</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg02360.php<br />
}}<br />
<br />
{{TodoItem|Allow deactivating (and reactivating) indexes via ALTER TABLE|{{messageLink|<87hbegz5ir.fsf@cbbrowne.afilias-int.info>|In discussion on FK activation/deactivation}} }}<br />
<br />
{{TodoItemDone<br />
|Reduce locking required for ALTER commands<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg00533.php <nowiki>ALTER TABLE SET STATISTICS requires AccessExclusiveLock</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01083.php <nowiki>Re: ALTER TABLE SET STATISTICS requires AccessExclusiveLock</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2008-10/msg01248.php<br />
* http://archives.postgresql.org/pgsql-hackers/2008-10/msg00242.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== CLUSTER ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Automatically maintain clustering on a table<br />
|This might require some background daemon to maintain clustering during periods of low usage. It might also require tables to be only partially filled for easier reorganization. Another idea would be to create a merged heap/index data file so an index lookup would automatically access the heap data too. A third idea would be to store heap rows in hashed groups, perhaps using a user-supplied hash function.<br />
* [http://archives.postgresql.org/pgsql-performance/2004-08/msg00350.php <nowiki>Equivalent praxis to CLUSTERED INDEX?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00155.php <nowiki>Re: Grouped Index Tuples</nowiki>]<br />
* http://community.enterprisedb.com/git/<br />
* [http://archives.postgresql.org/pgsql-performance/2009-10/msg00346.php <nowiki>Re: maintain_cluster_order_v5.patch</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Improve CLUSTER performance by sorting to reduce random I/O<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php <nowiki>Our CLUSTER implementation is pessimal</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Make CLUSTER VERBOSE more verbose<br />
|It is also used by the new VACUUM FULL VERBOSE.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== COPY ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow COPY to report error lines and continue<br />
|This requires the use of a savepoint before each COPY line is processed, with ROLLBACK on COPY failure. <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00572.php <nowiki>Re: VLDB Features</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY on a newly-created table to skip WAL logging<br />
|On crash recovery, the table involved in the COPY would be removed or have its heap and index files truncated. One issue is that no other backend should be able to add to the table at the same time, which is something that is currently allowed. This currently is done if the table is created inside the same transaction block as the COPY because no other backends can see the table.}}<br />
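The case where WAL logging can already be skipped today looks like the sketch below (file path and names are placeholders); this item asks to extend the optimization beyond it:<br />

```sql
BEGIN;
-- Because the table is created in the same transaction as the COPY,
-- no other backend can see it, so the COPY can bypass WAL logging
-- (when WAL archiving or streaming replication is not in use).
CREATE TABLE bulk_target (id int, payload text);
COPY bulk_target FROM '/tmp/data.csv' WITH (FORMAT csv);
COMMIT;
```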
<br />
{{TodoItem<br />
|Allow COPY FROM to create index entries in bulk<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00811.php <nowiki>Batch update of indexes on data loading</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY in CSV mode to control whether a quoted zero-length string is treated as NULL<br />
|Currently this is always treated as a zero-length string, which generates an error when loading into an integer column.<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00905.php <nowiki>Re: [PATCHES] allow CSV quote in NULL</nowiki>]<br />
}}<br />
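The current behavior can be reproduced with a sketch like this (the table name is illustrative):<br />

```sql
CREATE TABLE nums (n integer);

-- The CSV input below contains only a quoted zero-length string.
-- COPY always treats "" as an empty string rather than NULL, so the
-- load fails with an "invalid input syntax" error for the integer column.
COPY nums FROM STDIN WITH (FORMAT csv);
""
\.
```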
<br />
{{TodoItem<br />
|Improve COPY performance<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00954.php <nowiki>Re: 8.3 / 8.2.6 restore comparison</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01882.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY to report errors sooner<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01169.php <nowiki>Timely reporting of COPY errors</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY to handle other number formats<br />
|E.g. the German notation, which uses ',' as the decimal separator. Best would be something like WITH DECIMAL ','.<br />
}}<br />
<br />
{{TodoItem<br />
|Allow a stalled COPY to exit if the backend is terminated<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-04/msg00067.php <nowiki>Re: possible bug not in open items</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== GRANT/REVOKE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow SERIAL sequences to inherit permissions from the base table?}}<br />
<br />
{{TodoItem<br />
|Allow dropping of a role that has connection rights<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00736.php <nowiki>DROP ROLE dependency tracking ...</nowiki>]<br />
}}<br />
{{TodoEndSubsection}}<br />
<br />
=== DECLARE CURSOR ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Prevent DROP TABLE from dropping a table referenced by its own open cursor?}}<br />
<br />
{{TodoItem<br />
|Provide some guarantees about the behavior of cursors that invoke volatile functions<br />
* [http://archives.postgresql.org/message-id/20997.1244563664@sss.pgh.pa.us Re: Cursor with hold emits the same row more than once across commits in 8.3.7]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== INSERT ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow INSERT/UPDATE of the system-generated oid value for a row}}<br />
<br />
{{TodoItem<br />
|In rules, allow VALUES() to contain a mixture of 'old' and 'new' references}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== SHOW/SET ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM ANALYZE, and CLUSTER}}<br />
<br />
{{TodoItem<br />
|Rationalize the discrepancy between settings that are specified in bytes and SHOW output that returns object counts<br />
* [http://archives.postgresql.org/pgsql-docs/2008-07/msg00007.php <nowiki>Re: [ADMIN] shared_buffers and shmmax</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ANALYZE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage}}<br />
<br />
{{TodoItem<br />
|Have EXPLAIN ANALYZE report rows as floating-point numbers<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg01363.php <nowiki>explain analyze rows=%.0f</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00108.php <nowiki>Re: explain analyze rows=%.0f</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve how ANALYZE computes in-doubt tuples<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00771.php <nowiki>VACUUM/ANALYZE counting of in-doubt tuples</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Window Functions ===<br />
See {{messageLink|357.1230492361@sss.pgh.pa.us|TODO items for window functions}}.<br />
{{TodoSubsection}}<br />
{{TodoItem<br />
|Support creation of user-defined window functions<br />
|We have the ability to create new window functions written in C. Is it worth the effort to create an API that would let them be written in PL/pgSQL, etc.?}}<br />
<br />
{{TodoItem<br />
|Implement full support for window framing clauses<br />
|In addition to the already-implemented clauses described in the [http://developer.postgresql.org/pgdocs/postgres/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS latest documentation], these clauses are not implemented yet:<br />
* RANGE BETWEEN ... PRECEDING/FOLLOWING<br />
* EXCLUDE<br />
}}<br />
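A sketch of the distinction (the table and column are illustrative):<br />

```sql
-- Implemented: ROWS-based frames with PRECEDING/FOLLOWING offsets.
SELECT sum(x) OVER (ORDER BY x
                    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM t;

-- Not yet implemented: value-based RANGE offsets (and EXCLUDE).
SELECT sum(x) OVER (ORDER BY x
                    RANGE BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM t;
```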
<br />
{{TodoItem<br />
|Investigate tuplestore performance issues<br />
|The tuplestore_in_memory() thing is just a band-aid; we ought to solve it properly. tuplestore_advance seems like a weak spot as well.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00152.php <nowiki>tuplestore potential performance problem</nowiki>]<br />
}}<br />
<br />
{{TodoItem|Do we really need so much duplicated code between Agg and WindowAgg?}}<br />
<br />
{{TodoItem<br />
|Teach planner to evaluate multiple windows in the optimal order<br />
|Currently windows are always evaluated in the query-specified order.<br />
* http://archives.postgresql.org/message-id/3CDAD71E9D70417290FCF66F0178D1E1@amd64<br />
}}<br />
<br />
{{TodoItem<br />
|Implement DISTINCT clause in window aggregates<br />
|Some proprietary RDBMSs have implemented it already, so it helps with porting from those.}}<br />
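The requested syntax would look like this sketch (names are illustrative); PostgreSQL currently rejects it:<br />

```sql
-- Count each distinct department only once per region; DISTINCT
-- inside a window aggregate currently raises an error.
SELECT region,
       count(DISTINCT dept) OVER (PARTITION BY region)
FROM emp;
```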
<br />
{{TodoEndSubsection}}<br />
<br />
== Integrity Constraints ==<br />
=== Keys ===<br />
<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Improve deferrable unique constraints for cases with many conflicts<br />
|The current implementation fires a trigger for each potentially conflicting row. This might not scale well for an update that changes many key values at once.<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Referential Integrity ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add MATCH PARTIAL referential integrity}}<br />
<br />
{{TodoItem<br />
|Change foreign key constraint for array -&gt; element to mean element in array?<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-10/msg01814.php <nowiki>foreign keys for array/period contains relationships</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix problem when cascading referential triggers make changes on cascaded tables, seeing the tables in an intermediate state<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00174.php <nowiki>Re: [PATCHES] Work-in-progress referential action trigger timing</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Optimize referential integrity checks<br />
* [http://archives.postgresql.org/pgsql-performance/2005-10/msg00458.php <nowiki>Re: Effects of cascading references in foreign keys</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00744.php <nowiki>Can't ri_KeysEqual() consider two nulls as equal?</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Server-Side Languages ==<br />
<br />
{{TodoItem<br />
|Add support for polymorphic arguments and return types to languages other than PL/PgSQL}}<br />
<br />
{{TodoItem<br />
|Add support for OUT and INOUT parameters to languages other than PL/PgSQL}}<br />
<br />
{{TodoItem<br />
|Add more fine-grained specification of functions taking arbitrary data types<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00367.php <nowiki>RfD: more powerful &quot;any&quot; types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Implement stored procedures<br />
|This might involve the control of transaction state and the return of multiple result sets<br />
* [http://archives.postgresql.org/pgsql-general/2008-10/msg00454.php <nowiki>PL/pgSQL stored procedure returning multiple result sets (SELECTs)?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01375.php <nowiki>Proposal: real procedures again (8.4)</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00542.php<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg01149.php <nowiki>Gathering specs and discussion on feature (post 9.1)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow holdable cursors in SPI}}<br />
<br />
{{TodoItemEasy<br />
|Add SPI_gettypmod() to return a field's typemod from a TupleDesc<br />
* http://archives.postgresql.org/pgsql-hackers/2005-11/msg00250.php<br />
}}<br />
<br />
=== SQL-Language Functions ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow SQL-language functions to reference parameters by parameter name<br />
|Currently SQL-language functions can only refer to dollar parameters, e.g. $1<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01479.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01519.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00221.php<br />
}}<br />
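A minimal sketch of the limitation: even when a parameter is named, the function body must use positional references (names are illustrative):<br />

```sql
CREATE FUNCTION add_tax(price numeric, rate numeric) RETURNS numeric AS $$
    -- The body must say $1 and $2; being able to refer to "price"
    -- and "rate" here is what this item would allow.
    SELECT $1 * (1 + $2);
$$ LANGUAGE sql;
```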
<br />
{{TodoItem<br />
|Rethink query plan caching and timing of parse analysis within SQL-language functions<br />
|They should work more like PL/pgSQL functions do.<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-05/msg00078.php <nowiki>Re: BUG #6019: invalid cached plan on inherited table</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/pgSQL ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]}}<br />
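The desired declaration would look like this sketch (a table tab with column col is assumed):<br />

```sql
CREATE FUNCTION f() RETURNS void AS $$
DECLARE
    one_val   tab.col%TYPE;      -- works today
    many_vals tab.col%TYPE[];    -- desired by this item; currently rejected
BEGIN
    NULL;
END;
$$ LANGUAGE plpgsql;
```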
<br />
{{TodoItem<br />
|<nowiki>Allow listing of record column names, and access to record columns via variables, e.g. columns := r.(*), tval2 := r.(colname)</nowiki><br />
* [http://archives.postgresql.org/pgsql-patches/2005-07/msg00458.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-05/msg00302.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00031.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow row and record variables to be set to NULL constants, and allow NULL tests on such variables<br />
|Because a row is not scalar, do not allow assignment from NULL-valued scalars.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00070.php <nowiki>NULL and plpgsql rows</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider keeping separate cached copies when search_path changes<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg01009.php <nowiki>pl/pgsql Plan Invalidation and search_path</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of NULL row values vs. NULL rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg01758.php <nowiki>Null row vs. row of nulls in plpgsql</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg01973.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Perl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow regex operations in plperl using UTF8 characters in non-UTF8 encoded databases}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Python ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemDone<br />
|Add table function support}}<br />
<br />
{{TodoItemDone<br />
|Add tracebacks<br />
* [http://archives.postgresql.org/pgsql-patches/2006-02/msg00288.php <nowiki>Re: plpython tracebacks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Develop a trusted variant of PL/Python.}}<br />
<br />
{{TodoItem<br />
|Create a new restricted execution class that will allow passing function arguments in as locals. Passing them as globals means functions cannot be called recursively.<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-02/msg01468.php <nowiki>Re: pl/python do not delete function arguments</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve documentation}}<br />
<br />
{{TodoItem<br />
|Add a DB-API compliant interface on top of the SPI interface}}<br />
<br />
{{TodoItem<br />
|Improve behaviour of exception functions and types<br />
* http://archives.postgresql.org/pgsql-docs/2010-11/msg00022.php<br />
* http://archives.postgresql.org/pgsql-docs/2010-11/msg00031.php<br />
}}<br />
<br />
{{TodoItem<br />
|For functions returning a setof record with a composite type, cache the I/O functions for the composite type<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg02007.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Tcl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add table function support}}<br />
<br />
{{TodoItem<br />
|Check encoding validity of values passed back to Postgres in function returns, trigger tuple changes, and SPI calls.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Clients ==<br />
<br />
{{TodoItem<br />
|Add a function like pg_get_indexdef() that reports more detailed index information<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00166.php <nowiki>BUG #3829: Wrong index reporting from pgAdmin III (v1.8.0 rev 6766-6767)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Split out pg_resetxlog output into pre- and post-sections<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg02040.php<br />
}}<br />
<br />
=== pg_ctl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow pg_ctl to work properly with configuration files located outside the PGDATA directory<br />
|pg_ctl cannot read the pid file because it is located in the PGDATA directory, not the configuration directory. The solution is to allow pg_ctl to read and understand postgresql.conf to find the data_directory value.<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-10/msg00024.php <nowiki>BUG #5103: &quot;pg_ctl -w (re)start&quot; fails with custom unix_socket_directory</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Modify pg_ctl behavior and exit codes to make it easier to write an LSB conforming init script<br />
|It may be desirable to condition some of the changes on a command-line switch, to avoid breaking existing scripts. A Linux shell (sh) script is referenced which has been tested and seems to provide a high degree of conformance in multiple environments. Study of this script might suggest areas where pg_ctl could be modified to make writing an LSB conforming script easier; however, some aspects of that script would be unnecessary with other suggested changes to pg_ctl, and discussion on the lists did not reach consensus on support for all aspects of this script. Further discussion of particular changes is needed before beginning any work.<br />
* [[Lsb_conforming_init_script|LSB conforming init script]]<br />
These threads should be studied for other ideas on improvements:<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01390.php <nowiki>We should Axe /contrib/start-scripts</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01843.php <nowiki>Linux LSB init script</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00008.php <nowiki>Re: Linux LSB init script</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== psql ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have psql \ds show all sequences and their settings<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00916.php <nowiki>Re: TODO item: Have psql show current values for a sequence</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00401.php <nowiki>Quick patch: Display sequence owner</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have \d on a sequence indicate if the sequence is owned by a table}}<br />
<br />
{{TodoItem<br />
|Move psql backslash database information into the backend, use mnemonic commands?<br />
|This would allow non-psql clients to pull the same information out of the database as psql. <br />
* [http://archives.postgresql.org/pgsql-hackers/2004-01/msg00191.php <nowiki>Re: psql \d option list overloaded</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Make psql's \d commands more consistent in their handling of schemas<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00014.php <nowiki>Re: psql and schemas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consistently display privilege information for all objects in psql}}<br />
<br />
{{TodoItem<br />
|Add &quot;auto&quot; expanded mode that outputs in expanded format if &quot;wrapped&quot; mode can't wrap the output to the screen width<br />
|Consider using auto-expanded mode for backslash commands like \df+.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00417.php <nowiki>Re: psql wrapped format default for backslash-d commands</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg01638.php<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent tab completion of SET TRANSACTION from querying the database, since that query prevents the transaction isolation level from being set.<br />
|Currently SET &lt;tab&gt; causes a database lookup to check all supported session variables. This query causes problems because setting the transaction isolation level must be the first statement of a transaction.}}<br />
<br />
{{TodoItem<br />
|Add a \set variable to control whether \s displays line numbers<br />
|Another option is to add \# which lists line numbers, and allows command execution.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00255.php <nowiki>Re: psql possible TODO</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have \d show child tables that inherit from the specified parent}}<br />
<br />
{{TodoItem<br />
|Include the symbolic SQLSTATE name in verbose error reports<br />
* [http://archives.postgresql.org/pgsql-general/2007-09/msg00438.php <nowiki>Re: Checking is TSearch2 query is valid</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add prompt escape to display the client and server versions<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00310.php <nowiki>WIP patch for TODO Item: Add prompt escape to display the client and server versions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add option to wrap column values at whitespace boundaries, rather than chopping them at a fixed width.<br />
|Currently, &quot;wrapped&quot; format chops values into fixed widths. Perhaps the word wrapping could use the same algorithm documented in the W3C specification. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00404.php <nowiki>Re: psql wrapped format default for backslash-d commands</nowiki>]<br />
* http://www.w3.org/TR/CSS21/tables.html#auto-table-layout}}<br />
<br />
{{TodoItem<br />
|Support the ReST table output format<br />
|Details about the ReST format: http://docutils.sourceforge.net/rst.html#reference-documentation<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg01007.php <nowiki>Proposal: new border setting in psql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00518.php <nowiki>Re: Proposal: new border setting in psql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00609.php <nowiki>Re: Proposal: new border setting in psql</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add option to print advice for people familiar with other databases<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01845.php <nowiki>MySQL-ism help patch for psql</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Consider showing TOAST and index sizes in \dt+<br />
* [http://archives.postgresql.org/pgsql-general/2010-01/msg00912.php <nowiki>\dt+ sizes don't include TOAST data</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00485.php <nowiki>Re: psql \dt and table size</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow \dd to show constraint comments<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00436.php <nowiki>Re: More robust pg_hba.conf parsing/error logging</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-general/2009-09/msg00199.php <nowiki>comment on constraint</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ability to edit views with \ev<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00023.php <nowiki>Adding \ev view editor?</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add \dL to show languages<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-07/msg00915.php <nowiki>Re: [PATCH] Psql List Languages</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Distinguish between unique indexes and unique constraints in \d+<br />
* http://archives.postgresql.org/message-id/8780.1271187360@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Fix FETCH_COUNT to handle SELECT ... INTO and WITH queries<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01565.php<br />
* http://archives.postgresql.org/pgsql-bugs/2010-05/msg00192.php<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent psql from sending the remaining statements of a single-line multi-statement query after reconnecting<br />
* http://archives.postgresql.org/pgsql-bugs/2010-05/msg00159.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01283.php<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Add \i option to bring in the specified file as a quoted literal. This would be useful for creating functions, among other uses. Details still need to be worked out.<br />
* http://archives.postgresql.org/pgsql-bugs/2011-02/msg00016.php<br />
* http://archives.postgresql.org/pgsql-bugs/2011-02/msg00020.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having psql -c read .psqlrc, for consistency<br />
|psql -f already reads .psqlrc<br />
}}<br />
<br />
{{TodoItem<br />
|Allow processing of multiple -f (file) options<br />
}}<br />
<br />
{{TodoItem<br />
|Improve line drawing characters<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00386.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== pg_dump / pg_restore ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|<nowiki>Add full object name to the tag field; e.g. for operators we need '=(integer, integer)' instead of just '='.</nowiki>}}<br />
<br />
{{TodoItem<br />
|Add pg_dumpall custom format dumps?<br />
* [http://archives.postgresql.org/pgsql-general/2010-05/msg00509.php pg_dumpall custom format]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid using platform-dependent locale names in pg_dumpall output<br />
|Using native locale names puts roadblocks in the way of porting a dump to another platform. One possible solution is to get<br />
CREATE DATABASE to accept some agreed-on set of locale names and fix them up to meet the platform's requirements.<br />
* http://archives.postgresql.org/message-id/21396.1241716688@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Allow selection of individual object(s) of all types, not just tables}}<br />
<br />
{{TodoItem<br />
|In a selective dump, allow dumping of an object and all its dependencies}}<br />
<br />
{{TodoItem<br />
|Add options like pg_restore -l and -L to pg_dump}}<br />
<br />
{{TodoItem<br />
|Add support for multiple pg_restore -t options, like pg_dump<br />
|pg_restore's -t switch is less useful than pg_dump's in quite a few ways: no multiple switches, no pattern matching, no ability to pick up indexes and other dependent items for a selected table. It should be made to handle this switch just like pg_dump does.}}<br />
<br />
{{TodoItem<br />
|Stop dumping CASCADE on DROP TYPE commands in clean mode}}<br />
<br />
{{TodoItem<br />
|Allow pg_dump --clean to drop roles that own objects or have privileges<br />
|tgl says: if this is about pg_dumpall, it's done as of 8.4. If it's really about pg_dump, what does it mean? pg_dump has no business dropping roles.}}<br />
<br />
{{TodoItem<br />
|Remove unnecessary function pointer abstractions in pg_dump source code}}<br />
<br />
{{TodoItem<br />
|Allow pg_dump to utilize multiple CPUs and I/O channels by dumping multiple objects simultaneously<br />
|The difficulty with this is getting multiple dump processes to produce a single dump output file. It would also require several sessions to share the same snapshot.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php <nowiki>pg_dump additional options for performance</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00135.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00040.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg02454.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow pg_restore to load different parts of the COPY data for a single table simultaneously}}<br />
<br />
{{TodoItem<br />
|Remove support for dumping from pre-7.3 servers<br />
|In 7.3 and later, we can get accurate dependency information from the server. pg_dump still contains a lot of crufty code<br />
to deal with the lack of dependency info in older servers, but the value of maintaining that code is dwindling.}}<br />
<br />
{{TodoItem<br />
|Allow pre/data/post files when schema and data are dumped separately, for performance reasons<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php <nowiki>pg_dump additional options for performance</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00185.php <nowiki>Re: pg_dump additional options for performance</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00821.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00135.php<br />
}}<br />
<br />
{{TodoItem<br />
|Refactor handling of database attributes between pg_dump and pg_dumpall<br />
|Currently only pg_dumpall emits database attributes, such as ALTER DATABASE SET commands and database-level GRANTs.<br />
Many people wish that pg_dump would do that. One proposal is to let pg_dump issue such commands if the -C switch was used,<br />
but it's unclear whether that will satisfy the demand.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg01031.php <nowiki>ALTER DATABASE vs pg_dump</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-05/msg00010.php summary of the issues]<br />
}}<br />
<br />
{{TodoItem<br />
|Change pg_dump so that a comment on the dumped database is applied to the loaded database, even if the database has a different name.<br />
|This will require new backend syntax, perhaps COMMENT ON CURRENT DATABASE. This is related to the previous item.}}<br />
<br />
{{TodoItem<br />
|Allow parallel restore of tar dumps<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-02/msg01154.php <nowiki>Re: parallel restore</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow pg_dumpall to output restorable ALTER USER/DATABASE SET settings<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00916.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00394.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02359.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ecpg ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Docs<br />
|Document differences between ecpg and the SQL standard and information about the Informix-compatibility module.}}<br />
<br />
{{TodoItem<br />
|Solve cardinality &gt; 1 for input descriptors / variables?}}<br />
<br />
{{TodoItem<br />
|Add a semantic check level, e.g. check if a table really exists}}<br />
<br />
{{TodoItem<br />
|Fix handling of DB attributes that are arrays}}<br />
<br />
{{TodoItem<br />
|Fix nested C comments}}<br />
<br />
{{TodoItemEasy<br />
|sqlwarn[6] should be 'W' if a PRECISION or SCALE value is specified}}<br />
<br />
{{TodoItem<br />
|Make SET CONNECTION thread-aware, non-standard?}}<br />
<br />
{{TodoItem<br />
|Allow multidimensional arrays}}<br />
<br />
{{TodoItem<br />
|Implement COPY FROM STDIN}} <br />
<br />
{{TodoItem<br />
|Provide a way to specify size of a bytea parameter<br />
* [http://archives.postgresql.org/message-id/200906192131.n5JLVoMo044178@wwwmaster.postgresql.org <nowiki>BUG #4866: ECPG and BYTEA</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Fix small memory leaks in ecpg<br />
|Memory leaks in a short-running application like ecpg are not really a problem, but they make debugging more complicated}}<br />
<br />
{{TodoItem<br />
|Allow reuse of cursor name variables<br />
* [http://archives.postgresql.org/message-id/20100329113435.GA3430@feivel.credativ.lan <nowiki>Problems with variable cursorname in ecpg</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== libpq ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Prevent PQfnumber() from lowercasing unquoted column names<br />
|PQfnumber() should never have been doing lowercasing, but historically it has, so we need a way to prevent it}}<br />
<br />
{{TodoItem<br />
|Allow statement results to be automatically batched to the client<br />
|Currently all statement results are transferred to the libpq client before libpq makes the results available to the application. This feature would allow the application to make use of the first result rows while the rest are transferred, or held on the server waiting for them to be requested by libpq. One complexity is that a statement like SELECT 1/col could error out mid-way through the result set.}}<br />
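<br />
Until such batching exists, an application can approximate it with a server-side cursor, fetching the result in chunks while the server retains the remainder. This is only a sketch; ''big_table'' is a hypothetical table.<br />
<br />
```sql
BEGIN;
DECLARE c NO SCROLL CURSOR FOR SELECT * FROM big_table;
FETCH 1000 FROM c;   -- process this batch while the server holds the rest
FETCH 1000 FROM c;   -- repeat until FETCH returns no rows
CLOSE c;
COMMIT;
```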
<br />
{{TodoItem<br />
|Consider disallowing multiple queries in PQexec() as an additional barrier to SQL injection attacks<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg00184.php <nowiki>Re: InitPostgres and flatfiles question</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add PQexecf() that allows complex parameter substitution<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01803.php <nowiki>Last minute mini-proposal (I know, know) for PQexecf()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add SQLSTATE and severity to errors generated within libpq itself<br />
* [http://archives.postgresql.org/pgsql-interfaces/2007-11/msg00015.php <nowiki>v8.1: Error severity on libpq PGconn*</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01425.php<br />
}}<br />
<br />
{{TodoItemDone<br />
|Add code to detect client encoding and locale from the operating system environment<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg01040.php <nowiki>Determining client_encoding from client locale</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add support for interface/ipaddress binding to libpq<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01811.php <nowiki>SR/libpq - outbound interface/ipaddress binding</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Triggers ==<br />
<br />
{{TodoItem<br />
|Improve storage of deferred trigger queue<br />
|Right now all deferred trigger information is stored in backend memory. This could exhaust memory for very large trigger queues. This item involves dumping large queues into files, or doing some kind of join to process all the triggers, some bulk operation, or a bitmap. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00876.php <nowiki>Re: BUG #4204: COPY to table with FK has memory leak</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00464.php <nowiki>Scaling up deferred unique checks and the after trigger queue</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow triggers to be disabled in only the current session.<br />
|This is currently possible by starting a multi-statement transaction, modifying the system tables, performing the desired SQL, restoring the system tables, and committing the transaction. ALTER TABLE ... TRIGGER requires a table lock, so it is not ideal for this usage.}}<br />
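<br />
The workaround described above can be sketched as follows (''mytable'' is a hypothetical table; this requires superuser rights, and editing system catalogs directly is unsupported and risky):<br />
<br />
```sql
BEGIN;
UPDATE pg_trigger SET tgenabled = 'D'
  WHERE tgrelid = 'mytable'::regclass;  -- disable triggers on mytable
UPDATE mytable SET col = col;           -- the desired SQL, trigger-free
UPDATE pg_trigger SET tgenabled = 'O'
  WHERE tgrelid = 'mytable'::regclass;  -- restore the original state
COMMIT;
-- other sessions never observe the disabled state, because the disable
-- and re-enable commit together
```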
<br />
{{TodoItem<br />
|With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY<br />
|If the dump is known to be valid, allow foreign keys to be added without revalidating the data.}}<br />
<br />
{{TodoItem<br />
|Allow statement-level triggers to access modified rows}}<br />
<br />
{{TodoItem<br />
|When statement-level triggers are defined on a parent table, have them fire only on the parent table, and fire child table triggers only where appropriate<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg01883.php <nowiki>Statement-level triggers and inheritance</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow AFTER triggers on system tables<br />
|System tables are modified in many places in the backend without going through the executor, so triggers never fire on those changes. To complete this item, the functions that modify system tables will have to fire triggers.<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01665.php<br />
* http://wiki.postgresql.org/wiki/DDL_Triggers<br />
}}<br />
<br />
{{TodoItem<br />
|Tighten trigger permission checks<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00564.php <nowiki>Security leak with trigger functions?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow BEFORE INSERT triggers on views<br />
* [http://archives.postgresql.org/pgsql-general/2007-02/msg01466.php <nowiki>Re: Why can't I put a BEFORE EACH ROW trigger on a view?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add database and transaction-level triggers<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00451.php <nowiki>Proposal for db level triggers</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00620.php <nowiki>triggers on prepare, commit, rollback... ?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce locking requirements for creating a trigger<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00635.php <nowiki>Re: Change lock requirements for adding a trigger</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid requirement for "AFTER" trigger functions to return a value<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02384.php<br />
}}<br />
<br />
== Inheritance ==<br />
<br />
{{TodoItem<br />
|Allow inherited tables to inherit indexes, UNIQUE constraints, and primary/foreign keys<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-05/msg00285.php <nowiki>Partitioning/inherited tables vs FKs</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00039.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00305.php<br />
}}<br />
<br />
{{TodoItem<br />
|Honor UNIQUE INDEX on base column in INSERTs/UPDATEs on inherited table, e.g. INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail<br />
|The main difficulty with this item is the problem of creating an index that can span multiple tables.}}<br />
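<br />
A minimal illustration of the gap (hypothetical tables):<br />
<br />
```sql
CREATE TABLE parent (id integer UNIQUE);
CREATE TABLE child () INHERITS (parent);

INSERT INTO parent VALUES (1);
INSERT INTO child VALUES (1);  -- accepted: the unique index covers only parent
SELECT id FROM parent;         -- the "unique" column now shows 1 twice
```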
<br />
{{TodoItem<br />
|Determine whether ALTER TABLE / SET SCHEMA should work on inheritance hierarchies (and thus support ONLY). If yes, implement it.}}<br />
<br />
{{TodoItem<br />
|ALTER TABLE variants sometimes support recursion and sometimes do not, but this is poorly documented or not documented at all, and the ONLY marker is then silently ignored. Clarify the documentation, and reject ONLY where it is not supported.}}<br />
<br />
== Indexes ==<br />
<br />
{{TodoItem<br />
|Prevent index uniqueness checks when UPDATE does not modify the column<br />
|Uniqueness (index) checks are done when updating a column even if the column is not modified by the UPDATE.<br />
However, HOT already short-circuits this in common cases, so more work might not be helpful.}}<br />
<br />
{{TodoItem<br />
|Allow the creation of on-disk bitmap indexes which can be quickly combined with other bitmap indexes<br />
|Such indexes could be more compact if there are only a few distinct values. They can also be compressed. Keeping such indexes updated can be costly.<br />
* [http://archives.postgresql.org/pgsql-patches/2005-07/msg00512.php <nowiki>Re: Bitmap index AM</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg01107.php <nowiki>Bitmap index thoughts</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00265.php <nowiki>Stream bitmaps</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01214.php <nowiki>Re: Bitmapscan changes - Requesting further feedback</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00013.php <nowiki>Updated bitmap index patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00741.php <nowiki>Reviewing new index types (was Re: [PATCHES] Updated bitmap indexpatch)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01023.php <nowiki>Bitmap Indexes: request for feedback</nowiki>]<br />
* http://archives.postgresql.org/message-id/800923.27831.qm@web29010.mail.ird.yahoo.com <br />
}}<br />
<br />
{{TodoItem<br />
|Allow accurate statistics to be collected on indexes with more than one column or expression indexes, perhaps using per-index statistics<br />
* [http://archives.postgresql.org/pgsql-performance/2006-10/msg00222.php <nowiki>Re: Simple join optimized badly?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01131.php <nowiki>Stats for multi-column indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00741.php <nowiki>Cross-column statistics revisited</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg01431.php <nowiki>Multi-Dimensional Histograms</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00913.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg02179.php <br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00459.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg02054.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg01731.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having a larger statistics target for indexed columns and expression indexes. <br />
}}<br />
<br />
{{TodoItem<br />
|Consider smaller indexes that record a range of values per heap page, rather than having one index entry for every heap row<br />
|This is useful if the heap is clustered by the indexed values. <br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00341.php <nowiki>Grouped Index Tuples</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg01264.php <nowiki>Grouped Index Tuples</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00465.php <nowiki>Grouped Index Tuples / Clustered Indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-03/msg00163.php <nowiki>Bitmapscan changes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00014.php <nowiki>Re: GIT patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00487.php <nowiki>Re: Index Tuple Compression Approach?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01589.php <nowiki>Re: Index AM change proposals, redux</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add REINDEX CONCURRENTLY, like CREATE INDEX CONCURRENTLY<br />
|This is difficult because you must upgrade to an exclusive table lock to replace the existing index file. CREATE INDEX CONCURRENTLY does not have this complication. This would allow index compaction without downtime. <br />
* [http://archives.postgresql.org/pgsql-performance/2007-08/msg00289.php <nowiki>Re: When/if to Reindex</nowiki>]<br />
}}<br />
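<br />
In the meantime, an ordinary index (one not backing a constraint) can be rebuilt with minimal locking by hand. A sketch, with hypothetical names ''idx_old'', ''idx_new'', ''t'', and ''col'':<br />
<br />
```sql
CREATE INDEX CONCURRENTLY idx_new ON t (col);  -- build without blocking writes
DROP INDEX idx_old;                            -- brief exclusive lock on t
ALTER INDEX idx_new RENAME TO idx_old;
```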
<br />
{{TodoItem<br />
|Allow multiple indexes to be created concurrently, ideally via a single heap scan<br />
|pg_restore allows parallel index builds, but it is done via subprocesses, and there is no SQL interface for this.<br />
}}<br />
<br />
{{TodoItem<br />
|Consider sorting entries before inserting into btree index<br />
* [http://archives.postgresql.org/pgsql-general/2008-01/msg01010.php <nowiki>Re: ATTN: Clodaldo was Performance problem. Could it be related to 8.3-beta4?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow index scans to return matching index keys, not just the matching heap locations<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01657.php <nowiki>Re: Is this TODO item done?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01477.php <nowiki>Index-only quals</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow creation of an index that can do comparisons to test if a value is between two column values<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00757.php <nowiki>Proposal: temporal extension &quot;period&quot; data type</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider using "effective_io_concurrency" for index scans<br />
* Currently only bitmap scans use this, which might be fine because most multi-row index scans use bitmap scans.<br />
}}<br />
<br />
=== GIST ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add more GIST index support for geometric data types}}<br />
<br />
{{TodoItem<br />
|Allow GIST indexes to create certain complex index types, like digital trees (see Aoki)}}<br />
<br />
{{TodoItem<br />
|Fix performance issues in contrib/seg and contrib/cube GiST support<br />
* [http://archives.postgresql.org/message-id/alpine.DEB.2.00.0904161633160.4053@aragorn.flymine.org GiST index performance]<br />
* [http://archives.postgresql.org/message-id/alpine.DEB.2.00.0904221704470.22330@aragorn.flymine.org draft patch]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-05/msg00069.php <nowiki>Re: GiST index performance</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-06/msg00068.php <nowiki>GiST index performance</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== GIN ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemDone<br />
|Support empty indexed values (such as zero-element arrays) properly<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00237.php contrib/intarray vs empty arrays]<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-05/msg00118.php BUG #4806: Bug with GiST index and empty integer array]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Behave correctly for cases where some elements of an indexed value are NULL<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg01003.php <nowiki>GIN versus zero-key queries</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Support queries that require a full scan<br />
* [http://archives.postgresql.org/pgsql-general/2009-05/msg00402.php Issue report]<br />
* [http://archives.postgresql.org/pgsql-general/2007-06/msg01132.php Older issue report]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-10/msg00521.php Still another complaint]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg01581.php Previous partial fix]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Improve GIN's handling of NULL array values<br />
* http://archives.postgresql.org/pgsql-bugs/2010-12/msg00032.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Hash ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add UNIQUE capability to hash indexes}}<br />
<br />
{{TodoItem<br />
|Add hash WAL logging for crash recovery}}<br />
<br />
{{TodoItem<br />
|Allow multi-column hash indexes}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Sorting ==<br />
<br />
{{TodoItem<br />
|Consider whether duplicate keys should be sorted by block/offset<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00558.php <nowiki>Remove hacks for old bad qsort() implementations?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider being smarter about memory and external files used during sorts<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg01101.php <nowiki>Sorting Improvements for 8.4</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00045.php <nowiki>Re: Sorting Improvements for 8.4</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider detoasting keys before sorting}}<br />
<br />
{{TodoItem<br />
|Allow sorts to use more available memory<br />
* http://archives.postgresql.org/pgsql-hackers/2007-11/msg01026.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg01123.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg01957.php<br />
}}<br />
<br />
== Fsync ==<br />
<br />
{{TodoItem<br />
|Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options and whether fsync does anything<br />
|Ideally this requires a separate test program like contrib/pg_test_fsync that can be run at initdb time or optionally later.}}<br />
<br />
{{TodoItem<br />
|Consider sorting writes during checkpoint<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00541.php <nowiki>Sorted writes in checkpoint</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00050.php <nowiki>Re: Sorting writes during checkpoint</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg02012.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg00278.php<br />
}}<br />
<br />
== Cache Usage ==<br />
<br />
{{TodoItem<br />
|Speed up COUNT(*)<br />
|We could use a fixed row count and a +/- count to follow MVCC visibility rules, or a single cached value could be used and invalidated if anyone modifies the table. Another idea is to get a count directly from a unique index, but for this to be faster than a sequential scan it must avoid access to the heap to obtain tuple visibility information.}}<br />
<br />
{{TodoItem<br />
|Provide a way to calculate an &quot;estimated COUNT(*)&quot;<br />
|Perhaps by using the optimizer's cardinality estimates or random sampling.<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php <nowiki>Re: Improving count(*)</nowiki>]<br />
}}<br />
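<br />
One commonly used estimate reads the planner's row count from the catalog. A sketch (''mytable'' is hypothetical, and the value is only as fresh as the table's last VACUUM or ANALYZE):<br />
<br />
```sql
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'mytable';
```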
<br />
{{TodoItem<br />
|Allow data to be pulled directly from indexes<br />
|Currently indexes do not have enough tuple visibility information to allow data to be pulled from the index without also accessing the heap. The idea is to use the visibility map used for vacuum to avoid heap lookups on pages where all tuples are visible.<br />
* [http://wiki.postgresql.org/wiki/Index-only_scans Index-Only Scans wiki]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider automatic caching of statements at various levels:<br />
* Parsed query tree<br />
* Query execute plan<br />
* Query results <br />
<br />
:<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00823.php <nowiki>Cached Query Plans (was: global prepared statements)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider increasing internal areas (NUM_CLOG_BUFFERS) when shared buffers is increased<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-10/msg01419.php <nowiki>Re: slru.c race condition (was Re: TRAP: FailedAssertion(&quot;!((itemid)-&gt;lp_flags &amp; 0x01)&quot;,)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00030.php <nowiki>clog_buffers to 64 in 8.3?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2007-08/msg00024.php <nowiki>CLOG Patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider decreasing the amount of memory used by PrivateRefCount<br />
|<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg00797.php <nowiki>PrivateRefCount (for 8.3)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg00752.php <nowiki>Re: PrivateRefCount (for 8.3)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider allowing higher priority queries to have referenced buffer cache pages stay in memory longer<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00562.php <nowiki>Re: How to keep a table in memory?</nowiki>]<br />
}}<br />
<br />
== Vacuum ==<br />
<br />
{{TodoItem<br />
|Auto-fill the free space map by scanning the buffer cache or by checking pages written by the background writer<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-02/msg01125.php <nowiki>Dead Space Map</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00011.php <nowiki>Re: Automatic free space map filling</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow concurrent inserts to use recently created pages rather than creating new ones<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg00853.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having single-page pruning update the visibility map<br />
* <nowiki>https://commitfest.postgresql.org/action/patch_view?id=75</nowiki><br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg02344.php <nowiki>Re: visibility maps and heap_prune</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve tracking of total relation tuple counts now that vacuum doesn't always scan the whole heap<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00531.php Partial vacuum versus pg_class.reltuples]<br />
}}<br />
<br />
{{TodoItem<br />
|Bias FSM towards returning free space near the beginning of the heap file, in hopes that empty pages at the end can be truncated by VACUUM<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg01124.php <nowiki>FSM search modes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a more compact data representation for dead tuple locations within VACUUM<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00143.php <nowiki>Re: Have vacuum emit a warning when it runs out of maintenance_work_mem</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide more information in order to improve user-side estimates of dead space bloat in relations<br />
* [http://archives.postgresql.org/pgsql-general/2009-05/msg01039.php <nowiki>Re: Bloated Table</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve locking behaviour of vacuum during trailing page truncation<br />
* http://archives.postgresql.org/pgsql-bugs/2011-03/msg00319.php<br />
* http://archives.postgresql.org/message-id/4D8DF88E.7080205@Yahoo.com<br />
}}<br />
<br />
=== Auto-vacuum ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|Issue log message to suggest VACUUM FULL if a table is nearly empty?}}<br />
<br />
{{TodoItem<br />
|Prevent long-lived temporary tables from causing frozen-xid advancement starvation<br />
|The problem is that autovacuum cannot vacuum them to set frozen xids; only the session that created them can do that. <br />
* [http://archives.postgresql.org/pgsql-general/2007-06/msg01645.php <nowiki>Re: AutoVacuum Behaviour Question</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent autovacuum from running if an old transaction is still running from the last vacuum<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00899.php <nowiki>Re: Autovacuum and OldestXmin</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have autoanalyze of parent tables occur when child tables are modified<br />
* http://archives.postgresql.org/pgsql-performance/2010-06/msg00137.php<br />
* http://archives.postgresql.org/pgsql-performance/2010-10/msg00271.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Locking ==<br />
<br />
{{TodoItem<br />
|Fix priority ordering of read and write light-weight locks<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00893.php <nowiki>lwlocks and starvation</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00905.php <nowiki>Re: lwlocks and starvation</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix problem when multiple subtransactions of the same outer transaction hold different types of locks, and one subtransaction aborts<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg01011.php <nowiki>FOR SHARE vs FOR UPDATE locks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00001.php <nowiki>Re: FOR SHARE vs FOR UPDATE locks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00435.php <nowiki>Re: [PATCHES] [pgsql-patches] Phantom Command IDs, updated patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00773.php <nowiki>Re: savepoints and upgrading locks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow UPDATEs that modify only non-referential-integrity columns to avoid conflicting with referential integrity locks<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00073.php <nowiki>Referential Integrity and SHARE locks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add idle_in_transaction_timeout GUC so locks are not held for long periods of time}}<br />
<br />
{{TodoItem<br />
|Improve deadlock detection when a page-cleaning lock conflicts with a shared buffer that is pinned<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-01/msg00138.php <nowiki>BUG #3883: Autovacuum deadlock with truncate?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00873.php <nowiki>Thoughts about bug #3883</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-committers/2008-01/msg00365.php <nowiki>Re: pgsql: Add checks to TRUNCATE, CLUSTER, and REINDEX to prevent</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Detect deadlocks involving LockBufferForCleanup()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00873.php <nowiki>Thoughts about bug #3883</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow finer control over who is cancelled in a deadlock<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01727.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a lock timeout parameter<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00485.php <nowiki>SELECT ... FOR UPDATE [WAIT integer | NOWAIT] for 8.5</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Consider improving serialized transaction behavior to avoid anomalies<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00217.php <nowiki>Serializable Isolation without blocking</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg01136.php <nowiki>User-facing aspects of serializable transactions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00035.php <nowiki>Re: User-facing aspects of serializable transactions</nowiki>]<br />
}}<br />
<br />
== Startup Time Improvements ==<br />
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for backend creation<br />
|This would prevent the overhead associated with process creation. Most operating systems have trivial process creation time compared to database startup overhead, but a few operating systems (Win32, Solaris) might benefit from threading. Also explore the idea of a single session using multiple threads to execute a statement faster.}}<br />
<br />
{{TodoItem<br />
|Allow backends to change their database without restart<br />
|This allows for faster server startup.<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00843.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00336.php<br />
}}<br />
<br />
== Write-Ahead Log ==<br />
<br />
{{TodoItem<br />
|Eliminate need to write full pages to WAL before page modification<br />
|Currently, to protect against partial disk page writes, we write full page images to WAL before they are modified so we can correct any partial page writes during recovery. These pages can also be eliminated from point-in-time archive files. <br />
* [http://archives.postgresql.org/pgsql-hackers/2002-06/msg00655.php <nowiki>Re: Index Scans become Seq Scans after VACUUM ANALYSE</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|When full page writes are off, write CRC to WAL and check file system blocks on recovery<br />
|If CRC check fails during recovery, remember the page in case a later CRC for that page properly matches.}}<br />
<br />
{{TodoItem<br />
|Write full pages during file system write and not when the page is modified in the buffer cache<br />
|This allows most full page writes to happen in the background writer. It might cause problems for applying WAL on recovery into a partially-written page, but later the full page will be replaced from WAL.}}<br />
<br />
{{TodoItem<br />
|Reduce WAL traffic so only modified values are written rather than entire rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01589.php <nowiki>Reduction in WAL for UPDATEs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow WAL information to recover corrupted pg_controldata<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php <nowiki>Re: [HACKERS] pg_resetxlog -r flag</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Find a way to reduce rotational delay when repeatedly writing last WAL page<br />
|Currently, fsync of WAL requires the disk platter to complete a full rotation before the next fsync can complete. One idea is to write the WAL to different offsets, which might reduce the rotational delay. <br />
* [http://archives.postgresql.org/pgsql-hackers/2002-11/msg00483.php <nowiki>500 tpsQL + WAL log implementation</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow WAL logging to be turned off for a table, but the table might be dropped or truncated during crash recovery<br />
|Allow tables to bypass WAL writes and just fsync() dirty pages on commit. This should be implemented using ALTER TABLE, e.g. <nowiki>ALTER TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ]</nowiki>. Tables using non-default logging should not use referential integrity with default-logging tables. A table without dirty buffers during a crash could perhaps avoid the drop/truncate. <br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg01016.php <nowiki>Re: [Bizgres-general] WAL bypass for INSERT, UPDATE and</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Speed WAL recovery by allowing more than one page to be prefetched<br />
|This should be done utilizing the same infrastructure used for prefetching in general to avoid introducing complex error-prone code in WAL replay. <br />
* [http://archives.postgresql.org/pgsql-general/2007-12/msg00683.php <nowiki>Slow PITR restore</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00497.php <nowiki>Re: [GENERAL] Slow PITR restore</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg01279.php <nowiki>Read-ahead and parallelism in redo recovery</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve WAL concurrency by increasing lock granularity<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00556.php <nowiki>Reworking WAL locking</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Be more aggressive about creating WAL files<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01325.php <nowiki>Re: PANIC caused by open_sync on Linux</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-07/msg01075.php <nowiki>PreallocXlogFiles</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-04/msg00556.php <nowiki>WAL/PITR additional items</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have resource managers report the duration of their status changes<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01468.php <nowiki>Recovery of Multi-stage WAL actions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Move pgfoundry's xlogdump to /contrib and have it rely more closely on the WAL backend code<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00035.php <nowiki>xlogdump</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Close deleted WAL files held open in *nix by long-lived read-only backends<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg01754.php <nowiki>Deleted WAL files held open by backends in Linux</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00060.php <nowiki>Re: Deleted WAL files held open by backends in Linux</nowiki>]<br />
}}<br />
<br />
== Optimizer / Executor ==<br />
<br />
{{TodoItem<br />
|Improve selectivity functions for geometric operators}}<br />
<br />
{{TodoItem<br />
|Consider increasing the default values of from_collapse_limit, join_collapse_limit, and/or geqo_threshold<br />
* [http://archives.postgresql.org/message-id/4136ffa0905210551u22eeb31bn5655dbe7c9a3aed5@mail.gmail.com from_collapse_limit vs. geqo_threshold]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve ability to display optimizer analysis using OPTIMIZER_DEBUG}}<br />
<br />
{{TodoItem<br />
|Log statements where the optimizer row estimates were dramatically different from the number of rows actually found?}}<br />
<br />
{{TodoItem<br />
|Consider compressed annealing to search for query plans<br />
|This might replace GEQO.<br />
* http://archives.postgresql.org/message-id/15658.1241278636%40sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Improve use of expression indexes for ORDER BY <br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01553.php <nowiki>Resjunk sort columns, Heikki's index-only quals patch, and bug #5000</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Modify the planner to better estimate caching effects<br />
* http://archives.postgresql.org/pgsql-performance/2010-11/msg00117.php<br />
}}<br />
<br />
=== Hashing ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Consider using a hash for joining to a large IN (VALUES ...) list<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00450.php <nowiki>Planning large IN lists</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow single batch hash joins to preserve outer pathkeys<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg00806.php Re: Potential Join Performance Issue]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|"lazy" hash tables - look up only the tuples that are actually requested<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid building the same hash table more than once during the same query<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid hashing for distinct and then re-hashing for hash join<br />
* [http://archives.postgresql.org/message-id/4136ffa0902191346g62081081v8607f0b92c206f0a@mail.gmail.com Re: Fixing Grittner's planner issues]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow hashing to be used on arrays, if the element type is hashable<br />
* http://archives.postgresql.org/message-id/11087.1244905821@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Background Writer ==<br />
<br />
{{TodoItem<br />
|Consider having the background writer update the transaction status hint bits before writing out the page<br />
|Implementing this requires the background writer to have access to system catalogs and the transaction status log.}}<br />
<br />
{{TodoItem<br />
|Consider adding buffers the background writer finds reusable to the free list <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00781.php <nowiki>Background LRU Writer/free list</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Automatically tune bgwriter_delay based on activity rather than using a fixed interval<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00781.php <nowiki>Background LRU Writer/free list</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider whether increasing BM_MAX_USAGE_COUNT improves performance<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg01007.php <nowiki>Bgwriter LRU cleaning: we've been going at this all wrong</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Test to see if calling PreallocXlogFiles() from the background writer will help with WAL segment creation latency<br />
* [http://archives.postgresql.org/pgsql-patches/2007-06/msg00340.php <nowiki>Re: Load Distributed Checkpoints, final patch</nowiki>]<br />
}}<br />
<br />
== Concurrent Use of Resources ==<br />
<br />
{{TodoItem<br />
|Do async I/O for faster random read-ahead of data<br />
|Async I/O allows multiple I/O requests to be sent to the disk with results coming back asynchronously.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00820.php <nowiki>Asynchronous I/O Support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2007-09/msg00255.php <nowiki>Re: random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00027.php <nowiki>There's random access and then there's random access</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-01/msg00170.php <nowiki>Bitmap index scan preread using posix_fadvise (Was: There's random access and then there's random access)</nowiki>]<br />
The above patch is already applied as of 8.4, but it still remains to figure out how to handle plain indexscans effectively.<br />
* [http://archives.postgresql.org//pgsql-hackers/2009-01/msg00806.php Problems with the patch submitted for posix_fadvise in index scans]<br />
}}<br />
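<br />
As the last links above note, the 8.4-era preread work used posix_fadvise() hints rather than true asynchronous I/O. A minimal sketch of that kind of read-ahead hint (the file path used below is an arbitrary example, not anything from the backend):<br />
<br />
```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hint the kernel to start reading a byte range in the background.
 * len == 0 means "to the end of the file".  Returns 0 on success. */
static int prefetch_range(const char *path, off_t offset, off_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* POSIX_FADV_WILLNEED kicks off asynchronous read-ahead; the caller
     * continues immediately and later reads hit the page cache. */
    int rc = posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED);
    close(fd);
    return rc;
}
```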
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for better I/O utilization<br />
|This would allow a single query to make use of multiple I/O channels simultaneously. One idea is to create a background reader that can pre-fetch sequential and index scan pages needed by other backends. This could be expanded to allow concurrent reads from multiple devices in a partitioned table.<br />
* http://archives.postgresql.org/pgsql-performance/2011-02/msg00123.php<br />
}}<br />
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for better CPU utilization<br />
|This would allow several CPUs to be used for a single query, such as for sorting or query execution.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00945.php <nowiki>Multi CPU Queries - Feedback and/or suggestions wanted!</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|SMP scalability improvements<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00439.php <nowiki>Straightforward changes for increased SMP scalability</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00206.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00361.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
}}<br />
<br />
== TOAST ==<br />
<br />
{{TodoItem<br />
|Allow user configuration of TOAST thresholds<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00213.php <nowiki>Re: Proposed adjustments in MaxTupleSize and toastthresholds</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00082.php <nowiki>pg_lzcompress strategy parameters</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce unnecessary cases of deTOASTing<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00895.php <nowiki>Re: [PATCHES] Eliminate more detoast copies for packed varlenas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce costs of repeat de-TOASTing of values<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg01096.php <nowiki>WIP patch: reducing overhead for repeat de-TOASTing</nowiki>]<br />
}}<br />
<br />
== Miscellaneous Performance ==<br />
<br />
{{TodoItem<br />
|Use mmap() rather than SYSV for shared buffers?<br />
|This would remove the requirement for SYSV SHM but would introduce portability issues. Anonymous mmap (or mmap to /dev/zero) is required to prevent I/O overhead. We could also consider mmap() for writing WAL.<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00750.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00756.php<br />
}}<br />
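<br />
For illustration only (this is not backend code): the anonymous mmap the item calls for can be obtained on POSIX systems roughly like this:<br />
<br />
```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Sketch: allocate an anonymous shared region, as an alternative to SysV
 * shared memory.  MAP_SHARED | MAP_ANONYMOUS gives memory with no backing
 * file (so no file I/O overhead, the concern mentioned above) and the
 * mapping is inherited across fork(), which child backends would rely on. */
static void *alloc_shared_region(size_t bytes)
{
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}
```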
<br />
{{TodoItem<br />
|Rather than mmap()-ing in 8k pages, consider mmap()-ing entire files into a backend?<br />
|Doing I/O to large tables would consume a lot of address space or require frequent mapping/unmapping. Extending the file also causes mapping problems that might require mapping only individual pages, leading to thousands of mappings. Another problem is that there is no way to ''prevent'' I/O to disk from the dirty shared buffers, so changes could hit disk before the WAL is written.<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01239.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider ways of storing rows more compactly on disk:<br />
* Reduce the row header size?<br />
* Consider reducing on-disk varlena length from four bytes to two because a heap row cannot be more than 64k in length}}<br />
<br />
{{TodoItem<br />
|Consider transaction start/end performance improvements<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00948.php <nowiki>Reducing Transaction Start/End Contention</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00361.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow configuration of backend priorities via the operating system<br />
|Though backend priorities make priority inversion during lock waits possible, research shows that this is not a huge problem.<br />
* [http://archives.postgresql.org/pgsql-general/2007-02/msg00493.php <nowiki>Priorities for users or queries?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider increasing the minimum allowed number of shared buffers<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-02/msg00157.php <nowiki>Re: [PATCH] Don't bail with legitimate -N/-B options</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider if CommandCounterIncrement() can avoid its AcceptInvalidationMessages() call<br />
* [http://archives.postgresql.org/pgsql-committers/2007-11/msg00585.php <nowiki>pgsql: Avoid incrementing the CommandCounter when</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider Cartesian joins when both relations are needed to form an indexscan qualification for a third relation<br />
* [http://archives.postgresql.org/pgsql-performance/2007-12/msg00090.php <nowiki>Re: TB-sized databases</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider not storing a NULL bitmap on disk if all the NULLs are trailing<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00624.php <nowiki>Proposal for Null Bitmap Optimization(for Trailing NULLs)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-12/msg00109.php <nowiki>Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Sort large UPDATE/DELETEs so they are performed in heap order<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg01119.php <nowiki>Possible future performance improvement: sort updates/deletes by ctid</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow one transaction to see tuples using the snapshot of another transaction<br />
|This would assist multiple backends in working together. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00400.php <nowiki>Transaction Snapshot Cloning</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00135.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00260.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00466.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider decreasing the I/O caused by updating tuple hint bits<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00847.php <nowiki>Hint Bits and Write I/O</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00199.php <nowiki>Re: [HACKERS] Hint Bits and Write I/O</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00695.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00792.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg01063.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01408.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01453.php<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid the requirement of freezing pages that are infrequently modified <br />
|If all rows on a page are visible, it is possible to set a bit in the visibility map (once the visibility map is 100% reliable) and not need to freeze the page, avoiding a page rewrite<br />
* http://archives.postgresql.org/message-id/4BF701CF.2090205@agliodbs.com<br />
* http://archives.postgresql.org/pgsql-hackers/2010-06/msg00082.php<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid reading in b-tree pages when replaying vacuum records in hot standby mode<br />
* [http://archives.postgresql.org/message-id/1272571938.4161.14739.camel@ebony <nowiki>Hot Standby tuning for btree_xlog_vacuum()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Restructure truncation logic to be more resistant to failure<br />
|This also involves not writing dirty buffers for a truncated or dropped relation<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01032.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider adding logic to increase large tables by more than 8k<br />
|This would reduce file system fragmentation<br />
* http://archives.postgresql.org/pgsql-bugs/2011-03/msg00337.php<br />
}}<br />
<br />
== Miscellaneous Other ==<br />
<br />
{{TodoItem<br />
|Deal with encoding issues for filenames in the server filesystem<br />
* {{MessageLink|20090413184335.39BE.52131E4D@oss.ntt.co.jp|a proposed patch here}}<br />
* {{MessageLink|8484.1244655656@sss.pgh.pa.us|some issues about it here}}<br />
* {{MessageLink|20100107103740.97A5.52131E4D@oss.ntt.co.jp|Windows-specific patch here}}<br />
}}<br />
<br />
{{TodoItem<br />
|Deal with encoding issues in the output of localeconv()<br />
* [http://archives.postgresql.org/message-id/40c6d9160904210658y590377cfw6dbbecb53d2b8be0@mail.gmail.com bug report]<br />
* [http://archives.postgresql.org/message-id/49EF8DA0.90008@tpf.co.jp draft patch]<br />
* [http://archives.postgresql.org/message-id/21710.1243620986@sss.pgh.pa.us review of patch]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide schema name and other fields available from SQL GET DIAGNOSTICS in error reports<br />
* [http://archives.postgresql.org/message-id/dcc563d10810211907n3c59a920ia9eb7cd2a6d5ea58@mail.gmail.com <nowiki>How to get schema name which violates fk constraint</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg00846.php <nowiki>patch - Report the schema along table name in a referential failure error message</nowiki>]<br />
* {{MessageLink|3191.1263306359@sss.pgh.pa.us|Re: NOT NULL violation and error-message}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg00213.php <nowiki>the case for machine-readable error fields</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
| Provide [http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html#LIBPQ-CONNECT-FALLBACK-APPLICATION-NAME fallback_application_name] in contrib/pgbench, oid2name, and dblink.<br />
* {{MessageLink|w2g9837222c1004070216u3bc46b3ahbddfdffdbfb46212@mail.gmail.com|fallback_application_name and pgbench}}<br />
}}<br />
<br />
{{TodoItem<br />
|Add 64-bit support to /contrib/pgbench<br />
* http://archives.postgresql.org/pgsql-hackers/2010-07/msg00153.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg00705.php<br />
}}<br />
<br />
== Source Code ==<br />
<br />
{{TodoItem<br />
|Add use of 'const' for variables in source tree}}<br />
<br />
{{TodoItemEasy<br />
|Remove warnings created by -Wcast-align}}<br />
<br />
{{TodoItem<br />
|Move platform-specific ps status display info from ps_status.c to ports}}<br />
<br />
{{TodoItem<br />
|Add optional CRC checksum to heap and index pages<br />
|One difficulty is how to prevent hint bit changes from affecting the computed CRC checksum.<br />
* http://archives.postgresql.org/message-id/19934.1226601952%40sss.pgh.pa.us<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00002.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01028.php <nowiki>double-buffering page writes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00524.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01101.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00011.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00249.php<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a faster CRC32 algorithm<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01112.php<br />
}}<br />
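<br />
For reference, the baseline that faster table-driven or hardware-assisted variants would be measured against is the classic bit-at-a-time reflected CRC-32 (an illustrative implementation, not the backend's):<br />
<br />
```c
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC-32 (polynomial 0xEDB88320), one bit at a time.  Table-driven
 * "slice-by-N" and CPU CRC instructions compute the same function faster. */
static uint32_t crc32_calc(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--)
    {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ 0xFFFFFFFFu;
}
```
Any faster algorithm can be validated against the standard check value: the CRC-32 of the ASCII string "123456789" is 0xCBF43926.<br />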
<br />
{{TodoItem<br />
|Allow cross-compiling by generating the zic database on the target system}}<br />
<br />
{{TodoItem<br />
|Improve NLS maintenance of libpgport messages linked onto applications}}<br />
<br />
{{TodoItemDone<br />
|Improve the module installation experience (/contrib, etc)<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00132.php <nowiki>modules</nowiki>]<br />
* {{messageLink|ca33c0a30807231640n6fb4035dod8121a18aa1fa29c@mail.gmail.com|Re: PostgreSQL extensions packaging}}<br />
* {{messageLink|ca33c0a30804061349s41b4d8fcsa9c579454b27ecd2@mail.gmail.com|Database owner installable modules patch}}<br />
* [http://archives.postgresql.org//pgsql-hackers/2009-03/msg00855.php <nowiki>Re: contrib function naming, and upgrade issues</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00912.php <nowiki>search_path vs extensions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Use UTF8 encoding for NLS messages so all server encodings can read them properly}}<br />
<br />
{{TodoItem<br />
|Allow creation of universal binaries for Darwin<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00884.php <nowiki>Getting to universal binaries for Darwin</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider GnuTLS if OpenSSL license becomes a problem<br />
* http://archives.postgresql.org/pgsql-hackers/2011-02/msg00892.php<br />
* [http://archives.postgresql.org/pgsql-patches/2006-05/msg00040.php <nowiki>[PATCH] Add support for GnuTLS</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg01213.php <nowiki>TODO: GNU TLS</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider making NAMEDATALEN more configurable in future releases}}<br />
<br />
{{TodoItem<br />
|Research use of signals and sleep wake ups<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00003.php <nowiki>Restartable signals 'n all that</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow C++ code to more easily access backend code<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00302.php <nowiki>Mostly Harmless: Welcoming our C++ friends</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider simplifying how memory context resets handle child contexts<br />
* [http://archives.postgresql.org/pgsql-patches/2007-08/msg00067.php <nowiki>Re: Memory leak in nodeAgg</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Create three versions of libpgport to simplify client code<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00154.php <nowiki>8.4 TODO item: make src/port support libpq and ecpg directly</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve detection of shared memory segments being used by others by checking the SysV shared memory field 'nattch'<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00656.php <nowiki>postgresql in FreeBSD jails: proposal</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00673.php <nowiki>Re: postgresql in FreeBSD jails: proposal</nowiki>]<br />
}}<br />
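<br />
The 'nattch' field the proposal refers to is readable through shmctl(); a hedged sketch of the check (error handling trimmed):<br />
<br />
```c
#include <sys/ipc.h>
#include <sys/shm.h>

/* Sketch of the proposed check: report how many processes are attached to a
 * SysV shared memory segment.  A postmaster could decline to remove or reuse
 * a segment whose shm_nattch is still non-zero, instead of relying only on
 * the PID in postmaster.pid.  Returns -1 on error. */
static long segment_nattch(int shmid)
{
    struct shmid_ds buf;

    if (shmctl(shmid, IPC_STAT, &buf) < 0)
        return -1;
    return (long) buf.shm_nattch;
}
```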
<br />
{{TodoItem<br />
|Consider using POSIX shared memory to avoid System V shared memory kernel limits<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg00558.php<br />
}}<br />
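<br />
POSIX shared memory sidesteps the SHMMAX/SHMALL kernel limits the item mentions; a minimal sketch of creating a sized segment (the segment name below is an arbitrary example):<br />
<br />
```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sketch: create (or open) a named POSIX shared memory object and size it.
 * Unlike SysV segments, the size is bounded by available memory rather than
 * kernel SHMMAX/SHMALL settings.  Returns a file descriptor, or -1. */
static int create_posix_segment(const char *name, size_t bytes)
{
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);

    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) bytes) < 0)
    {
        close(fd);
        shm_unlink(name);
        return -1;
    }
    return fd;
}
```
The descriptor can then be mmap()-ed with MAP_SHARED; one open question in the thread is how to get the SysV-style interlock against orphaned backends, which POSIX segments do not provide directly.<br />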
<br />
{{TodoItem<br />
|Implement the non-threaded Avahi service discovery protocol<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00939.php <nowiki>Re: [PATCHES] Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-02/msg00097.php <nowiki>Re: Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg01211.php <nowiki>Re: [PATCHES] Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-04/msg00001.php <nowiki>Re: [HACKERS] Avahi support for Postgresql</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce data row alignment requirements on some 64-bit systems<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00369.php <nowiki>[WIP] Reduce alignment requirements on 64-bit systems.</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Restructure TOAST internal storage format for greater flexibility<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00049.php <nowiki>Re: PG_PAGE_LAYOUT_VERSION 5 - time for change</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Add regression tests for pg_dump/restore<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01967.php <nowiki>"make install-check-pg_dump" target in src/regress</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Research different memory allocation methods for lists<br />
* http://archives.postgresql.org/pgsql-hackers/2011-04/msg01467.php <br />
}}<br />
<br />
=== /contrib/pg_upgrade ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemDone<br />
|Remove copy_dir() code, or use it<br />
}}<br />
<br />
{{TodoItem<br />
|Handle large object comments<br />
|This is difficult to do because the large object doesn't exist when --schema-only is loaded.<br />
}}<br />
<br />
{{TodoItem<br />
|Consider using pg_depend for checking object usage in version.c<br />
}}<br />
<br />
{{TodoItem<br />
|If reindex is necessary, allow it to be done in parallel with pg_dump custom format<br />
}}<br />
<br />
{{TodoItem<br />
|Migrate pg_statistic by dumping it out as a flat file, so analyze is not necessary<br />
|pg_class.oid is not preserved so schema.tablename must be used.<br />
}}<br />
<br />
{{TodoItem<br />
|Improve testing, perhaps using the buildfarm<br />
|The buildfarm has access to multiple versions of PostgreSQL.<br />
}}<br />
<br />
{{TodoItem<br />
|Create machine-readable output of pg_controldata<br />
|This would avoid parsing its output. The problem is we need pg_controldata output from both the old and new clusters so we would need to support both formats.<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Windows ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Remove configure.in check for link failure when cause is found}}<br />
<br />
{{TodoItem<br />
|Remove readdir() errno patch when runtime/mingwex/dirent.c rev 1.4 is released}}<br />
<br />
{{TodoItem<br />
|Allow psql to use readline once non-US code pages work with backslashes}}<br />
<br />
{{TodoItem<br />
|Fix problem with shared memory on the Win32 Terminal Server}}<br />
<br />
{{TodoItem<br />
|Improve signal handling<br />
* [http://archives.postgresql.org/pgsql-patches/2005-06/msg00027.php <nowiki>Simplify Win32 Signaling code</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Convert MSVC build system to remove most batch files<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00961.php <nowiki>MSVC build system</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support pgxs when using MSVC}}<br />
<br />
{{TodoItem<br />
|Fix MSVC NLS support, like for to_char()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00485.php <nowiki>NLS on MSVC strikes back!</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-02/msg00038.php <nowiki>Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Find a correct rint() substitute on Windows<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00808.php <nowiki>Minor bug in src/port/rint.c</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix global namespace issues when using multiple terminal server sessions<br />
* [http://archives.postgresql.org/message-id/48F3BFCC.8030107@dunslane.net problems with Windows global namespace]}}<br />
<br />
{{TodoItem<br />
|Change from the current autoconf/gmake build system to cmake<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01869.php <nowiki>About CMake (was Re: [COMMITTERS] pgsql: Append major version number and for libraries soname major)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve consistency of path separator usage<br />
* http://archives.postgresql.org/message-id/49C0BDC5.4010002@hagander.net<br />
}}<br />
<br />
{{TodoItem<br />
|Fix cross-compiling on Windows<br />
* http://archives.postgresql.org/pgsql-bugs/2010-10/msg00110.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow multiple Postgres clusters running on the same machine to distinguish themselves in the event log<br />
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01297.php<br />
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00574.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Wire Protocol Changes ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow dynamic character set handling}}<br />
<br />
{{TodoItem<br />
|Add decoded type, length, precision}}<br />
<br />
{{TodoItem<br />
|Mark result columns as known-not-null when possible<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01029.php <nowiki>Adding nullable indicator to Describe</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide more control over planner treatment of statements being prepared}}<br />
<br />
{{TodoItem<br />
|Use compression?}}<br />
<br />
{{TodoItem<br />
|Update clients to use data types, typmod, schema.table.column names of result sets using new statement protocol}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Documentation ==<br />
<br />
{{TodoItem<br />
|Convert single quotes to apostrophes in the PDF documentation<br />
* [http://archives.postgresql.org/pgsql-docs/2007-12/msg00059.php <nowiki>SGML docs and pdf single-quotes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide a manpage for postgresql.conf<br />
* {{messageLink|20080819194311.GH4428@alvh.no-ip.org|A smaller default postgresql.conf}}<br />
* {{messageLink|200808211910.37524.peter_e@gmx.net|A smaller default postgresql.conf}}<br />
}}<br />
<br />
{{TodoItem<br />
|Change the manpage-generating toolchain to use the new XML-based docbook2x tools<br />
* {{messageLink|200808211910.37524.peter_e@gmx.net|A smaller default postgresql.conf}}<br />
}}<br />
<br />
{{TodoItem<br />
|Consider changing documentation format from SGML to XML<br />
* [http://archives.postgresql.org/pgsql-docs/2006-12/msg00152.php <nowiki>Re: Authoring Tools WAS: Switching to XML</nowiki>]<br />
* http://archives.postgresql.org/pgsql-docs/2011-04/msg00020.php<br />
* http://wiki.postgresql.org/wiki/Switching_PostgreSQL_documentation_from_SGML_to_XML<br />
}}<br />
<br />
{{TodoItem<br />
|Document support for N<nowiki>' '</nowiki> national character string literals, if it matches the SQL standard<br />
* http://archives.postgresql.org/message-id/1275895438.1849.1.camel@fsopti579.F-Secure.com<br />
}}<br />
<br />
{{TodoItem<br />
|Add diagrams to the documentation<br />
* http://archives.postgresql.org/pgsql-docs/2010-07/msg00001.php<br />
}}<br />
<br />
== Exotic Features ==<br />
<br />
{{TodoItem<br />
|Add pre-parsing phase that converts non-ISO syntax to supported syntax<br />
|This could allow SQL written for other databases to run without modification.}}<br />
<br />
{{TodoItem<br />
|Allow plug-in modules to emulate features from other databases}}<br />
<br />
{{TodoItem<br />
|Add features of Oracle-style packages<br />
|A package would be a schema with session-local variables, public/private functions, and initialization functions. It is also possible to implement these capabilities in any schema and not use a separate &quot;packages&quot; syntax at all.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg00384.php <nowiki>proposal for PL packages for 8.3.</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider allowing control of upper/lower case folding of unquoted identifiers<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php <nowiki>Bringing PostgreSQL torwards the standard regarding case folding</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg01527.php <nowiki>Re: [SQL] Case Preservation disregarding case sensitivity?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00849.php <nowiki>TODO Item: Consider allowing control of upper/lower case folding of unquoted, identifiers</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00415.php <nowiki>Identifier case folding notes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add autonomous transactions<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php <nowiki>autonomous transactions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Give query progress indication<br />
* [[Query progress indication]]<br />
}}<br />
<br />
{{TodoItem<br />
|Rethink our type system<br />
* [[Rethinking datatypes]]<br />
}}<br />
<br />
== Features We Do ''Not'' Want ==<br />
<br />
The following features have been discussed ad nauseam on the PostgreSQL mailing lists, and the consensus has been that the project is not interested in them. As such, if you are going to bring them up as potential features, you should be familiar with all of the arguments against these features which have been made over the years. If you decide to work on such features anyway, be aware that you face a higher-than-normal barrier to getting the project to accept them.<br />
<br />
{{TodoItem<br />
|All backends running as threads in a single process (not wanted)<br />
|This eliminates the process protection we get from the current setup. Thread creation is usually the same overhead as process creation on modern systems, so it seems unwise to use a pure threaded model, and MySQL and DB2 have demonstrated that threads introduce as many issues as they solve. Threading specific operations such as I/O, seq scans, and connection management has been discussed and will probably be implemented to enable specific performance features. Moving to a threaded engine would also require halting all other work on PostgreSQL for one to two years.}}<br />
<br />
{{TodoItem<br />
|"Oracle-style" optimizer hints (not wanted)<br />
|Optimizer hints, as implemented in Oracle and other RDBMSes, are used to work around problems in the optimizer and introduce upgrade and maintenance issues. We would rather have such problems reported and fixed. We have discussed a more sophisticated system of per-class cost adjustment instead, but a specification remains to be developed. See [[OptimizerHintsDiscussion|Optimizer Hints Discussion]] for further information.}}<br />
<br />
{{TodoItem<br />
|Embedded server (not wanted)<br />
|While PostgreSQL clients run fine in limited-resource environments, the server requires multiple processes and a stable pool of resources to run reliably and efficiently. Stripping down the PostgreSQL server to run in the same process address space as the client application would add too much complexity and too many failure cases. Besides, several very mature embedded SQL databases are already available.}}<br />
<br />
{{TodoItem<br />
|Obfuscated function source code (not wanted)<br />
|Obfuscating function source code has minimal protective benefits because anyone with super-user access can find a way to view the code. At the same time, it would greatly complicate backups and other administrative tasks. To prevent non-super-users from viewing function source code, remove SELECT permission on pg_proc.<br />
* [http://archives.postgresql.org/pgsql-general/2008-09/msg00668.php <nowiki>Obfuscated stored procedures (was Re: Oracle and Postgresql)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Indeterminate behavior for the GROUP BY clause (not wanted)<br />
|At least one other database product allows a query to omit from the GROUP BY clause a subset of the result columns which GROUP BY would need in order to provide predictable results; the server is free to return any value from the group. This is not viewed as a desirable feature. PostgreSQL 9.1 will allow result columns that are not referenced by GROUP BY if a primary key for the same table is referenced in GROUP BY.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-03/msg00297.php <nowiki>Re: SQL compatibility reminder: MySQL vs PostgreSQL</nowiki>]<br />
}}<br />
<br />
</div><br />
<br />
[[Category:Todo]]</div>
<hr />
<div>{{Languages}}<br />
<br />
== Getting Involved ==<br />
<br />
=== How do I get involved in PostgreSQL development? ===<br />
<br />
Download the code and have a look around. See [[#How_do_I_download.2Fupdate_the_current_source_tree.3F|downloading the source tree]].<br />
<br />
Subscribe to and read the [http://archives.postgresql.org/pgsql-hackers/ pgsql-hackers mailing list] (often termed "hackers"). This is where the major contributors and core members of the project discuss development.<br />
<br />
=== How do I download/update the current source tree? ===<br />
<br />
There are several ways to obtain the source tree. Occasional developers can just get the most recent source tree snapshot from ftp://ftp.postgresql.org/pub/snapshot/.<br />
<br />
Regular developers might want to take advantage of anonymous access to our source code management system. The source tree is currently hosted in git. For details of how to obtain the source from git see http://developer.postgresql.org/pgdocs/postgres/git.html and [[Working with Git]].<br />
<br />
=== What development environment is required to develop code? ===<br />
<br />
PostgreSQL is developed mostly in the C programming language. The source code is targeted at most of the popular Unix platforms and the Windows environment (Windows 2000, XP, and later).<br />
<br />
Most developers run a Unix-like operating system and use an open source tool chain with [http://gcc.gnu.org GCC], [http://www.gnu.org/software/make/make.html GNU Make], [http://www.gnu.org/software/gdb/gdb.html GDB], [http://www.gnu.org/software/autoconf/ Autoconf], and so on. If you have contributed to open source software before, you will probably be familiar with these tools. Developers using this tool chain on Windows make use of [http://www.mingw.org/ MinGW], though most development on Windows is currently done with the Microsoft Visual Studio 2005 (version 8) development environment and associated tools.<br />
<br />
The complete list of required software to build PostgreSQL can be found in the [http://developer.postgresql.org/pgdocs/postgres/install-requirements.html installation instructions].<br />
<br />
Developers who regularly rebuild the source often pass the --enable-depend flag to configure. The result is that if you modify a C header file, all files that depend upon it are also rebuilt.<br />
<br />
src/Makefile.custom can be used to set environment variables, like CUSTOM_COPT, that are used for every compile.<br />
<br />
=== What areas need work? ===<br />
Outstanding features are detailed in [[Todo]].<br />
<br />
You can learn more about these features by consulting the [http://archives.postgresql.org/ archives], the SQL standards and the recommended texts (see [[#What_books_are_good_for_developers.3F|books for developers]]).<br />
<br />
=== How do I get involved in PostgreSQL web site development? ===<br />
<br />
PostgreSQL website development is discussed on the [http://archives.postgresql.org/pgsql-www/ pgsql-www mailing list]. There is a project page where the source code is available at http://pgweb.postgresql.org/.<br />
<br />
== Development Tools and Help ==<br />
<br />
=== How is the source code organized? ===<br />
<br />
If you point your browser at [http://www.postgresql.org/developer/ext.backend.html How PostgreSQL Processes a Query] (also available in a checkout of the source code, under src/tools/backend/index.html), you will see a few paragraphs describing the data flow, the backend components in a flow chart, and a description of the shared memory area. You can click on any flowchart box to see a description. If you then click on the directory name, you will be taken to the source directory, where you can browse the actual source code behind it. We also have several README files in some source directories to describe the function of the module; the browser will also display these when you enter the directory.<br />
<br />
Other than documentation in the source tree itself, you can find some papers/presentations discussing the code at http://www.postgresql.org/developer/coding. An excellent presentation is at http://neilconway.org/talks/hacking/<br />
<br />
=== What tools are available for developers? ===<br />
<br />
First, all the files in the src/tools directory are designed for developers.<br />
<br />
RELEASE_CHANGES changes we have to make for each release<br />
backend description/flowchart of the backend directories<br />
ccsym find standard defines made by your compiler<br />
copyright fixes copyright notices<br />
<br />
entab converts spaces to tabs, used by pgindent<br />
find_static finds functions that could be made static<br />
find_typedef finds typedefs in the source code<br />
find_badmacros finds macros that use braces incorrectly<br />
fsync a script to provide information about the cost of cache<br />
syncing system calls<br />
make_ctags make vi 'tags' file in each directory<br />
make_diff make *.orig and diffs of source<br />
make_etags make emacs 'etags' files<br />
make_keywords make comparison of our keywords and SQL'92<br />
make_mkid make mkid ID files<br />
git_changelog used to generate a list of changes for each release<br />
pginclude scripts for adding/removing include files<br />
pgindent indents source files<br />
pgtest a semi-automated build system<br />
thread a thread testing script<br />
<br />
In src/include/catalog:<br />
<br />
unused_oids a script that finds unused OIDs for use in system catalogs<br />
duplicate_oids finds duplicate OIDs in system catalog definitions<br />
<br />
tools/backend was already described in the question-and-answer above.<br />
<br />
Second, you really should have an editor that can handle tags, so you can tag a function call to see the function definition, and then tag inside that function to see an even lower-level function, and then back out twice to return to the original function. Most editors support this via tags or etags files.<br />
<br />
Third, you need to get id-utils from ftp://ftp.gnu.org/gnu/id-utils/<br />
<br />
By running tools/make_mkid, an archive of source symbols can be created that can be rapidly queried.<br />
<br />
Some developers make use of cscope, which can be found at http://cscope.sf.net/. Others use glimpse, which can be found at http://webglimpse.net/.<br />
<br />
tools/make_diff has tools to create patch diff files that can be applied to the distribution. This produces context diffs, which is our preferred format.<br />
<br />
pgindent is used to fix the source code style to conform to our standards, and is normally run at the end of each development cycle; see [[#What.27s_the_formatting_style_used_in_PostgreSQL_source_code.3F|this question]] for more information on our style.<br />
<br />
pginclude contains scripts used to add needed #include's to include files, and to remove unneeded #include's.<br />
<br />
When adding built-in objects such as types or functions, you will need to assign OIDs to them. Our convention is that all hand-assigned OIDs are distinct values in the range 1-9999. (It would work mechanically for them to be unique within individual system catalogs, but for clarity we require them to be unique across the whole system.) There is a script called unused_oids in src/include/catalog that shows the currently unused OIDs. To assign a new OID, pick one that is free according to unused_oids, and for bonus points pick one that is nearby to related existing objects. See also the duplicate_oids script, which will complain if you made a mistake.<br />
<br />
=== What's the formatting style used in PostgreSQL source code? ===<br />
<br />
Our standard format is BSD style, with each level of code indented one tab, where each tab is four spaces. You will need to set your editor or file viewer to display tabs as four spaces:<br />
<br />
For '''vi''' (in <code>.exrc</code> or <code>.vimrc</code>):<br />
set tabstop=4 shiftwidth=4 noexpandtab<br />
<br />
For '''less''' or '''more''', specify <code>-x4</code> to get the correct indentation.<br />
<br />
The tools/editors directory of the latest sources contains sample settings that can be used with the emacs, xemacs and vim editors to assist in keeping to PostgreSQL coding standards.<br />
<br />
pgindent formats code by passing flags to your operating system's indent utility. It is run on all source files just before each beta test period, auto-formatting them to make them consistent. Comment blocks that need specific line breaks should be formatted as block comments, where the comment starts as /*------. These comments will not be reformatted in any way.<br />
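To illustrate the two conventions described above (a hypothetical fragment, not taken from the PostgreSQL sources): the block comment keeps its hand-made alignment because it opens with the dash marker, while the code body is indented one tab per level.<br />

```c
/*------
 * pgindent leaves the layout of this comment alone because it
 * opens with the dash marker, so hand-aligned material survives:
 *
 *		input		result
 *		-----		------
 *		(2, 3)		5
 *------
 */
int
add_demo(int a, int b)
{
	return a + b;
}
```

An ordinary /* ... */ comment, by contrast, may be refilled and re-wrapped when pgindent runs.<br />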
<br />
See also [http://developer.postgresql.org/pgdocs/postgres/source-format.html the Formatting section] in the documentation. [http://archives.postgresql.org/message-id/1221125165.5637.12.camel@abbas-laptop This posting] talks about our naming of variable and function names.<br />
<br />
If you're wondering why we bother with this, [http://ezine.daemonnews.org/200112/single_coding_style.html this article] describes the value of a consistent coding style.<br />
<br />
=== Is there a diagram of the system catalogs available? ===<br />
<br />
Yes, we have [http://dalibo.org/_media/articles/catalog.png at least one] ([http://svn.postgresql.fr/repos/materials/advocacy/trunk/posters/catalogs83.svg SVG version]).<br />
<br />
=== What books are good for developers? ===<br />
<br />
There are five good books:<br />
<br />
* An Introduction to Database Systems, by C.J. Date, Addison-Wesley<br />
* A Guide to the SQL Standard, by C.J. Date et al., Addison-Wesley<br />
* Fundamentals of Database Systems, by Elmasri and Navathe<br />
* Transaction Processing, by Jim Gray and Andreas Reuter, Morgan Kaufmann<br />
* Transactional Information Systems, by Gerhard Weikum and Gottfried Vossen, Morgan Kaufmann<br />
<br />
=== What is configure all about? ===<br />
<br />
The files configure and configure.in are part of the GNU autoconf package. Configure allows us to test for various capabilities of the OS, and to set variables that can then be tested in C programs and Makefiles. Autoconf is installed on the PostgreSQL main server. To add options to configure, edit configure.in, and then run autoconf to generate configure.<br />
<br />
When configure is run by the user, it tests various OS capabilities, stores those in config.status and config.cache, and modifies a list of *.in files. For example, if there exists a Makefile.in, configure generates a Makefile that contains substitutions for all @var@ parameters found by configure.<br />
<br />
When you need to edit files, make sure you don't waste time modifying files generated by configure. Edit the *.in file, and re-run configure to recreate the needed file. If you run make distclean from the top-level source directory, all files derived by configure are removed, so you see only the files contained in the source distribution.<br />
<br />
=== How do I add a new port? ===<br />
<br />
There are a variety of places that need to be modified to add a new port. First, start in the src/template directory. Add an appropriate entry for your OS. Also, use src/config.guess to add your OS to src/template/.similar. You shouldn't match the OS version exactly. The configure test will look for an exact OS version number, and if not found, find a match without version number. Edit src/configure.in to add your new OS. (See configure item above.) You will need to run autoconf, or patch src/configure too.<br />
<br />
Then, check src/include/port and add your new OS file, with appropriate values. Hopefully, there is already locking code in src/include/storage/s_lock.h for your CPU. There is also a src/makefiles directory for port-specific Makefile handling. There is a backend/port directory if you need special files for your OS.<br />
<br />
=== Why don't you use threads, raw devices, async-I/O, <insert your favorite whiz-bang feature here>? ===<br />
<br />
There is always a temptation to use the newest operating system features as soon as they arrive. We resist that temptation.<br />
<br />
First, we support 15+ operating systems, so any new feature has to be well established before we will consider it. Second, most new whiz-bang features don't provide dramatic improvements. Third, they usually have some downside, such as decreased reliability or additional code required. Therefore, we don't rush to use new features but rather wait for the feature to be established, then ask for testing to show that a measurable improvement is possible.<br />
<br />
As an example, threads are not currently used instead of multiple processes for backends because:<br />
<br />
* Historically, threads were poorly supported and buggy.<br />
* An error in one backend can corrupt other backends if they're threads within a single process.<br />
* Speed improvements using threads are small compared to the remaining backend startup time.<br />
* The backend code would be more complex.<br />
* Terminating backend processes allows the OS to cleanly and quickly free all resources, protecting against memory and file descriptor leaks and making backend shutdown cheaper and faster.<br />
* Debugging threaded programs is much harder than debugging worker processes, and core dumps are much less useful.<br />
* Sharing of read-only executable mappings and the use of shared_buffers means processes, like threads, are very memory efficient.<br />
* Regular creation and destruction of processes helps protect against memory fragmentation, which can be hard to manage in long-running processes.<br />
<br />
(Whether individual backend processes should use multiple threads to make use of multiple cores for single queries is a separate question not covered here).<br />
<br />
So, we are not ignorant of new features. It is just that we are cautious about their adoption. The TODO list often contains links to discussions showing our reasoning in these areas.<br />
<br />
==== Why aren't there more compression options when dumping tables? ====<br />
<br />
pg_dump's built-in compression method is gzip.<br />
The primary alternative, bzip2, is normally far too slow to be useful when dumping large tables.<br />
<br />
The two main alternatives regularly proposed for better built-in compression at good speeds are LZO and LZMA/LZMA2/XZ. LZO is released under the GPL, which is incompatible with the PostgreSQL license. The LZMA2 code has been released into the public domain, but the C port is a secondary one (C++ is the main development focus) whose code quality hasn't seemed appropriate for this project. And this whole area has traditionally been filled with patent issues that go beyond just the restrictions of the software license.<br />
<br />
Another limitation on changing this is that pg_dump output is intended to be archivable, so we had better be prepared to support compression methods for a very long time. The "latest and greatest" compression method is exactly what we *don't* want.<br />
<br />
See the [http://archives.postgresql.org/pgsql-hackers/2009-02/msg00352.php archives] for an idea what characteristics an alternate compression tool would need to have in order to be considered for use in core PostgreSQL.<br />
<br />
=== How are branches managed? ===<br />
<br />
See [[Working_with_Git#Using_Back_Branches|Using Back Branches]] and [[Committing with Git]] for information about how branches and backporting are handled.<br />
<br />
=== Where can I get a copy of the SQL standards? ===<br />
You are supposed to buy them from [http://www.iso.ch/ ISO] or [http://www.ansi.org ANSI]. Search for ISO/ANSI 9075. ANSI's offer is less expensive, but the contents of the documents are the same between the two organizations.<br />
<br />
Since buying an official copy of the standard is quite expensive, most developers rely on one of the various draft versions available on the Internet. Some of these are:<br />
* SQL-92 http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt<br />
* SQL:1999 http://web.cs.ualberta.ca/~yuan/courses/db_readings/ansi-iso-9075-2-1999.pdf<br />
* SQL:2003 http://www.wiscorp.com/sql_2003_standard.zip<br />
* SQL:2008 (preliminary) http://www.wiscorp.com/sql200n.zip<br />
<br />
The PostgreSQL documentation contains information about PostgreSQL and [http://developer.postgresql.org/pgdocs/postgres/features.html SQL conformance].<br />
<br />
Some further web pages about the SQL standard are:<br />
* http://troels.arvin.dk/db/rdbms/links/#standards<br />
* http://www.wiscorp.com/SQLStandards.html<br />
* http://www.contrib.andrew.cmu.edu/~shadow/sql.html#syntax (SQL-92)<br />
* http://dbs.uni-leipzig.de/en/lokal/standards.pdf (paper)<br />
<br />
Note that having access to a copy of the SQL standard is not necessary to become a useful contributor to PostgreSQL development. Interpreting the standard is difficult and needs years of experience. And most features in PostgreSQL are not specified in the standard anyway.<br />
<br />
=== Where can I get technical assistance? ===<br />
<br />
Many technical questions held by those new to the code have been answered on the pgsql-hackers mailing list - the archives of which can be found at http://archives.postgresql.org/pgsql-hackers/.<br />
<br />
If you cannot find discussion of your particular question, feel free to put it to the list.<br />
<br />
Major contributors also answer technical questions, including questions about development of new features, on IRC at irc.freenode.net in the #postgresql channel.<br />
<br />
=== Why haven't you replaced CVS with SVN, Git, Monotone, VSS, <insert your favorite SCMS here>? ===<br />
The project switched to Git in September 2010.<br />
<br />
== Development Process ==<br />
<br />
=== What do I do after choosing an item to work on? ===<br />
<br />
Send an email to pgsql-hackers with a proposal for what you want to do (assuming your contribution is not trivial). Working in isolation is not advisable because others might be working on the same TODO item, or you might have misunderstood the TODO item. In the email, discuss both the internal implementation method you plan to use, and any user-visible changes (new syntax, etc). For complex patches, it is important to get community feedback on your proposal before starting work. Failure to do so might mean your patch is rejected. If your work is being sponsored by a company, read [http://momjian.us/main/writings/pgsql/company_contributions/ this article] for tips on being more effective.<br />
<br />
Our queue of patches to be reviewed is maintained via a custom [[CommitFest]] web application at http://commitfest.postgresql.org.<br />
<br />
=== How do I test my changes? ===<br />
<br />
==== Basic system testing ====<br />
<br />
The easiest way to test your code is to ensure that it builds against the latest version of the code and that it does not generate compiler warnings.<br />
<br />
It is advisable to pass --enable-cassert to configure. This turns on assertions within the source, which often make bugs more visible by turning them into immediate failures instead of silent data corruption or later segmentation violations. This generally makes debugging much easier.<br />
<br />
Then, perform run time testing via psql.<br />
<br />
==== Runtime environment ====<br />
<br />
To test your modified version of PostgreSQL, it's convenient to install PostgreSQL into a local directory (in your home <br />
directory, for instance) to avoid conflicting with a system wide <br />
installation. Use the ''--prefix='' option to configure to specify an installation <br />
location; ''--with-pgport'' to specify a non-standard default port is <br />
helpful as well. To run this instance, you will need to make sure that the correct <br />
binaries are used; depending on your operating system, environment variables <br />
like PATH and LD_LIBRARY_PATH (on most Linux/Unix-like systems) need to be <br />
set. Setting PGDATA will also be useful.<br />
<br />
To avoid having to set this environment up manually, you may want to use <br />
Greg Smith's [https://github.com/gregs1104/peg peg] scripts, or the<br />
[https://github.com/PGBuildFarm/client-code scripts] that are used on the <br />
buildfarm.<br />
<br />
==== Regression test suite ====<br />
<br />
The next step is to test your changes against the existing regression test suite. To do this, issue "make check" in the root directory of the source tree. If any tests fail, investigate.<br />
<br />
If you've deliberately changed existing behavior, this change might cause a regression test failure but not any actual regression. If so, you should also patch the regression test suite.<br />
<br />
==== Other run time testing ====<br />
<br />
Some developers make use of tools such as valgrind (http://valgrind.kde.org) for memory testing, gprof (which comes with the GNU binutils suite) and oprofile (http://oprofile.sourceforge.net/) for profiling and other related tools.<br />
<br />
==== What about unit testing, static analysis, model checking...? ====<br />
<br />
There have been a number of discussions about other testing frameworks and some developers are exploring these ideas.<br />
<br />
Keep in mind that the Makefiles do not have the proper dependencies for include files, so after modifying a header you have to do a make clean and then another make. If you are using GCC, you can use the --enable-depend option of configure to have the compiler compute the dependencies automatically.<br />
<br />
=== I have developed a patch, what next? ===<br />
<br />
You will need to submit the patch to pgsql-hackers@postgresql.org. To help ensure your patch is reviewed and committed in a timely fashion, please try to follow the guidelines at [[Submitting a Patch]].<br />
<br />
=== What happens to my patch once it is submitted? ===<br />
<br />
It will be reviewed by other contributors to the project and will be either accepted or sent back for further work. The process is explained in more detail at [[Submitting a Patch#Patch review and commit|Submitting a Patch]].<br />
<br />
=== How do I help with reviewing patches? ===<br />
<br />
If you would like to contribute by reviewing a patch in the [http://commitfest.postgresql.org CommitFest] queue, you are most welcome to do so. Please read the guide at [[Reviewing a Patch]] for more information.<br />
<br />
=== Do I need to sign a copyright assignment? ===<br />
<br />
No, contributors keep their copyright (as is the case in most<br />
European countries anyway). They simply consider themselves to be part of<br />
the Postgres Global Development Group. (It's not even possible to assign<br />
copyright to PGDG, as it's not a legal entity.) This is the same way that<br />
the Linux kernel and many other open source projects work.<br />
<br />
=== May I add my own copyright notice where appropriate? ===<br />
<br />
No, please don't. We like to keep the legal information short and crisp.<br />
Additionally, we've heard that it could pose problems for<br />
corporate users.<br />
<br />
=== Doesn't the PostgreSQL license itself require keeping the copyright notice intact? ===<br />
<br />
Yes, it does. And it is kept intact, because the PostgreSQL Global Development Group<br />
covers all copyright holders. Also note that US law doesn't require any<br />
copyright notice for the copyright to be granted, just like most<br />
European laws.<br />
<br />
== Technical Questions ==<br />
=== How do I efficiently access information in system catalogs from the backend code? ===<br />
<br />
You first need to find the tuples (rows) you are interested in. There are two ways. First, SearchSysCache() and related functions allow you to query the system catalogs using predefined indexes on the catalogs. This is the preferred way to access system tables, because the first call to the cache loads the needed rows, and future requests can return the results without accessing the base table. A list of available caches is located in src/backend/utils/cache/syscache.c. src/backend/utils/cache/lsyscache.c contains many column-specific cache lookup functions.<br />
<br />
The rows returned are cache-owned versions of the heap rows. Therefore, you must not modify or delete the tuple returned by SearchSysCache(). What you should do is release it with ReleaseSysCache() when you are done using it; this informs the cache that it can discard that tuple if necessary. If you neglect to call ReleaseSysCache(), then the cache entry will remain locked in the cache until end of transaction, which is tolerable during development but not considered acceptable for release-worthy code.<br />
<br />
If you can't use the system cache, you will need to retrieve the data directly from the heap table, using the buffer cache that is shared by all backends. The backend automatically takes care of loading the rows into the buffer cache. To do this, open the table with heap_open(). You can then start a table scan with heap_beginscan(), then use heap_getnext() and continue as long as HeapTupleIsValid() returns true. Then do a heap_endscan(). Keys can be assigned to the scan. No indexes are used, so all rows are going to be compared to the keys, and only the valid rows returned.<br />
<br />
You can also use heap_fetch() to fetch rows by block number/offset. While scans automatically lock/unlock rows from the buffer cache, with heap_fetch(), you must pass a Buffer pointer, and ReleaseBuffer() it when completed.<br />
<br />
Once you have the row, you can get data that is common to all tuples, like t_self and t_oid, by merely accessing the HeapTuple structure entries. If you need a table-specific column, you should take the HeapTuple pointer, and use the GETSTRUCT() macro to access the table-specific start of the tuple. You then cast the pointer, for example as a Form_pg_proc pointer if you are accessing the pg_proc table, or Form_pg_type if you are accessing pg_type. You can then access fields of the tuple by using the structure pointer:<br />
<br />
((Form_pg_class) GETSTRUCT(tuple))->relnatts<br />
<br />
Note however that this only works for columns that are fixed-width and never null, and only when all earlier columns are likewise fixed-width and<br />
never null. Otherwise the column's location is variable and you must use heap_getattr() or related functions to extract it from the tuple.<br />
<br />
Also, avoid storing directly into struct fields as a means of changing live tuples. The best way is to use heap_modifytuple() and pass it your original tuple, plus the values you want changed. It returns a palloc'ed tuple, which you pass to heap_update(). You can delete tuples by passing the tuple's t_self to heap_delete(). You use t_self for heap_update() too. Remember, tuples can be either system cache copies, which might go away after you call ReleaseSysCache(), or read directly from disk buffers, which go away when you call heap_getnext(), heap_endscan(), or, in the heap_fetch() case, ReleaseBuffer(). Or a tuple may be palloc'ed, in which case you must pfree() it when finished.<br />
=== Why are table, column, type, function, view names sometimes referenced as Name or NameData, and sometimes as char *? ===<br />
<br />
Table, column, type, function, and view names are stored in system tables in columns of type Name. Name is a fixed-length, null-terminated type of NAMEDATALEN bytes. (The default value for NAMEDATALEN is 64 bytes.)<br />
<br />
typedef struct nameData<br />
{<br />
    char        data[NAMEDATALEN];<br />
} NameData;<br />
typedef NameData *Name;<br />
<br />
Table, column, type, function, and view names that come into the backend via user queries are stored as variable-length, null-terminated character strings.<br />
<br />
Many functions are called with both types of names, e.g. heap_open(). Because the Name type is null-terminated, it is safe to pass it to a function expecting a char *. Because there are many cases where on-disk names (Name) are compared to user-supplied names (char *), there are many cases where Name and char * are used interchangeably.<br />
<br />
=== Why do we use Node and List to make data structures? ===<br />
<br />
We do this because this allows a consistent way to pass data inside the backend in a flexible way. Every node has a NodeTag which specifies what type of data is inside the Node. Lists are groups of Nodes chained together as a forward-linked list. The ordering of the list elements might or might not be significant, depending on the usage of the particular list.<br />
<br />
Here are some of the List manipulation commands:<br />
<br />
;lfirst(i)<br />
;lfirst_int(i)<br />
;lfirst_oid(i)<br />
:return the data (a pointer, integer or OID respectively) of list cell i.<br />
<br />
;lnext(i)<br />
:return the next list cell after i.<br />
<br />
;foreach(i, list)<br />
:loop through list, assigning each list cell to i.<br />
<br />
It is important to note that i is a <code>ListCell *</code>, not the data in the List cell. You need to use one of the lfirst variants to get at the cell's data.<br />
<br />
Here is a typical code snippet that loops through a List containing <code>Var *</code> cells and processes each one:<br />
<br />
List       *list;<br />
ListCell   *i;<br />
...<br />
foreach(i, list)<br />
{<br />
    Var    *var = (Var *) lfirst(i);<br />
    ...<br />
    /* process var here */<br />
}<br />
<br />
;lcons(node, list)<br />
:add node to the front of list, or create a new list with node if list is NIL.<br />
<br />
;lappend(list, node)<br />
:add node to the end of list.<br />
<br />
;list_concat(list1, list2)<br />
:Concatenate list2 on to the end of list1.<br />
<br />
;list_length(list)<br />
:return the length of the list.<br />
<br />
;list_nth(list, i)<br />
:return the i'th element in list, counting from zero.<br />
<br />
;lcons_int, ...<br />
:There are integer versions of these: lcons_int, lappend_int, etc. Also versions for OID lists: lcons_oid, lappend_oid, etc.<br />
<br />
You can print nodes easily inside gdb. First, to disable output truncation when you use the gdb print command:<br />
<br />
(gdb) set print elements 0<br />
<br />
Instead of printing values in gdb format, you can use the next two commands to print out List, Node, and structure contents in a verbose format that is easier to understand. Lists are unrolled into nodes, and nodes are printed in detail. The first prints in a short format, and the second in a long format:<br />
<br />
(gdb) call print(any_pointer)<br />
(gdb) call pprint(any_pointer)<br />
<br />
The output appears in the server log file, or on your screen if you are running a backend directly without a postmaster.<br />
<br />
=== I just added a field to a structure. What else should I do? ===<br />
<br />
The structures passed around in the parser, rewriter, optimizer, and executor require quite a bit of support. Most structures have support routines in src/backend/nodes used to create, copy, read, and output those structures -- in particular, most node types need support in the files copyfuncs.c and equalfuncs.c, and some need support in outfuncs.c and possibly readfuncs.c. Make sure you add support for your new field to these files. Find any other places the structure might need code for your new field -- searching for references to existing fields of the struct is a good way to do that. mkid is helpful with this (see [[#What_tools_are_available_for_developers.3F|available tools]]).<br />
<br />
=== Why do we use palloc() and pfree() to allocate memory? ===<br />
<br />
palloc() and pfree() are used in place of malloc() and free() because we find it easier to automatically free all memory allocated when a query completes. This assures us that all memory that was allocated gets freed even if we have lost track of where we allocated it. There are special non-query contexts that memory can be allocated in. These affect when the allocated memory is freed by the backend.<br />
=== What is ereport()? ===<br />
<br />
ereport() is used to send messages to the front-end, and optionally terminate the current query being processed. See [http://developer.postgresql.org/pgdocs/postgres/error-message-reporting.html here] for more details on how to use it.<br />
<br />
=== What is CommandCounterIncrement()? ===<br />
<br />
Normally, statements cannot see the rows they themselves modify. This allows UPDATE foo SET x = x + 1 to work correctly.<br />
<br />
However, there are cases where a transaction needs to see rows affected in previous parts of the transaction. This is accomplished using a Command Counter. Incrementing the counter allows transactions to be broken into pieces so each piece can see rows modified by previous pieces. CommandCounterIncrement() increments the Command Counter, creating a new part of the transaction.<br />
=== What debugging features are available? ===<br />
<br />
First, if you are developing new C code you should ALWAYS work in a build configured with the --enable-cassert and --enable-debug options. Enabling asserts turns on many sanity checking options. Enabling debug symbols supports use of debuggers (such as gdb) to trace through misbehaving code.<br />
<br />
The postgres server has a -d option that allows detailed information to be logged (elog or ereport DEBUGn printouts). The -d option takes a number that specifies the debug level. Be warned that high debug level values generate large log files.<br />
<br />
If the postmaster is running, start psql in one window, then find the PID of the postgres process used by psql using SELECT pg_backend_pid(). Use a debugger to attach to the postgres PID. You can set breakpoints in the debugger and then issue queries from the psql session. If you are looking to find the location that is generating an error or log message, set a breakpoint at errfinish. If you are debugging something that happens during session startup, you can set PGOPTIONS="-W n", then start psql. This will cause startup to delay for n seconds so you can attach to the process with the debugger, set appropriate breakpoints, then continue through the startup sequence.<br />
<br />
If the postmaster is not running, you can actually run the postgres backend from the command line, and type your SQL statement directly. This is almost always a bad way to do things, however, since the usage environment isn't nearly as friendly as psql (no command history for instance) and there's no chance to study concurrent behavior. You might have to use this method if you broke initdb, but otherwise it has nothing to recommend it.<br />
<br />
You can also compile with profiling to see what functions are taking execution time --- configuring with --enable-profiling is the recommended way to set this up. (You usually shouldn't use --enable-cassert when studying performance issues, since the checks it enables are not always cheap.) Profile files from server processes will be deposited in the pgsql/data directory. Profile files from clients such as psql will be put in the client's current directory.<br />
<br />
[[Category:FAQ]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Developer_FAQ&diff=14315Developer FAQ2011-05-12T19:54:15Z<p>Schmiddy: /* Doesn't the PostgreSQL license itself require to keep the copyright notice intact? */ typofix</p>
<hr />
<div>{{Languages}}<br />
<br />
== Getting Involved ==<br />
<br />
=== How do I get involved in PostgreSQL development? ===<br />
<br />
Download the code and have a look around. See [[#How_do_I_download.2Fupdate_the_current_source_tree.3F|downloading the source tree]].<br />
<br />
Subscribe to and read the [http://archives.postgresql.org/pgsql-hackers/ pgsql-hackers mailing list] (often termed "hackers"). This is where the major contributors and core members of the project discuss development.<br />
<br />
=== How do I download/update the current source tree? ===<br />
<br />
There are several ways to obtain the source tree. Occasional developers can just get the most recent source tree snapshot from ftp://ftp.postgresql.org/pub/snapshot/.<br />
<br />
Regular developers might want to take advantage of anonymous access to our source code management system. The source tree is currently hosted in git. For details of how to obtain the source from git see http://developer.postgresql.org/pgdocs/postgres/git.html and [[Working with Git]].<br />
<br />
=== What development environment is required to develop code? ===<br />
<br />
PostgreSQL is developed mostly in the C programming language. The source code is targeted at most of the popular Unix platforms and the Windows environment (XP, Windows 2000, and up).<br />
<br />
Most developers run a Unix-like operating system and use an open source tool chain with [http://gcc.gnu.org GCC], [http://www.gnu.org/software/make/make.html GNU Make], [http://www.gnu.org/software/gdb/gdb.html GDB], [http://www.gnu.org/software/autoconf/ Autoconf], and so on. If you have contributed to open source software before, you will probably be familiar with these tools. Developers using this tool chain on Windows make use of [http://www.mingw.org/ MinGW], though most development on Windows is currently done with the Microsoft Visual Studio 2005 (version 8) development environment and associated tools.<br />
<br />
The complete list of required software to build PostgreSQL can be found in the [http://developer.postgresql.org/pgdocs/postgres/install-requirements.html installation instructions].<br />
<br />
Developers who regularly rebuild the source often pass the --enable-depend flag to configure. The result is that if you make a modification to a C header file, all files that depend upon that file are also rebuilt.<br />
<br />
src/Makefile.custom can be used to set environment variables, like CUSTOM_COPT, that are used for every compile.<br />
<br />
=== What areas need work? ===<br />
Outstanding features are detailed in [[Todo]].<br />
<br />
You can learn more about these features by consulting the [http://archives.postgresql.org/ archives], the SQL standards and the recommended texts (see [[#What_books_are_good_for_developers.3F|books for developers]]).<br />
<br />
=== How do I get involved in PostgreSQL web site development? ===<br />
<br />
PostgreSQL website development is discussed on the [http://archives.postgresql.org/pgsql-www/ pgsql-www mailing list]. There is a project page where the source code is available at http://pgweb.postgresql.org/.<br />
<br />
== Development Tools and Help ==<br />
<br />
=== How is the source code organized? ===<br />
<br />
If you point your browser at [http://www.postgresql.org/developer/ext.backend.html How PostgreSQL Processes a Query] (also in a checkout of the source code, under src/tools/backend/index.html), you will see a few paragraphs describing the data flow, the backend components in a flow chart, and a description of the shared memory area. You can click on any flowchart box to see a description. If you then click on the directory name, you will be taken to the source directory, to browse the actual source code behind it. We also have several README files in some source directories to describe the function of the module. The browser will also display these when you enter the directory.<br />
<br />
Other than documentation in the source tree itself, you can find some papers/presentations discussing the code at http://www.postgresql.org/developer/coding. An excellent presentation is at http://neilconway.org/talks/hacking/<br />
<br />
=== What tools are available for developers? ===<br />
<br />
First, all the files in the src/tools directory are designed for developers.<br />
<br />
RELEASE_CHANGES   changes we have to make for each release<br />
backend           description/flowchart of the backend directories<br />
ccsym             find standard defines made by your compiler<br />
copyright         fixes copyright notices<br />
<br />
entab             converts spaces to tabs, used by pgindent<br />
find_static       finds functions that could be made static<br />
find_typedef      finds typedefs in the source code<br />
find_badmacros    finds macros that use braces incorrectly<br />
fsync             a script to provide information about the cost of<br />
                  cache syncing system calls<br />
make_ctags        make vi 'tags' file in each directory<br />
make_diff         make *.orig and diffs of source<br />
make_etags        make emacs 'etags' files<br />
make_keywords     make comparison of our keywords and SQL'92<br />
make_mkid         make mkid ID files<br />
git_changelog     used to generate a list of changes for each release<br />
pginclude         scripts for adding/removing include files<br />
pgindent          indents source files<br />
pgtest            a semi-automated build system<br />
thread            a thread testing script<br />
<br />
In src/include/catalog:<br />
<br />
unused_oids       a script that finds unused OIDs for use in system catalogs<br />
duplicate_oids    finds duplicate OIDs in system catalog definitions<br />
<br />
tools/backend was already described in the question-and-answer above.<br />
<br />
Second, you really should have an editor that can handle tags, so you can tag a function call to see the function definition, and then tag inside that function to see an even lower-level function, and then back out twice to return to the original function. Most editors support this via tags or etags files.<br />
<br />
Third, you need to get id-utils from ftp://ftp.gnu.org/gnu/id-utils/<br />
<br />
By running tools/make_mkid, an archive of source symbols can be created that can be rapidly queried.<br />
<br />
Some developers make use of cscope, which can be found at http://cscope.sf.net/. Others use glimpse, which can be found at http://webglimpse.net/.<br />
<br />
tools/make_diff has tools to create patch diff files that can be applied to the distribution. This produces context diffs, which is our preferred format.<br />
<br />
pgindent is used to fix the source code style to conform to our standards, and is normally run at the end of each development cycle; see [[#What.27s_the_formatting_style_used_in_PostgreSQL_source_code.3F|this question]] for more information on our style.<br />
<br />
pginclude contains scripts used to add needed #include's to include files, and to remove unneeded #include's.<br />
<br />
When adding built-in objects such as types or functions, you will need to assign OIDs to them. Our convention is that all hand-assigned OIDs are distinct values in the range 1-9999. (It would work mechanically for them to be unique within individual system catalogs, but for clarity we require them to be unique across the whole system.) There is a script called unused_oids in src/include/catalog that shows the currently unused OIDs. To assign a new OID, pick one that is free according to unused_oids, and for bonus points pick one that is nearby to related existing objects. See also the duplicate_oids script, which will complain if you made a mistake.<br />
<br />
=== What's the formatting style used in PostgreSQL source code? ===<br />
<br />
Our standard format is BSD style, with each level of code indented one tab, where each tab is four spaces. You will need to set your editor or file viewer to display tabs as four spaces:<br />
<br />
For '''vi''' (in <code>.exrc</code> or <code>.vimrc</code>):<br />
set tabstop=4 shiftwidth=4 noexpandtab<br />
<br />
For '''less''' or '''more''', specify <code>-x4</code> to get the correct indentation.<br />
<br />
The tools/editors directory of the latest sources contains sample settings that can be used with the emacs, xemacs and vim editors, to assist in keeping to PostgreSQL coding standards.<br />
<br />
pgindent will format the code by passing flags to your operating system's indent utility. pgindent is run on all source files just before each beta test period. It auto-formats all source files to make them consistent. Comment blocks that need specific line breaks should be formatted as block comments, where the comment starts as /*------. These comments will not be reformatted in any way.<br />
<br />
See also [http://developer.postgresql.org/pgdocs/postgres/source-format.html the Formatting section] in the documentation. [http://archives.postgresql.org/message-id/1221125165.5637.12.camel@abbas-laptop This posting] talks about our naming of variable and function names.<br />
<br />
If you're wondering why we bother with this, [http://ezine.daemonnews.org/200112/single_coding_style.html this article] describes the value of a consistent coding style.<br />
<br />
=== Is there a diagram of the system catalogs available? ===<br />
<br />
Yes, we have [http://dalibo.org/_media/articles/catalog.png at least one] ([http://svn.postgresql.fr/repos/materials/advocacy/trunk/posters/catalogs83.svg SVG version]).<br />
<br />
=== What books are good for developers? ===<br />
<br />
There are five good books:<br />
<br />
* An Introduction to Database Systems, by C.J. Date, Addison-Wesley<br />
* A Guide to the SQL Standard, by C.J. Date, et al., Addison-Wesley<br />
* Fundamentals of Database Systems, by Elmasri and Navathe<br />
* Transaction Processing, by Jim Gray and Andreas Reuter, Morgan Kaufmann<br />
* Transactional Information Systems, by Gerhard Weikum and Gottfried Vossen, Morgan Kaufmann<br />
<br />
=== What is configure all about? ===<br />
<br />
The files configure and configure.in are part of the GNU autoconf package. Configure allows us to test for various capabilities of the OS, and to set variables that can then be tested in C programs and Makefiles. Autoconf is installed on the PostgreSQL main server. To add options to configure, edit configure.in, and then run autoconf to generate configure.<br />
<br />
When configure is run by the user, it tests various OS capabilities, stores those in config.status and config.cache, and modifies a list of *.in files. For example, if there exists a Makefile.in, configure generates a Makefile that contains substitutions for all @var@ parameters found by configure.<br />
<br />
When you need to edit files, make sure you don't waste time modifying files generated by configure. Edit the *.in file, and re-run configure to recreate the needed file. If you run make distclean from the top-level source directory, all files derived by configure are removed, so you see only the file contained in the source distribution.<br />
=== How do I add a new port? ===<br />
<br />
There are a variety of places that need to be modified to add a new port. First, start in the src/template directory. Add an appropriate entry for your OS. Also, use src/config.guess to add your OS to src/template/.similar. You shouldn't match the OS version exactly. The configure test will look for an exact OS version number, and if not found, find a match without version number. Edit src/configure.in to add your new OS. (See configure item above.) You will need to run autoconf, or patch src/configure too.<br />
<br />
Then, check src/include/port and add your new OS file, with appropriate values. Hopefully, there is already locking code in src/include/storage/s_lock.h for your CPU. There is also a src/makefiles directory for port-specific Makefile handling. There is a backend/port directory if you need special files for your OS.<br />
=== Why don't you use threads, raw devices, async-I/O, <insert your favorite whiz-bang feature here>? ===<br />
<br />
There is always a temptation to use the newest operating system features as soon as they arrive. We resist that temptation.<br />
<br />
First, we support 15+ operating systems, so any new feature has to be well established before we will consider it. Second, most new whiz-bang features don't provide dramatic improvements. Third, they usually have some downside, such as decreased reliability or additional code required. Therefore, we don't rush to use new features but rather wait for the feature to be established, then ask for testing to show that a measurable improvement is possible.<br />
<br />
As an example, threads are not currently used instead of multiple processes for backends because:<br />
<br />
* Historically, threads were poorly supported and buggy.<br />
* An error in one backend can corrupt other backends if they're threads within a single process.<br />
* Speed improvements using threads are small compared to the remaining backend startup time.<br />
* The backend code would be more complex.<br />
* Terminating backend processes allows the OS to cleanly and quickly free all resources, protecting against memory and file descriptor leaks and making backend shutdown cheaper and faster.<br />
* Debugging threaded programs is much harder than debugging worker processes, and core dumps are much less useful.<br />
* Sharing of read-only executable mappings and the use of shared_buffers means processes, like threads, are very memory efficient.<br />
* Regular creation and destruction of processes helps protect against memory fragmentation, which can be hard to manage in long-running processes.<br />
<br />
(Whether individual backend processes should use multiple threads to make use of multiple cores for single queries is a separate question not covered here.)<br />
<br />
So, we are not ignorant of new features. It is just that we are cautious about their adoption. The TODO list often contains links to discussions showing our reasoning in these areas.<br />
<br />
==== Why aren't there more compression options when dumping tables? ====<br />
<br />
pg_dump's built-in compression method is gzip.<br />
The primary alternative, bzip2, is normally far too slow to be useful when dumping large tables.<br />
<br />
The two main alternatives regularly proposed for better built-in compression at good speeds are LZO and LZMA/LZMA2/XZ. LZO is released under the GPL, which is incompatible with the PostgreSQL license. The LZMA2 code has been released into the public domain, but the C port is a secondary one (C++ is the main development focus) whose code quality hasn't seemed appropriate for this project. And this whole area has traditionally been filled with patent issues that go beyond just the restrictions of the software license.<br />
<br />
Another limitation on changing this is that pg_dump output is intended to be archivable, so we had better be prepared to support compression methods for a very long time. The "latest and greatest" compression method is exactly what we *don't* want.<br />
<br />
See the [http://archives.postgresql.org/pgsql-hackers/2009-02/msg00352.php archives] for an idea of what characteristics an alternate compression tool would need to have in order to be considered for use in core PostgreSQL.<br />
<br />
=== How are branches managed? ===<br />
<br />
See [[Working_with_Git#Using_Back_Branches|Using Back Branches]] and [[Committing with Git]] for information about how branches and backporting are handled.<br />
<br />
=== Where can I get a copy of the SQL standards? ===<br />
You are supposed to buy them from [http://www.iso.ch/ ISO] or [http://www.ansi.org ANSI]. Search for ISO/ANSI 9075. ANSI's offer is less expensive, but the contents of the documents are the same between the two organizations.<br />
<br />
Since buying an official copy of the standard is quite expensive, most developers rely on one of the various draft versions available on the Internet. Some of these are:<br />
* SQL-92 http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt<br />
* SQL:1999 http://web.cs.ualberta.ca/~yuan/courses/db_readings/ansi-iso-9075-2-1999.pdf<br />
* SQL:2003 http://www.wiscorp.com/sql_2003_standard.zip<br />
* SQL:2008 (preliminary) http://www.wiscorp.com/sql200n.zip<br />
<br />
The PostgreSQL documentation contains information about PostgreSQL and [http://developer.postgresql.org/pgdocs/postgres/features.html SQL conformance].<br />
<br />
Some further web pages about the SQL standard are:<br />
* http://troels.arvin.dk/db/rdbms/links/#standards<br />
* http://www.wiscorp.com/SQLStandards.html<br />
* http://www.contrib.andrew.cmu.edu/~shadow/sql.html#syntax (SQL-92)<br />
* http://dbs.uni-leipzig.de/en/lokal/standards.pdf (paper)<br />
<br />
Note that having access to a copy of the SQL standard is not necessary to become a useful contributor to PostgreSQL development. Interpreting the standard is difficult and needs years of experience. And most features in PostgreSQL are not specified in the standard anyway.<br />
<br />
=== Where can I get technical assistance? ===<br />
<br />
Many technical questions held by those new to the code have been answered on the pgsql-hackers mailing list - the archives of which can be found at http://archives.postgresql.org/pgsql-hackers/.<br />
<br />
If you cannot find discussion of your particular question, feel free to put it to the list.<br />
<br />
Major contributors also answer technical questions, including questions about development of new features, on IRC at irc.freenode.net in the #postgresql channel.<br />
<br />
=== Why haven't you replaced CVS with SVN, Git, Monotone, VSS, <insert your favorite SCMS here>? ===<br />
The project switched to Git in September 2010.<br />
<br />
== Development Process ==<br />
<br />
=== What do I do after choosing an item to work on? ===<br />
<br />
Send an email to pgsql-hackers with a proposal for what you want to do (assuming your contribution is not trivial). Working in isolation is not advisable because others might be working on the same TODO item, or you might have misunderstood the TODO item. In the email, discuss both the internal implementation method you plan to use, and any user-visible changes (new syntax, etc). For complex patches, it is important to get community feedback on your proposal before starting work. Failure to do so might mean your patch is rejected. If your work is being sponsored by a company, read [http://momjian.us/main/writings/pgsql/company_contributions/ this article] for tips on being more effective.<br />
<br />
Our queue of patches to be reviewed is maintained via a custom [[CommitFest]] web application at http://commitfest.postgresql.org.<br />
<br />
=== How do I test my changes? ===<br />
<br />
==== Basic system testing ====<br />
<br />
The easiest way to test your code is to ensure that it builds against the latest version of the code and that it does not generate compiler warnings.<br />
<br />
It is advisable to pass --enable-cassert to configure. This will turn on assertions within the source, which often make bugs more visible because errors such as data corruption or bad pointers are detected close to where they occur. This generally makes debugging much easier.<br />
<br />
Then, perform run time testing via psql.<br />
<br />
==== Runtime environment ====<br />
<br />
To test your modified version of PostgreSQL, it's convenient to install PostgreSQL into a local directory (in your home <br />
directory, for instance) to avoid conflicting with a system wide <br />
installation. Use the ''--prefix='' option to configure to specify an installation <br />
location; ''--with-pgport'' to specify a non-standard default port is <br />
helpful as well. To run this instance, you will need to make sure that the correct <br />
binaries are used; depending on your operating system, environment variables <br />
like PATH and LD_LIBRARY_PATH (on most Linux/Unix-like systems) need to be <br />
set. Setting PGDATA will also be useful.<br />
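Put together, the setup described above might look like the following shell session (the install prefix, port, and data directory are illustrative choices, not requirements):<br />

```
# Build and install into a private directory
./configure --prefix=$HOME/pg-dev --with-pgport=5440 \
            --enable-cassert --enable-debug
make && make install

# Point the environment at the private installation
export PATH=$HOME/pg-dev/bin:$PATH
export LD_LIBRARY_PATH=$HOME/pg-dev/lib:$LD_LIBRARY_PATH
export PGDATA=$HOME/pg-dev/data

# Create a cluster and start it
initdb
pg_ctl -l logfile start
```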
<br />
To avoid having to set this environment up manually, you may want to use <br />
Greg Smith's [https://github.com/gregs1104/peg peg] scripts, or the<br />
[https://github.com/PGBuildFarm/client-code scripts] that are used on the <br />
buildfarm.<br />
<br />
==== Regression test suite ====<br />
<br />
The next step is to test your changes against the existing regression test suite. To do this, issue "make check" in the root directory of the source tree. If any tests fail, investigate.<br />
<br />
If you've deliberately changed existing behavior, this change might cause a regression test failure but not any actual regression. If so, you should also patch the regression test suite.<br />
<br />
==== Other run time testing ====<br />
<br />
Some developers make use of tools such as valgrind (http://valgrind.org) for memory testing, and gprof (which comes with the GNU binutils suite) and oprofile (http://oprofile.sourceforge.net/) for profiling, among other related tools.<br />
<br />
==== What about unit testing, static analysis, model checking...? ====<br />
<br />
There have been a number of discussions about other testing frameworks and some developers are exploring these ideas.<br />
<br />
Keep in mind that the Makefiles do not have proper dependencies for include files, so after modifying a header you have to do a make clean and then another make. If you are using GCC, you can use the --enable-depend option of configure to have the compiler compute the dependencies automatically.<br />
<br />
=== I have developed a patch, what next? ===<br />
<br />
You will need to submit the patch to pgsql-hackers@postgresql.org. To help ensure your patch is reviewed and committed in a timely fashion, please try to follow the guidelines at [[Submitting a Patch]].<br />
<br />
=== What happens to my patch once it is submitted? ===<br />
<br />
It will be reviewed by other contributors to the project and will be either accepted or sent back for further work. The process is explained in more detail at [[Submitting a Patch#Patch review and commit|Submitting a Patch]].<br />
<br />
=== How do I help with reviewing patches? ===<br />
<br />
If you would like to contribute by reviewing a patch in the [http://commitfest.postgresql.org CommitFest] queue, you are most welcome to do so. Please read the guide at [[Reviewing a Patch]] for more information.<br />
<br />
=== Do I need to sign a copyright assignment? ===<br />
<br />
No, contributors keep their copyright (as is the case in most<br />
European countries anyway). They simply consider themselves to be part of<br />
the PostgreSQL Global Development Group. (It's not even possible to assign<br />
copyright to PGDG, as it is not a legal entity.) This is the same way that<br />
the Linux kernel and many other open source projects work.<br />
<br />
=== May I add my own copyright notice where appropriate? ===<br />
<br />
No, please don't. We like to keep the legal information short and crisp.<br />
Additionally, we've heard that it could pose problems for<br />
corporate users.<br />
<br />
=== Doesn't the PostgreSQL license itself require keeping the copyright notice intact? ===<br />
<br />
Yes, it does, and it is kept intact, because the PostgreSQL Global Development Group<br />
covers all copyright holders. Also note that US law doesn't require a<br />
copyright notice for copyright to be granted, just like most<br />
European laws.<br />
<br />
== Technical Questions ==<br />
=== How do I efficiently access information in system catalogs from the backend code? ===<br />
<br />
You first need to find the tuples (rows) you are interested in. There are two ways. First, SearchSysCache() and related functions allow you to query the system catalogs using predefined indexes on the catalogs. This is the preferred way to access system tables, because the first call to the cache loads the needed rows, and future requests can return the results without accessing the base table. A list of available caches is located in src/backend/utils/cache/syscache.c. src/backend/utils/cache/lsyscache.c contains many column-specific cache lookup functions.<br />
<br />
The rows returned are cache-owned versions of the heap rows. Therefore, you must not modify or delete the tuple returned by SearchSysCache(). What you should do is release it with ReleaseSysCache() when you are done using it; this informs the cache that it can discard that tuple if necessary. If you neglect to call ReleaseSysCache(), then the cache entry will remain locked in the cache until end of transaction, which is tolerable during development but not considered acceptable for release-worthy code.<br />
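For example (a backend-only sketch; it compiles only inside the server, and the <code>funcid</code> variable is illustrative), a typical syscache lookup looks like this:<br />
<br />

```c
/* Fetch a function's pg_proc row via the syscache (sketch). */
HeapTuple    tup;
Form_pg_proc procForm;

tup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(tup))
    elog(ERROR, "cache lookup failed for function %u", funcid);

procForm = (Form_pg_proc) GETSTRUCT(tup);
/* ... read fields such as procForm->pronargs here; do not modify the tuple ... */

ReleaseSysCache(tup);   /* allow the cache to discard the entry if necessary */
```
<br />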
<br />
If you can't use the system cache, you will need to retrieve the data directly from the heap table, using the buffer cache that is shared by all backends. The backend automatically takes care of loading the rows into the buffer cache. To do this, open the table with heap_open(). You can then start a table scan with heap_beginscan(), then use heap_getnext() and continue as long as HeapTupleIsValid() returns true. Then do a heap_endscan(). Keys can be assigned to the scan. No indexes are used, so all rows are going to be compared to the keys, and only the valid rows returned.<br />
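The sequence above can be sketched as follows (backend-only code; the choice of table, lock level, and snapshot is illustrative):<br />
<br />

```c
/* Sequential scan over pg_class, logging each relation name (sketch). */
Relation     rel;
HeapScanDesc scan;
HeapTuple    tup;

rel = heap_open(RelationRelationId, AccessShareLock);
scan = heap_beginscan(rel, SnapshotNow, 0, NULL);   /* no scan keys */

while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
{
    Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tup);

    elog(DEBUG1, "found relation %s", NameStr(classForm->relname));
}

heap_endscan(scan);
heap_close(rel, AccessShareLock);
```
<br />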
<br />
You can also use heap_fetch() to fetch rows by block number/offset. While scans automatically lock/unlock rows from the buffer cache, with heap_fetch(), you must pass a Buffer pointer, and ReleaseBuffer() it when completed.<br />
<br />
Once you have the row, you can get data that is common to all tuples, like t_self and t_oid, by merely accessing the HeapTuple structure entries. If you need a table-specific column, you should take the HeapTuple pointer, and use the GETSTRUCT() macro to access the table-specific start of the tuple. You then cast the pointer, for example as a Form_pg_proc pointer if you are accessing the pg_proc table, or Form_pg_type if you are accessing pg_type. You can then access fields of the tuple by using the structure pointer:<br />
<br />
((Form_pg_class) GETSTRUCT(tuple))->relnatts<br />
<br />
Note however that this only works for columns that are fixed-width and never null, and only when all earlier columns are likewise fixed-width and<br />
never null. Otherwise the column's location is variable and you must use heap_getattr() or related functions to extract it from the tuple.<br />
<br />
Also, avoid storing directly into struct fields as a means of changing live tuples. The best way is to use heap_modifytuple() and pass it your original tuple, plus the values you want changed. It returns a palloc'ed tuple, which you pass to heap_update(). You can delete tuples by passing the tuple's t_self to heap_delete(). You use t_self for heap_update() too. Remember, tuples can be either system cache copies, which might go away after you call ReleaseSysCache(), or read directly from disk buffers, which go away when you call heap_getnext(), heap_endscan(), or, in the heap_fetch() case, ReleaseBuffer(). Or a tuple may be palloc'ed, in which case you must pfree() it when finished.<br />
=== Why are table, column, type, function, view names sometimes referenced as Name or NameData, and sometimes as char *? ===<br />
<br />
Table, column, type, function, and view names are stored in system tables in columns of type Name. Name is a fixed-length, null-terminated type of NAMEDATALEN bytes. (The default value for NAMEDATALEN is 64 bytes.)<br />
<br />
typedef struct nameData<br />
{<br />
    char        data[NAMEDATALEN];<br />
} NameData;<br />
typedef NameData *Name;<br />
<br />
Table, column, type, function, and view names that come into the backend via user queries are stored as variable-length, null-terminated character strings.<br />
<br />
Many functions are called with both types of names, e.g. heap_open(). Because the Name type is null-terminated, it is safe to pass it to a function expecting a char *. Because there are many cases where on-disk names (Name) are compared to user-supplied names (char *), there are many cases where Name and char * are used interchangeably.<br />
<br />
=== Why do we use Node and List to make data structures? ===<br />
<br />
We do this because this allows a consistent way to pass data inside the backend in a flexible way. Every node has a NodeTag which specifies what type of data is inside the Node. Lists are groups of Nodes chained together as a forward-linked list. The ordering of the list elements might or might not be significant, depending on the usage of the particular list.<br />
<br />
Here are some of the List manipulation commands:<br />
<br />
;lfirst(i)<br />
;lfirst_int(i)<br />
;lfirst_oid(i)<br />
:return the data (a pointer, integer or OID respectively) of list cell i.<br />
<br />
;lnext(i)<br />
:return the next list cell after i.<br />
<br />
;foreach(i, list)<br />
:loop through list, assigning each list cell to i.<br />
<br />
It is important to note that i is a <code>ListCell *</code>, not the data in the List cell. You need to use one of the lfirst variants to get at the cell's data.<br />
<br />
Here is a typical code snippet that loops through a List containing <code>Var *</code> cells and processes each one:<br />
<br />
List       *list;<br />
ListCell   *i;<br />
...<br />
foreach(i, list)<br />
{<br />
    Var    *var = (Var *) lfirst(i);<br />
    ...<br />
    /* process var here */<br />
}<br />
<br />
;lcons(node, list)<br />
:add node to the front of list, or create a new list with node if list is NIL.<br />
<br />
;lappend(list, node)<br />
:add node to the end of list.<br />
<br />
;list_concat(list1, list2)<br />
:concatenate list2 onto the end of list1.<br />
<br />
;list_length(list)<br />
:return the length of the list.<br />
<br />
;list_nth(list, i)<br />
:return the i'th element in list, counting from zero.<br />
<br />
;lcons_int, ...<br />
:There are integer versions of these: lcons_int, lappend_int, etc. Also versions for OID lists: lcons_oid, lappend_oid, etc.<br />
<br />
You can print nodes easily inside gdb. First, to disable output truncation when you use the gdb print command:<br />
<br />
(gdb) set print elements 0<br />
<br />
Instead of printing values in gdb format, you can use the next two commands to print out List, Node, and structure contents in a verbose format that is easier to understand. Lists are unrolled into nodes, and nodes are printed in detail. The first prints in a short format, and the second in a long format:<br />
<br />
(gdb) call print(any_pointer)<br />
(gdb) call pprint(any_pointer)<br />
<br />
The output appears in the server log file, or on your screen if you are running a backend directly without a postmaster.<br />
<br />
=== I just added a field to a structure. What else should I do? ===<br />
<br />
The structures passed around in the parser, rewriter, optimizer, and executor require quite a bit of support. Most structures have support routines in src/backend/nodes used to create, copy, read, and output those structures -- in particular, most node types need support in the files copyfuncs.c and equalfuncs.c, and some need support in outfuncs.c and possibly readfuncs.c. Make sure you add support for your new field to these files. Find any other places the structure might need code for your new field -- searching for references to existing fields of the struct is a good way to do that. mkid is helpful with this (see [[#What_tools_are_available_for_developers.3F|available tools]]).<br />
<br />
=== Why do we use palloc() and pfree() to allocate memory? ===<br />
<br />
palloc() and pfree() are used in place of malloc() and free() because we find it easier to automatically free all memory allocated when a query completes. This assures us that all memory that was allocated gets freed even if we have lost track of where we allocated it. There are special non-query contexts that memory can be allocated in. These affect when the allocated memory is freed by the backend.<br />
=== What is ereport()? ===<br />
<br />
ereport() is used to send messages to the front-end, and optionally terminate the current query being processed. See [http://developer.postgresql.org/pgdocs/postgres/error-message-reporting.html here] for more details on how to use it.<br />
<br />
=== What is CommandCounterIncrement()? ===<br />
<br />
Normally, statements cannot see the rows they modify. This allows UPDATE foo SET x = x + 1 to work correctly.<br />
<br />
However, there are cases where a transaction needs to see rows affected in previous parts of the transaction. This is accomplished using a Command Counter. Incrementing the counter allows transactions to be broken into pieces so each piece can see rows modified by previous pieces. CommandCounterIncrement() increments the Command Counter, creating a new part of the transaction.<br />
=== What debugging features are available? ===<br />
<br />
First, if you are developing new C code you should ALWAYS work in a build configured with the --enable-cassert and --enable-debug options. Enabling asserts turns on many sanity checking options. Enabling debug symbols supports use of debuggers (such as gdb) to trace through misbehaving code.<br />
<br />
The postgres server has a -d option that allows detailed information to be logged (elog or ereport DEBUGn printouts). The -d option takes a number that specifies the debug level. Be warned that high debug level values generate large log files.<br />
<br />
If the postmaster is running, start psql in one window, then find the PID of the postgres process used by psql using SELECT pg_backend_pid(). Use a debugger to attach to the postgres PID. You can set breakpoints in the debugger and then issue queries from the psql session. If you are looking to find the location that is generating an error or log message, set a breakpoint at errfinish. If you are debugging something that happens during session startup, you can set PGOPTIONS="-W n", then start psql. This will cause startup to delay for n seconds so you can attach to the process with the debugger, set appropriate breakpoints, then continue through the startup sequence.<br />
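A typical attach session looks like the transcript below (the PID shown is hypothetical):<br />
<br />

```shell
$ psql
postgres=# SELECT pg_backend_pid();
 pg_backend_pid
----------------
          12345

# in a second terminal:
$ gdb -p 12345
(gdb) break errfinish
(gdb) continue
# now run the failing query in the psql session;
# gdb will stop at errfinish when the error is raised
```
<br />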
<br />
If the postmaster is not running, you can actually run the postgres backend from the command line, and type your SQL statement directly. This is almost always a bad way to do things, however, since the usage environment isn't nearly as friendly as psql (no command history for instance) and there's no chance to study concurrent behavior. You might have to use this method if you broke initdb, but otherwise it has nothing to recommend it.<br />
<br />
You can also compile with profiling to see what functions are taking execution time --- configuring with --enable-profiling is the recommended way to set this up. (You usually shouldn't use --enable-cassert when studying performance issues, since the checks it enables are not always cheap.) Profile files from server processes will be deposited in the pgsql/data directory. Profile files from clients such as psql will be put in the client's current directory.<br />
<br />
[[Category:FAQ]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=New_phpPgAdmin_Plugin_Architecture_GSoC_2011&diff=14254New phpPgAdmin Plugin Architecture GSoC 20112011-05-08T02:27:47Z<p>Schmiddy: Cleaned up grammar, syntax, and spelling</p>
<hr />
<div>== Developers ==<br />
Student: Leonardo Augusto Sápiras<br />
<br />
Mentor: Jehan-Guillaume de Rorthais<br />
<br />
Co-Mentor: Andreas Scherbaum<br />
<br />
== Synopsis ==<br />
<br />
This project will create a new plugin architecture for phpPgAdmin, using the Hook Pattern.<br />
<br />
<br />
== Benefits to the PostgreSQL Community ==<br />
<br />
The current phpPgAdmin plugin architecture is deprecated and has only one plugin, Slony. Today, to create a plugin, a developer needs to write intrusive code inside the phpPgAdmin core, as the Slony plugin does, which is not good. With a new architecture, more ideas could be developed inside phpPgAdmin without intrusive code.<br />
<br />
With a good plugin architecture, new plugins will be more easily created and maintained, phpPgAdmin will have more users, and possibly more developers and collaborators as well.<br />
<br />
There are some ideas that are waiting for a new architecture before they can be developed as plugins, such as:<br />
<br />
* dbdesigner plugin<br />
* pgpooladmin plugin<br />
* crud plugin<br />
<br />
This project was discussed on the phpPgAdmin mailing list some time ago, and between Mr. Jehan-Guillaume de Rorthais and me some months ago. Last year Jehan-Guillaume was my mentor, so I would like to have him as my mentor again.<br />
<br />
<br />
== Quantifiable results ==<br />
<br />
# Refactor the current plugin architecture, creating a plugin manager to deal with the plugins;<br />
# Create a new plugin, to give the users a live example of how to create and integrate a plugin with the PPA core;<br />
# Tests;<br />
# Documentation: how to create a plugin for phpPgAdmin and integrate it with its components (action buttons, browser tree, trail, tabs, navigation links, top links);<br />
# If I have enough time, I will refactor the current Slony plugin so that it works with the new architecture.<br />
<br />
<br />
== Project Details ==<br />
<br />
=== Refactor the current plugin architecture ===<br />
<br />
My goal for the Google Summer of Code 2011 is to refactor the current plugin architecture.<br />
<br />
The new plugins will have the following structure: <br />
<br />
<code><br />
-plugin<br />
|--conf/<br />
| |--config.inc.php<br />
|--lang/<br />
| |--recoded<br />
| | |--english.php<br />
| |--english.php<br />
| |--Makefile<br />
|--js/<br />
|--classes/<br />
|--images/<br />
|--help/<br />
|--tests/ (optional)<br />
|--themes/ (optional)<br />
|--ppa_plugin.php<br />
</code><br />
<br />
As can be seen above, plugins will have their own pages, translation files, and configuration files.<br />
<br />
The new plugin architecture will be developed using the [http://stevenblack.com/HooksAndAnchorsDesignPattern.html Hooks Pattern]. This way, plugins will register their functions for the events they want to hook into. Many applications use this concept, such as the [http://api.drupal.org/api/drupal/includes--module.inc/group/hooks/7 Drupal open source CMS].<br />
<br />
<br />
=== Inch-stones ===<br />
<br />
The plugins on this architecture will be able to:<br />
<br />
* Add an entry in the browser tree in any level<br />
* Add an entry in the tabs<br />
* Add an entry in the trailer<br />
* Add an entry in the navigation links<br />
* Add an entry in the action buttons<br />
* Add an entry in the top links<br />
<br />
[[File:ppa_gsoc2011_screen.jpg]]<br />
<br />
<br />
<br />
=== Plugin activation ===<br />
<br />
To activate new plugins in phpPgAdmin, a new array variable, $conf['plugins'], will be added to the PPA configuration file (config.inc.php). To activate a plugin, the user just needs to add the plugin's name to this array. Example:<br />
<br />
<br />
<code><br />
/* config.inc.php */<br />
...<br />
$conf['plugins'][] = 'ppa_slony';<br />
$conf['plugins'][] = 'ppa_x';<br />
$conf['plugins'][] = 'ppa_another';<br />
...<br />
</code><br />
<br />
<br />
=== Plugin execution ===<br />
<br />
phpPgAdmin will have a plugin manager which, as the name implies, will manage activated plugins. Below is an example of how the plugin_manager.php might look:<br />
<br />
<code><br />
<?php<br />
class PluginManager {<br />
    ...<br />
    function add_plugin($obj_plugin) {<br />
        $this->plugins_list[$obj_plugin->get_name()] = $obj_plugin;<br />
    }<br />
<br />
    function get_plugin($plugin_name) {<br />
        return $this->plugins_list[$plugin_name];<br />
    }<br />
<br />
    function add_plugins_functions($plugin_name, $when, $function_name) {<br />
        $this->plugins_functions[$when][] = array('plugin_name' => $plugin_name, 'function_name' => $function_name);<br />
    }<br />
<br />
    function execute_plugins_functions($when) {<br />
        foreach ($this->plugins_functions[$when] as $node) {<br />
            $plugin_name = $node['plugin_name'];<br />
            $function_name = $node['function_name'];<br />
            $obj_plugin = $this->get_plugin($plugin_name);<br />
<br />
            if (method_exists($obj_plugin, $function_name)) {<br />
                call_user_func(array($obj_plugin, $function_name));<br />
            }<br />
        }<br />
    }<br />
    ...<br />
}<br />
?><br />
</code><br />
<br />
<br />
This plugin manager will be used in the lib.inc.php file, instantiating and registering the plugins and their functions, like the example below:<br />
<br />
<br />
<code><br />
<?php<br />
/* lib.inc.php */<br />
...<br />
$obj_plugin_manager = new PluginManager();<br />
// register the plugins and their functions<br />
foreach ($activated_plugins as $plugin) {<br />
    include_once('./plugins/'.$plugin.'/plugin.php');<br />
    $obj_plugin = new $plugin($obj_plugin_manager);<br />
    $obj_plugin_manager->add_plugin($obj_plugin);<br />
}<br />
...<br />
?><br />
</code><br />
<br />
<br />
In the plugin, the functions will be registered with an attribute saying where they will be used by phpPgAdmin's core. Below is an example of a simple plugin:<br />
<br />
<br />
<code><br />
<?php<br />
...<br />
class Plugin1 {<br />
    ...<br />
    function __construct($obj_plugin_manager) {<br />
        $obj_plugin_manager->add_plugins_functions($this->name,<br />
                                                   'before_trail_creation',<br />
                                                   'create_trail_links');<br />
        $obj_plugin_manager->add_plugins_functions($this->name,<br />
                                                   'after_show_tabs',<br />
                                                   'show_tab_links');<br />
    }<br />
<br />
    function create_trail_links() {<br />
        /* show the plugin's trail links */<br />
    }<br />
<br />
    function show_tab_links() {<br />
        /* show the plugin's tab links */<br />
    }<br />
}<br />
?><br />
</code><br />
<br />
<br />
So, at certain places in phpPgAdmin, e.g. the functions that create the browser tree, trail, or tabs, the plugins' functions will be called as below:<br />
<br />
<code><br />
<?php<br />
...<br />
$obj_plugin_manager->execute_plugins_functions('before_trail_creation');<br />
...<br />
$obj_plugin_manager->execute_plugins_functions('after_show_tabs');<br />
...<br />
?><br />
</code><br />
<br />
This way, the plugin tells the phpPgAdmin core which of its elements should be shown in the PPA components.<br />
<br />
To demonstrate, I created a [http://fit.faccat.br/~leonardo/gsoc2011/projeto_plugin_3.tar.gz basic functional example].<br />
<br />
<br />
== Project Schedule ==<br />
<br />
During the Google Summer of Code 2011, I will be on-line on the IRC (irc.freenode.net) channels #phppgadmin, #postgresql and #gsoc, available to talk. And, at least once a week, I will contact my mentor to report what was done, what I will do the following week, and anything preventing me from carrying out a certain activity. I will also write about my activities in my [http://sapiras.blogspot.com blog].<br />
<br />
If other project ideas are being developed at the same time, the developers of those applications will need to be contacted to provide their versions for testing. But that will be defined together with my mentor and the phpPgAdmin developers.<br />
<br />
A copy of this schedule can be found at [https://www.google.com/calendar/embed?src=j021kmt0up95b6iqfnpmbgav2o%40group.calendar.google.com&ctz=America/Sao_Paulo Google Calendar] (with some small changes).<br />
<br />
<br />
'''April 25 until May 22 (Community Bonding Period)''': During this period, I will be in contact with my mentor and the phpPgAdmin developers, explaining the accepted proposal in more detail. I will also review my project with them to make sure nothing is missing before I start coding;<br />
<br />
'''May 23''': I start coding, creating the new basic plugin structure that will be used as example for other developers;<br />
<br />
'''May 27''': Meeting with the mentor, when I will show him the basic plugin architecture, and start developing the Plugin Manager;<br />
<br />
'''June 3''': Meeting with the mentor. Start developing the integration between the basic plugin example and the tabs and top links;<br />
<br />
'''June 10''': Meeting with the mentor;<br />
<br />
'''June 17''': Meeting with the mentor. At this date, I will start developing the integration between plugins and the trail;<br />
<br />
'''June 24''': Meeting with the mentor. At this date, I will start working in the integration between plugins and the navigation links;<br />
<br />
'''July 1''': Meeting with the mentor;<br />
<br />
'''July 8''': Meeting with the mentor. Finish the integration with navigation links. Start developing the integration with the action buttons;<br />
<br />
'''July 11 until 15''': Creation and submission of the mid-term evaluation;<br />
<br />
'''July 15''': Meeting with the mentor;<br />
<br />
'''July 22''': Meeting with the mentor. Start integrating the plugins with the browser tree;<br />
<br />
'''July 29''': Meeting with the mentor;<br />
<br />
'''August 5''': Meeting with the mentor. Finish the integration with the browser tree. Start testing and making final modifications;<br />
<br />
'''August 12''': Meeting with the mentor. On this day, I will stop coding, and start creating the final documentation;<br />
<br />
'''August 19''': Meeting with the mentor. Review of the documentation;<br />
<br />
'''August 22''': Finish the documentation;<br />
<br />
'''August 22 until 26''': Submission of the final evaluation;<br />
<br />
== Completeness Criteria ==<br />
<br />
This project will be accomplished if the new plugin architecture:<br />
<br />
* Enables developers to create plugins easily, with clear documentation<br />
* Enables new plugins to add entries to the action buttons, browser tree, tabs, trail, navigation links, and top links<br />
* Enables plugins to avoid intrusive code in phpPgAdmin's core<br />
* Provides an easy way to enable and disable plugins</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=User:Schmiddy&diff=14032User:Schmiddy2011-04-17T18:19:37Z<p>Schmiddy: /* TODO Items / Things to Investigate */</p>
<hr />
<div>== PostgreSQL Notes and Misc ==<br />
<br />
=== TODO Items / Things to Investigate ===<br />
<br />
# <del>column-level UPDATE privs + LOCK TABLE: [http://archives.postgresql.org/pgsql-hackers/2010-10/msg01042.php patch]</del><br />
# <del>segfaults in openjade while building PDF of docs. Docs should be updated to tell users to avoid 1.4devel openjade.</del><br />
# Several users have complained about bookmarks disappearing/reappearing in the doc PDFs for 8.3 and 9.0. What's causing this?<br />
# <del>Why do psql's \z and friends not autocomplete tablenames?</del>[http://archives.postgresql.org/pgsql-hackers/2010-10/msg02002.php patch]<br />
# <del>On OS X only: segfault in psql's autocomplete, due to buggy readline library in OS X.</del><br />
# CREATE TABLE newschema.newtable (LIKE oldschema.oldtable INCLUDING INDEXES) -> on 8.3, renames indexes named "foo_idx" to "foo_key", though looks like this is fixed in later branches?<br />
# On OS X only: \dn pub[TAB] gives "public/" as the autocompleted version of schema "public"<br />
# Fix up html builds of docs to move towards XHTML compliance<br />
# PL/pgSQL function to handle atomic table swaps - useful for snapshot materialized view refreshes with no locks held on the original table until close to COMMIT time. <br />
# make distclean doesn't get rid of .html files created by "make html"?<br />
# ugliness in makefiles for doc builds<br />
# document steps to get doc builds working on OS X. Not easy at all.. having problems with gettext/getopt, which are dependencies of 'xmlto'.<br />
# have psql give an error message upon a badly formatted .pgpass file<br />
; Contact Info<br />
: [mailto:josh**at**kupershmidt.org Email: Josh Kupershmidt]<br />
: [http://kupershmidt.org Josh Kupershmidt's Homepage]</div>
# PL/pgSQL function to handle atomic table swaps - useful for snapshot materialized view refreshes with no locks held on the original table until close to COMMIT time. <br />
# make distclean doesn't get rid of .html files created by "make html"?<br />
# ugliness in makefiles for doc builds<br />
<br />
; Contact Info<br />
: [mailto:josh**at**kupershmidt.org Email: Josh Kupershmidt]<br />
: [http://kupershmidt.org Josh Kupershmidt's Homepage]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Warm_Standby&diff=12666Warm Standby2010-12-02T02:10:22Z<p>Schmiddy: Mention that archive_mode must be on for any of this to work. Make explicit whether we're talking about master or standby in a few places.</p>
<hr />
<div>__NOTOC__<br />
There are a couple of projects available to help you set up a warm standby system: <br />
<br />
* Use the walmgr.py portion of Skype's [https://developer.skype.com/SkypeGarage/DbProjects/SkyTools SkyTools] package which will handle PITR backups from a primary to a single slave<br />
* Utilize Command Prompt's [https://projects.commandprompt.com/public/pitrtools PITR tools] to set everything up<br />
<br />
Actually getting a warm standby up manually is a pretty simple process. The following are notes only and intended to help your understanding. If you want to get this working correctly then please follow the manual, which is comprehensive and accurately maintained.<br />
<br />
[http://www.postgresql.org/docs/current/static/warm-standby.html Warm Standby Manual]<br />
<br />
== Pre-process recommendations ==<br />
*Use [http://www.postgresql.org/docs/current/static/pgstandby.html pg_standby] for your restore_command in the recovery.conf file on the standby. pg_standby is included in PostgreSQL 8.3, and you can copy the source from there to compile it for 8.2 yourself. It isn't compatible with 8.1. <br />
*Set up your standby host's environment and directory structure exactly the same as your primary's. Otherwise you'll need to spend time changing any symlinks you've created on the primary for xlogs, tablespaces, or whatnot, which is really just an opportunity for error.<br />
*Pre-configure both the postgresql.conf and recovery.conf files for your standby. I usually keep all of my different config files for all of my different servers in a single, version-controlled directory that I can then check out and symlink to. Again, consistent environment & directory setups make symlinks your best friend.<br />
*Use ssh keys to transfer files between hosts simply and safely.<br />
*Follow all of the advice in the manual with respect to handling errors.<br />
<br />
== Outline of steps to get warm standby working ==<br />
* Make sure archive_mode is on in the master's postgresql.conf.<br />
* Set archive_command in the master's postgresql.conf. rsync is a popular choice, or you can just use one of the examples from the docs. I use: <br />
<code><pre><br />
rsync -a %p postgres@standbyhost:/path/to/wal_archive/%f<br />
</pre></code><br />
**You must use a command here that does atomic copies, meaning that the file will never appear under the destination filename until it has been completely copied over. This keeps the standby server from trying to read a partial file. rsync is known to work. A notable command that isn't atomic is scp. If you want to use scp for this purpose, you will need to transfer files into another directory on the secondary, then move them to where the restore command looks for them after the transfer is complete.<br />
***If you're using pg_standby, it will refuse to apply files unless they are the right length, which lowers the risk of non-atomic copies being applied. On Windows it even sleeps a bit after that to give time for things to settle. Performing the copy non-atomically is still a bad idea you should avoid.<br />
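The copy-then-rename pattern described above can be sketched as a small helper; the function name, arguments, and paths here are illustrative, not from the docs. The key point is that mv within a single filesystem is an atomic rename, so the final filename never appears until the data is complete:<br />

```shell
# Sketch of an atomic "copy then rename" archive helper (illustrative).
# While the copy is in flight, the file is only visible under a .tmp
# staging name; mv within one filesystem is an atomic rename, so the
# restore side never sees a partial file under the final name.
# Usage: archive_wal <source-file> <wal-file-name> <archive-dir>
archive_wal() {
    src="$1"; name="$2"; dest="$3"
    cp "$src" "$dest/$name.tmp" || return 1   # partial data only ever at .tmp name
    mv "$dest/$name.tmp" "$dest/$name"        # atomic rename into place
}
```

On the master you would invoke something like this from archive_command (with scp or rsync replacing the local cp when the archive directory lives on the standby); the staging name and the final name must be on the same filesystem for the rename to be atomic.<br />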
*Reload the master's config -- either: SELECT pg_reload_conf(); from psql or: pg_ctl reload -D data_dir/ . If you had to set archive_mode on, you'll have to restart your postgres server: pg_ctl restart -D data_dir/ .<br />
*Verify that the WALs are being shipped to their destination.<br />
*In psql, SELECT pg_start_backup('some_label');<br />
*Run your base backup. Again, rsync is good for this with something as simple as: <br />
<code><pre><br />
rsync -a --progress /path/to/data_dir/* postgres@standbyhost:/path/to/data_dir/ <br />
</pre></code><br />
*I'd suggest running this in a screen term window; the --progress flag will let you watch to see how far along the rsync is. The -a flag will preserve symlinks as well as all file permissions & ownership.<br />
*In psql, SELECT pg_stop_backup();<br />
**This drops a file to be archived that will have the same name as the first WAL shipped after the call to pg_start_backup() with a .backup suffix. Inside will be the start & stop WAL records defining the range of WAL files needed to be replayed before you can consider bringing the standby out of recovery.<br />
*Drop in, or symlink, your recovery.conf file in the standby's data_dir.<br />
**The restore command should use pg_standby (its help/README are simple and to the point). I'd recommend redirecting all output from pg_standby to a log file that you can then watch to verify that everything is working correctly once you've started things.<br />
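An illustrative recovery.conf line for the standby using pg_standby might look like the following; the trigger-file path, archive path, and log path are examples only, so check pg_standby's README for the exact options in your version:<br />
<code><pre><br />
# recovery.conf on the standby -- illustrative paths only<br />
restore_command = 'pg_standby -t /path/to/trigger_file /path/to/wal_archive %f %p %r 2>> /path/to/pg_standby.log'<br />
</pre></code><br />
Redirecting stderr as above gives you a log file to tail and verify that WAL segments are being found and applied.<br />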
*Drop in, or symlink, your standby's postgresql.conf file.<br />
**If you don't symlink your pg_xlog directory to write WALs to a separate drive, you can safely delete everything under data_dir/pg_xlog on the standby host.<br />
*Start the standby db server with a normal: pg_ctl start -D /path/to/data_dir/<br />
*Run a: tail -f on your standby log and watch to make sure that it's replaying logs. If everything's cool you'll see some info on each WAL file, in order, that the standby looks for along with 'success' messages. If it can't find the files for some reason, you'll see repeated messages like: 'WAL file not present yet. Checking for trigger file...' (assuming you set up pg_standby to look for a trigger file in your restore_command).<br />
<br />
Execute this entire process at least a couple of times, bringing up the standby into normal operations mode once it's played through all of the necessary WAL files (as noted in the .backup file), so that you can connect to it and verify that everything looks good, before doing all of this and leaving it running indefinitely. Once you do it a couple of times, it becomes dirt simple. <br />
<br />
== Adjusting frequency of WAL updates in 8.1 ==<br />
<br />
Often people want to know that their secondary is never more than some amount behind the primary. The archive_timeout feature introduced in 8.2 allows doing that. If you're using WAL replication with 8.1, you can force 16MB worth of WAL activity that doesn't leave any changes behind with a hack like this:<br />
<br />
<code><pre><br />
create table xlog_switch as<br />
select '0123456789ABCDE' from generate_series(1,1000000);<br />
drop table xlog_switch;<br />
</pre></code><br />
<br />
If you put that into cron etc. to run via psql, you can make the window for log shipping as fine as you'd like even with no activity. <br />
If you do it too often, though, you increase the odds it will interfere with real transactions, and it will use up more disk space; every couple of minutes is probably as often as you'd want to do this. Using archive_timeout doesn't have this issue; the manual suggests it can be set to only a few seconds if necessary.<br />
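Running the hack above from cron might look like this; the schedule and database name are placeholders:<br />
<code><pre><br />
# Illustrative crontab entry on the 8.1 master: force a WAL switch every 5 minutes<br />
*/5 * * * * psql -d mydb -c "create table xlog_switch as select '0123456789ABCDE' from generate_series(1,1000000); drop table xlog_switch;"<br />
</pre></code><br />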
<br />
== Additional resources ==<br />
*[http://www.kennygorman.com/wordpress/?p=249 pg_standby lag monitoring]<br />
*[http://scale-out-blog.blogspot.com/2009/02/simple-ha-with-postgresql-point-in-time.html Simple HA with PITR]<br />
*[http://www.travishegner.com/2009/06/postgresql-83-warm-stand-by-replication.html PostgreSQL 8.3 Warm Stand-by Replication]: tutorial with Ubuntu specifics<br />
*[http://michsan.blogspot.com/2008/08/using-pgstandby-for-high-availability.html Using pg_standby for high availability of Postgresql]: tutorial that covers Debian, using 8.3 pg_standby on 8.2<br />
*Source material: <br />
** [http://archives.postgresql.org/pgsql-general/2008-01/msg01587.php warm standby examples] <br />
** [http://archives.postgresql.org/sydpug/2006-10/msg00001.php Creating an 8.2 warm-standby demo system]<br />
** [http://archives.postgresql.org/pgsql-general/2007-06/msg00015.php PITR Base Backup on an idle 8.1 server]<br />
<br />
[[Category:Replication]][[Category:Backup]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=User:Schmiddy&diff=12334User:Schmiddy2010-10-29T19:12:21Z<p>Schmiddy: /* PostgreSQL Notes and Misc */</p>
<hr />
<div>== PostgreSQL Notes and Misc ==<br />
<br />
=== TODO Items / Things to Investigate ===<br />
<br />
# CREATE TABLE newschema.newtable (LIKE oldschema.oldtable INCLUDING INDEXES) -> on 8.3, renames indexes named "foo_idx" to "foo_key", though looks like this is fixed in later branches?<br />
# <del>column-level UPDATE privs + LOCK TABLE: [http://archives.postgresql.org/pgsql-hackers/2010-10/msg01042.php patch]</del><br />
# segfaults in openjade while building PDF of docs<br />
# <del>Why do psql's \z and friends not autocomplete tablenames?</del>[http://archives.postgresql.org/pgsql-hackers/2010-10/msg02002.php patch]<br />
# On OS X only: \dn pub[TAB] gives "public/" as the autocompleted version of schema "public"<br />
# On OS X only: segfault in psql's autocomplete, due to buggy readline library in OS X.<br />
# Fix up html builds of docs to move towards XHTML compliance<br />
# PL/pgSQL function to handle atomic table swaps - useful for snapshot materialized view refreshes with no locks held on the original table until close to COMMIT time. <br />
# make distclean doesn't get rid of .html files created by "make html"?<br />
<br />
<br />
; Contact Info<br />
: [mailto:josh**at**kupershmidt.org Email: Josh Kupershmidt]<br />
: [http://kupershmidt.org Josh Kupershmidt's Homepage]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=User:Schmiddy&diff=12278User:Schmiddy2010-10-17T03:39:23Z<p>Schmiddy: /* TODO Items / Things to Investigate */</p>
<hr />
<div>== PostgreSQL Notes and Misc ==<br />
<br />
=== TODO Items / Things to Investigate ===<br />
<br />
# CREATE TABLE newschema.newtable (LIKE oldschema.oldtable INCLUDING INDEXES) -> on 8.3, renames indexes named "foo_idx" to "foo_key", though looks like this is fixed in later branches?<br />
# <del>column-level UPDATE privs + LOCK TABLE: [http://archives.postgresql.org/pgsql-hackers/2010-10/msg01042.php patch]</del><br />
# segfaults in openjade while building PDF of docs<br />
# Why do psql's \z and friends not autocomplete tablenames?<br />
# On OS X only: \dn pub[TAB] gives "public/" as the autocompleted version of schema "public"<br />
# On OS X only: segfault in psql's autocomplete, due to buggy readline library in OS X.<br />
# Fix up html builds of docs to move towards XHTML compliance<br />
# PL/pgSQL function to handle atomic table swaps - useful for snapshot materialized view refreshes with no locks held on the original table until close to COMMIT time. <br />
# make distclean doesn't get rid of .html files created by "make html"?<br />
<br />
<br />
; Contact Info<br />
: [mailto:josh**at**kupershmidt.org Email: Josh Kupershmidt]<br />
: [http://kupershmidt.org Josh Kupershmidt's Homepage]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=User:Schmiddy&diff=12277User:Schmiddy2010-10-17T01:00:53Z<p>Schmiddy: /* Josh Kupershmidt */</p>
<hr />
<div>== PostgreSQL Notes and Misc ==<br />
<br />
=== TODO Items / Things to Investigate ===<br />
<br />
# CREATE TABLE newschema.newtable (LIKE oldschema.oldtable INCLUDING INDEXES) -> on 8.3, renames indexes named "foo_idx" to "foo_key", though looks like this is fixed in later branches?<br />
# <del>column-level UPDATE privs + LOCK TABLE: [http://archives.postgresql.org/pgsql-hackers/2010-10/msg01042.php patch]</del><br />
# segfaults in openjade while building PDF of docs<br />
# Why do psql's \z and friends not autocomplete tablenames?<br />
# On OS X only: \dn pub[TAB] gives "public/" as the autocompleted version of schema "public"<br />
# On OS X only: segfault in psql's autocomplete, due to buggy readline library in OS X.<br />
# Fix up html builds of docs to move towards XHTML compliance<br />
# PL/pgSQL function to handle atomic table swaps - useful for snapshot materialized view refreshes with no locks held on the original table until close to COMMIT time. <br />
<br />
<br />
<br />
; Contact Info<br />
: [mailto:josh**at**kupershmidt.org Email: Josh Kupershmidt]<br />
: [http://kupershmidt.org Josh Kupershmidt's Homepage]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Todo&diff=12060Todo2010-09-26T17:45:13Z<p>Schmiddy: /* CLUSTER */ add thread link to "clustering system catalog indexes"</p>
<hr />
<div><div style="margin: 1ex 1em; float: right;"><br />
__TOC__<br />
</div><br />
<br />
This list contains '''all known PostgreSQL bugs and feature requests'''. If you would like to work on an item, please read the [[Developer FAQ]] first. There is also a [[Development_information|development information page]].<br />
<br />
* {{TodoPending}} - marks ordinary, incomplete items<br />
* {{TodoEasy}} - marks items that are easier to implement<br />
* {{TodoDone}} - marks changes that are done, and will appear in the PostgreSQL 9.1 release.<br />
<br />
For help on editing this list, please see [[Talk:Todo]]. <b>Please do not add items here without discussion on the mailing list.</b><br />
<br />
<div style="padding: 1ex 4em;"><br />
== Administration ==<br />
<br />
{{TodoItem<br />
|Allow administrators to cancel multi-statement idle transactions<br />
|This allows locks to be released, but it is complex to report the cancellation back to the client.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01340.php <nowiki>Cancelling idle in transaction state</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00441.php <nowiki>Re: Cancelling idle in transaction state</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Check for unreferenced table files created by transactions that were in-progress when the server terminated abruptly<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00096.php <nowiki>Removing unreferenced files</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Set proper permissions on non-system schemas during db creation<br />
|Currently all schemas are owned by the super-user because they are copied from the template1 database. However, since all objects are inherited from the template database, it is not clear that setting schemas to the db owner is correct.}}<br />
<br />
{{TodoItem<br />
|Allow log_min_messages to be specified on a per-module basis<br />
|This would allow administrators to see more detailed information from specific sections of the backend, e.g. checkpoints, autovacuum, etc. Another idea is to allow separate configuration files for each module, or allow arbitrary SET commands to be passed to them. See also [[Logging Brainstorm]].}}<br />
<br />
{{TodoItem<br />
|Simplify ability to create partitioned tables<br />
|This would allow creation of partitioned tables without requiring creation of triggers or rules for INSERT/UPDATE/DELETE, and constraints for rapid partition selection. Options could include range and hash partition selection. See also [[Table partitioning]]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow auto-selection of partitioned tables for min/max() operations<br />
|There was a patch on -hackers from July 2009, but it has not been merged: [http://archives.postgresql.org/pgsql-hackers/2009-07/msg01115.php <nowiki>MIN/MAX optimization for partitioned table</nowiki>]}}<br />
<br />
{{TodoItem<br />
|Allow custom variables to appear in pg_settings()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00850.php <nowiki>Re: count(*) performance improvement ideas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have custom variables be transaction safe<br />
* {{MessageLink|4B577E9F.8000505@dunslane.net|Custom GUCs still a bit broken}}<br />
}}<br />
<br />
{{TodoItem<br />
|Implement the SQL standard mechanism whereby REVOKE ROLE revokes only the privilege granted by the invoking role, and not those granted by other roles<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-05/msg00010.php <nowiki>Re: Grantor name gets lost when grantor role dropped</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve server security options<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01875.php <nowiki>Re: [0/4] Proposal of SE-PostgreSQL patches</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00000.php <nowiki>Re: [0/4] Proposal of SE-PostgreSQL patches</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent query cancel packets from being replayed by an attacker, especially when using SSL<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg00345.php <nowiki>Replay attack of query cancel</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide a way to query the log collector subprocess to determine what the currently active log file is<br />
* [http://archives.postgresql.org/pgsql-general/2008-11/msg00418.php <nowiki>Current log files when rotating?</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow the client to authenticate the server in a Unix-domain socket connection, e.g., using SO_PEERCRED<br />
* http://archives.postgresql.org/message-id/20090401173756.GB21229@svana.org<br />
}}<br />
<br />
{{TodoItem<br />
|Allow custom daemons to be automatically stopped/started along postmaster<br />
|This allows easier administration of daemons like user job schedulers or replication-related daemons.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01701.php <nowiki>Re: scheduler in core</nowiki>]<br />
}}<br />
<br />
=== Configuration files ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow pg_hba.conf to specify host names along with IP addresses<br />
|Host name lookup could occur when the postmaster reads the pg_hba.conf file, or when the backend starts. Another solution would be to reverse lookup the connection IP and check that hostname against the host names in pg_hba.conf. We could also then check that the host name maps to the IP address. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00569.php <nowiki>TODO Item: Allow pg_hba.conf to specify host names along with IP addresses</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00613.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow postgresql.conf file values to be changed via an SQL API, perhaps using SET GLOBAL}}<br />
<br />
{{TodoItem<br />
|Allow the server to be stopped/restarted via an SQL API}}<br />
<br />
{{TodoItem<br />
|Consider normalizing fractions in postgresql.conf, perhaps using '%'<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00550.php <nowiki>Fractions in GUC variables</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow Kerberos to disable stripping of realms so we can check the username@realm against multiple realms<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00009.php <nowiki>krb_match_realm patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add functions to check correctness of configuration files before they are loaded "live"}}<br />
<br />
{{TodoItem<br />
|Improve LDAP authentication configuration options<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01745.php <nowiki>Proposed Patch - LDAPS support for servers on port 636 w/o TLS</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add external tool to auto-tune some postgresql.conf parameters<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00000.php <nowiki>Re: Overhauling GUCS</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00033.php <nowiki>Simple postgresql.conf wizard</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add 'hostgss' pg_hba.conf option to allow GSS link-level encryption<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg01454.php <nowiki>Re: Plans for 8.4</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Process pg_hba.conf keywords as case-insensitive<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00432.php <nowiki>More robust pg_hba.conf parsing/error logging</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Tablespaces ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow a database in tablespace t1 with tables created in tablespace t2 to be used as a template for a new database created with default tablespace t2<br />
|Currently all objects in the default database tablespace must have default tablespace specifications. This is because new databases are created by copying directories. If you mix default tablespace tables and tablespace-specified tables in the same directory, creating a new database from such a mixed directory would create a new database with tables that had incorrect explicit tablespaces. To fix this would require modifying pg_class in the newly copied database, which we don't currently do.}}<br />
<br />
{{TodoItem<br />
|Allow reporting of which objects are in which tablespaces<br />
|This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.}}<br />
<br />
{{TodoItem<br />
|Allow WAL replay of CREATE TABLESPACE to work when the directory structure on the recovery computer is different from the original}}<br />
<br />
{{TodoItem<br />
|Allow per-tablespace quotas}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Statistics Collector ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow statistics last vacuum/analyze execution times to be displayed without requiring track_counts to be enabled<br />
* [http://archives.postgresql.org/pgsql-docs/2007-04/msg00028.php <nowiki>row-level stats and last analyze time</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Clear table counters on TRUNCATE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00169.php <nowiki>Small TRUNCATE glitch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Allow the clearing of cluster-level statistics<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg00917.php <nowiki>Resetting cluster-wide statistics</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Point-In-Time Recovery (PITR) ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|Create dump tool for write-ahead logs for use in determining transaction id for point-in-time recovery<br />
|This is useful for checking PITR recovery.}}<br />
<br />
{{TodoItem<br />
|Allow recovery.conf to support the same syntax as postgresql.conf, including quoting<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00497.php <nowiki>recovery.conf parsing problems</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg00684.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow archive_mode to be changed without server restart?<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01655.php <nowiki>Enabling archive_mode without restart</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider avoiding WAL switching via archive_timeout if there has been no database activity<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01469.php <nowiki>archive_timeout behavior for no activity</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00395.php <nowiki>Re: archive_timeout behavior for no activity</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|[http://archives.postgresql.org/message-id/4B901D73.8030003@agliodbs.com Expose pg_controldata via SQL interface]<br />
|Helpful for monitoring replicated databases; [http://archives.postgresql.org/message-id/4B959D7A.6010907@joeconway.com initial patch]}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== SSL ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow SSL authentication/encryption over unix domain sockets<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00924.php <nowiki>Re: Spoofing as the postmaster</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow SSL key file permission checks to be optionally disabled when sharing SSL keys with other applications<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00069.php <nowiki>BUG #3809: SSL &quot;unsafe&quot; private key permissions bug</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow SSL CRL files to be re-read during configuration file reload, rather than requiring a server restart<br />
|Unlike SSL CRT files, CRL (Certificate Revocation List) files are updated frequently<br />
* [http://archives.postgresql.org/pgsql-general/2008-12/msg00832.php <nowiki>Automatic CRL reload</nowiki>]<br />
Alternatively or additionally, supporting OCSP (Online Certificate Status Protocol) would provide real-time revocation discovery without reloading<br />
}}<br />
<br />
{{TodoItem<br />
| Allow automatic selection of SSL client certificates from a certificate store<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00406.php <nowiki>Allow multiple certificates or keys in the postgresql.crt/.key files</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Send full certificate server chain to client<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-12/msg00145.php BUG #5245: Full Server Certificate Chain Not Sent to client]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Standby server mode ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
| Allow pg_xlogfile_name() to be used in recovery mode<br />
* [http://archives.postgresql.org/message-id/3f0b79eb1001190135vd9f62f1sa7868abc1ea61d12@mail.gmail.com <nowiki>Streaming replication and pg_xlogfile_name()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Fix things so that any such variables inherited from the server environment are intentionally *NOT* used for making SR connections.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01011.php <nowiki>Re: Parameter name standby_mode</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Add a new privilege for connecting for streaming replication<br />
* [http://archives.postgresql.org/message-id/3f0b79eb1003040247p6b092241of91784a505e9abd8@mail.gmail.com <nowiki>Streaming replication and privilege</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Add support for synchronous replication.<br />
}}<br />
<br />
{{TodoItem<br />
| Add capability to take and send a base backup over the streaming replication connection, making it possible to initialize a new standby server from a running primary server without a WAL archive or other access to the primary server. <br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00136.php<br />
}}<br />
<br />
{{TodoItem<br />
| Find a way to do hot file system backups on standby servers<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01727.php<br />
}}<br />
<br />
{{TodoItem<br />
| Change walsender so that it applies per-role settings<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00642.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Data Types ==<br />
<br />
{{TodoItem<br />
|Change NUMERIC to enforce the maximum precision}}<br />
<br />
{{TodoItemDone<br />
|Reduce storage space for small NUMERICs<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg01331.php <nowiki>Saving space for common kinds of numeric values</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-02/msg00505.php <nowiki>Numeric patch to add special-case representations for &lt; 8 bytes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php <nowiki>Re: Reducing NUMERIC size for 8.3</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix data types where equality comparison isn't intuitive, e.g. box}}<br />
<br />
{{TodoItem<br />
|Add support for public SYNONYMs<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00519.php <nowiki>Proposal for SYNONYMS</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add support for SQL-standard GENERATED/IDENTITY columns<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-07/msg00543.php <nowiki>Re: Three weeks left until feature freeze</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg00038.php <nowiki>GENERATED ... AS IDENTITY, Was: Re: Feature Freeze</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00344.php <nowiki>Behavior of GENERATED columns per SQL2003</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00076.php <nowiki>Re: [HACKERS] Behavior of GENERATED columns per SQL2003</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00604.php <nowiki>IDENTITY/GENERATED patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider placing all sequences in a single table, or create a system view<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a special data type for regular expressions<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg01067.php <nowiki>Why is there a tsquery data type?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce BIT data type overhead using short varlena headers<br />
* [http://archives.postgresql.org/pgsql-general/2007-12/msg00273.php <nowiki>storage size of &quot;bit&quot; data type..</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow adding/renaming/removing enumerated values to an existing enumerated data type<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01718.php <nowiki>Re: [COMMITTERS] pgsql: Update: &lt; * Allow adding enumerated values to an existing</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support scoped IPv6 addresses in inet type<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-05/msg00111.php <nowiki>strange problem with ip6</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add JSON (JavaScript Object Notation) data type<br />
|This would behave similar to the XML data type, which is stored as text, but allows element lookup and conversion functions.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01494.php <nowiki>PATCH: Add hstore_to_json()</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg00001.php <nowiki>Re: PATCH: Add hstore_to_json()</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-03/msg01092.php <nowiki>Proposal: Add JSON support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00057.php <nowiki>Re: Proposal: Add JSON support</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider improving performance of computing CHAR() value lengths<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00900.php <nowiki>char() overhead on read-only workloads not so insignifcant as the docs claim it is...</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01787.php <nowiki>Re: [PATCH] backend: compare word-at-a-time in bcTruelen</nowiki>]<br />
}}<br />
<br />
=== Domains ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Fix CREATE CAST on DOMAINs<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-05/msg00072.php <nowiki>bug? non working casts for domain</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-09/msg01681.php <nowiki>TODO: Fix CREATE CAST on DOMAINs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow domains to be cast<br />
* [http://archives.postgresql.org/pgsql-hackers/2003-06/msg01206.php <nowiki>Domain casting still doesn't work right</nowiki>] <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00289.php <nowiki>domain casting?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Make domains work better with polymorphic functions<br />
* [http://archives.postgresql.org/message-id/4887.1228700773@sss.pgh.pa.us Polymorphic types vs. domains]<br />
* [http://archives.postgresql.org/message-id/15535.1238774571@sss.pgh.pa.us some difficulties with fixing it]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Dates and Times ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow infinite intervals just like infinite timestamps}}<br />
<br />
{{TodoItem<br />
|Allow TIMESTAMP WITH TIME ZONE to store the original timezone information, either zone name or offset from UTC<br />
|If the TIMESTAMP value is stored with a time zone name, interval computations should adjust based on the time zone rules. <br />
* [http://archives.postgresql.org/pgsql-hackers/2004-10/msg00705.php <nowiki>timestamp with time zone a la sql99</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix SELECT '0.01 years'::interval, '0.01 months'::interval}}<br />
<br />
{{TodoItem<br />
|Have timestamp subtraction not call justify_hours()?<br />
* [http://archives.postgresql.org/pgsql-sql/2006-10/msg00059.php <nowiki>timestamp subtraction (was Re: formatting intervals with to_char)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve timestamptz subtraction to be DST-aware<br />
|Currently, subtracting one date from another across a daylight saving time adjustment can return '1 day 1 hour', but adding that result back to the first date yields a time one hour in the future. This happens because '25 hours' is adjusted to '1 day 1 hour', and '1 day' means the same clock time on the next day, even when daylight saving adjustments intervene.}}<br />
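The anomaly above can be sketched in a hypothetical session (the dates and the America/New_York zone are illustrative; exact results depend on the server's timezone data and version):<br />

```sql
SET timezone = 'America/New_York';
-- Subtraction across the fall-back transition of 2010-11-07:
SELECT timestamptz '2010-11-08 00:00' - timestamptz '2010-11-06 23:00';
-- The raw hour count is folded by justify_hours() into days plus hours,
-- but adding that interval back need not reproduce the original endpoint:
SELECT timestamptz '2010-11-06 23:00'
       + (timestamptz '2010-11-08 00:00' - timestamptz '2010-11-06 23:00');
-- '1 day' in the interval means "same clock time the next day", so the
-- sum can land an hour away from '2010-11-08 00:00'.
```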
<br />
{{TodoItem<br />
|Fix interval display to support values exceeding 2^31 hours}}<br />
<br />
{{TodoItem<br />
|Add overflow checking to timestamp and interval arithmetic}}<br />
<br />
{{TodoItem<br />
|Add function to allow the creation of timestamps using parameters<br />
* http://archives.postgresql.org/pgsql-performance/2010-06/msg00232.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Arrays ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add support for arrays of domains<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00114.php <nowiki>Re: updated WIP: arrays of composites</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow single-byte header storage for array elements}}<br />
<br />
{{TodoItem<br />
|Add function to detect if an array is empty<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00475.php <nowiki>Re: array_length()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of empty arrays<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01033.php <nowiki>So what's an &quot;empty&quot; array anyway?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of NULLs in arrays<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-11/msg00009.php <nowiki>BUG #4509: array_cat's null behaviour is inconsistent</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Binary Data ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Improve vacuum of large objects, like contrib/vacuumlo?}}<br />
<br />
{{TodoItem<br />
|Auto-delete large objects when referencing row is deleted<br />
|contrib/lo offers this functionality.}}<br />
<br />
{{TodoItem<br />
|Allow read/write into TOAST values like large objects<br />
|This requires the TOAST column to be stored EXTERNAL.}}<br />
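As a rough sketch of the intended access pattern (table and column names are hypothetical): with EXTERNAL storage, substring reads can already fetch only the needed TOAST chunks, and the TODO is about generalizing this to large-object-style read/write access:<br />

```sql
-- EXTERNAL keeps the value out of line and uncompressed, which lets
-- substring operations fetch only the TOAST chunks they need:
ALTER TABLE docs ALTER COLUMN body SET STORAGE EXTERNAL;
SELECT substr(body, 1, 100) FROM docs WHERE id = 1;
```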
<br />
{{TodoItem<br />
|Add API for 64-bit large object access<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00781.php <nowiki>64-bit API for large objects</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== MONEY Data Type ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add locale-aware MONEY type, and support multiple currencies<br />
* [http://archives.postgresql.org/pgsql-general/2005-08/msg01432.php <nowiki>A real currency type</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01181.php <nowiki>Money type todos?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|MONEY dumps in a locale-specific format, making it difficult to restore to a system with a different locale}}<br />
<br />
{{TodoItem<br />
|Allow MONEY to be easily cast to/from other numeric data types}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Text Search ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow dictionaries to change the token that is passed on to later dictionaries<br />
* [http://archives.postgresql.org/pgsql-patches/2007-11/msg00081.php <nowiki>a tsearch2 (8.2.4) dictionary that only filters out stopwords</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a function-based API for '@@' searches<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00511.php <nowiki>Simplifying Text Search</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve text search error messages<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00966.php <nowiki>Poorly designed tsearch NOTICEs</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg01146.php <nowiki>Re: Poorly designed tsearch NOTICEs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider changing error to warning for strings larger than one megabyte<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-02/msg00190.php <nowiki>BUG #3975: tsearch2 index should not bomb out of 1Mb limit</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-03/msg00062.php <nowiki>Re: [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|tsearch and tsdicts regression tests fail in Turkish locale on glibc<br />
* [http://archives.postgresql.org/message-id/49749645.5070801@gmx.net tsearch with Turkish locale]<br />
}}<br />
<br />
{{TodoItem<br />
|tsquery negator operator treated as part of lexeme<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-06/msg00346.php BUG #4887: inclusion operator (@>) on tsqeries behaves not conforming to documentation]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== XML ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow xml arrays to be cast to other data types<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00981.php <nowiki>proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00231.php <nowiki>Re: proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00471.php <nowiki>Re: proposal casting from XML[] to int[], numeric[], text[]</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add XML Schema validation and xmlvalidate function (SQL:2008)}}<br />
<br />
{{TodoItem<br />
|Add xmlvalidatedtd variant to support validating against a DTD?}}<br />
<br />
{{TodoItem<br />
|Relax-NG validation; libxml2 supports this already}}<br />
<br />
{{TodoItem<br />
|Make it work reliably for non-UTF8 server encodings (xpath() in particular is known not to work)<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-01/msg00135.php <nowiki>BUG #4622: xpath only work in utf-8 server encoding</nowiki>] <br />
* http://archives.postgresql.org/message-id/4110.1238973350@sss.pgh.pa.us}}<br />
<br />
{{TodoItem<br />
|Extra functions from SQL:2006: XMLDOCUMENT, XMLCAST, XMLTEXT}}<br />
<br />
{{TodoItem<br />
|XMLNAMESPACES support in XMLELEMENT and elsewhere}}<br />
<br />
{{TodoItem<br />
|XSLT support; already available in contrib/xml2, but needs API fixes and adaptation to xml type.}}<br />
<br />
{{TodoItem<br />
|XML Canonical: Convert XML documents to canonical form to compare them. libxml2 has support for this.}}<br />
<br />
{{TodoItem<br />
|Pretty-printing XML: Parse a document and serialize it back in some indented form. libxml2 might support this.}}<br />
<br />
{{TodoItem<br />
|XMLQUERY (from SQL/XML standard)}}<br />
<br />
{{TodoItem<br />
|In some cases shredding could be a better option (if there is no need to keep XML documents in their entirety, if we have already developed tools that understand only relational data, etc.) -- it would be a separate module that implements the annotated schema decomposition technique, similar to DB2 and SQL Server functionality.}}<br />
<br />
{{TodoItem<br />
| Nested or repeated xpath() apparently mess up namespaces [http://archives.postgresql.org/pgsql-bugs/2008-03/msg00097.php] [http://archives.postgresql.org/pgsql-bugs/2008-03/msg00144.php] [http://archives.postgresql.org/pgsql-general/2008-03/msg00295.php] [http://archives.postgresql.org/pgsql-bugs/2008-07/msg00054.php] [http://archives.postgresql.org/message-id/004f01c90e91$138e9d10$3aabd730$@anstett@iaas.uni-stuttgart.de]}}<br />
<br />
{{TodoItem<br />
|XPath: Adding the <x> at the root causes problems [http://archives.postgresql.org/pgsql-bugs/2008-05/msg00184.php] [http://archives.postgresql.org/pgsql-bugs/2008-07/msg00054.php] [http://archives.postgresql.org/pgsql-general/2008-07/msg00613.php]}}<br />
<br />
{{TodoItem<br />
|xpath_table needs to be implemented/implementable to get rid of contrib/xml2 [http://archives.postgresql.org/pgsql-general/2008-05/msg00823.php]}}<br />
<br />
{{TodoItem<br />
|xpath_table is pretty broken anyway [http://archives.postgresql.org/pgsql-hackers/2010-02/msg02424.php]}}<br />
<br />
{{TodoItem<br />
|better handling of XPath data types [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00616.php] [http://archives.postgresql.org/message-id/004a01c90e90$4b986d90$e2c948b0$@anstett@iaas.uni-stuttgart.de]}}<br />
<br />
{{TodoItemDone<br />
|xpath_exists() is needed. It checks whether or not the specified path exists in the XML value. (Without this function we need to use the awkward "array_dims(xpath(...)) IS NOT NULL" syntax.)}}<br />
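For illustration (table and document structure are hypothetical), the workaround and the new function compare as follows:<br />

```sql
-- Old workaround: treat a non-empty xpath() result array as "exists":
SELECT * FROM t WHERE array_dims(xpath('/root/item', doc)) IS NOT NULL;
-- With xpath_exists():
SELECT * FROM t WHERE xpath_exists('/root/item', doc);
```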
<br />
{{TodoItem<br />
|better handling of PIs and DTDs in xmlconcat() [http://archives.postgresql.org/message-id/200904211211.n3LCB09p008988@wwwmaster.postgresql.org]}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Functions ==<br />
<br />
{{TodoItem<br />
|Allow INET subnet tests using non-constants to be indexed}}<br />
<br />
{{TodoItem<br />
|Add INET overlaps operator, for use by exclusion constraints <br />
* http://archives.postgresql.org/pgsql-hackers/2010-03/msg00845.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow to_date() and to_timestamp() to accept localized month names}}<br />
<br />
{{TodoItem<br />
|Add missing parameter handling in to_char()<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg00948.php <nowiki>Re: to_char and i18n</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Throw an error from to_char() instead of printing a string of "#" when a number doesn't fit in the desired output format.<br />
* discussed in [http://archives.postgresql.org/message-id/37ed240d0907290836w42187222n18664dfcbcb445b1@mail.gmail.com "to_char, support for EEEE format"]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow to_char() on interval values to accumulate the highest unit requested<br />
|2= Some special format flag would be required to request such accumulation. Such functionality could also be added to EXTRACT. Prevent accumulation that crosses the month/day boundary because of the uneven number of days in a month.<br />
* to_char(INTERVAL '1 hour 5 minutes', 'MI') =&gt; 65<br />
* to_char(INTERVAL '43 hours 20 minutes', 'MI' ) =&gt; 2600<br />
* to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') =&gt; 0:1:19:20<br />
* to_char(INTERVAL '3 years 5 months','MM') =&gt; 41}}<br />
<br />
{{TodoItem<br />
|Allow SQL-language functions to reference parameters by parameter name<br />
|Currently SQL-language functions can only refer to dollar parameters, e.g. $1}}<br />
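A minimal sketch of the limitation (function and parameter names are illustrative):<br />

```sql
-- Today the body must use positional references:
CREATE FUNCTION add_tax(price numeric, rate numeric) RETURNS numeric AS
  'SELECT $1 * (1 + $2)' LANGUAGE sql;
-- Desired: allow the body to be written as
--   'SELECT price * (1 + rate)'
```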
<br />
{{TodoItem<br />
|Add SPI_gettypmod() to return the typemod for a TupleDesc}}<br />
<br />
{{TodoItem<br />
|Enforce typmod for function inputs, function results and parameters for spi_prepare'd statements called from PLs<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg01403.php <nowiki>Re: BUG #2917: spi_prepare doesn't accept typename aliases</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg01160.php <nowiki>RFC for adding typmods to functions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow holdable cursors in SPI}}<br />
<br />
{{TodoItem<br />
|Tighten function permission checks<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00568.php <nowiki>Re: Security leak with trigger functions?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix IS OF so it matches the ISO specification, and add documentation<br />
* [http://archives.postgresql.org/pgsql-patches/2003-08/msg00060.php <nowiki>Re: [HACKERS] IS OF</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00060.php <nowiki>ToDo: add documentation for operator IS OF</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add overlaps geometric operators that ignore point overlaps<br />
* http://archives.postgresql.org/pgsql-hackers/2010-03/msg00861.php<br />
}}<br />
<br />
{{TodoItem<br />
|Implement Boyer-Moore searching in LIKE queries<br />
* {{messageLink|27645.1220635769@sss.pgh.pa.us|TODO item: Implement Boyer-Moore searching (First time hacker)}}<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent malicious functions from being executed with the permissions of unsuspecting users<br />
|Index functions are safe, so VACUUM and ANALYZE are safe too. Triggers, CHECK and DEFAULT expressions, and rules are still vulnerable. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00268.php <nowiki>Some notes about the index-functions security vulnerability</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce memory usage of aggregates in set returning functions<br />
* [http://archives.postgresql.org/pgsql-performance/2008-01/msg00031.php <nowiki>Re: Performance of aggregates over set-returning functions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix /contrib/ltree operator<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-11/msg00044.php <nowiki>BUG #3720: wrong results at using ltree</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|<nowiki>Fix inconsistent precedence of =, &gt;, and &lt; compared to &lt;&gt;, &gt;=, and &lt;=</nowiki><br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00145.php <nowiki>BUG #3822: Nonstandard precedence for comparison operators</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix regular expression bug when using complex back-references<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00000.php <nowiki>BUG #3645: regular expression back references seem broken</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have /contrib/dblink reuse unnamed connections<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00895.php <nowiki>dblink un-named connection doesn't get re-used</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve formatting of pg_get_viewdef() output<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg01648.php <nowiki>pg_get_viewdef formattiing</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01885.php <nowiki>Re: pretty print viewdefs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add printf()-like functionality<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00367.php <nowiki>RfD: more powerful &quot;any&quot; types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix to_number() handling for values not matching the format string<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg01447.php <nowiki>Re: numeric_to_number() function skipping some digits</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add function to dump pg_depend information cleanly<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00226.php <nowiki>Elementary dependency look-up</nowiki>]<br />
}}<br />
<br />
== Multi-Language Support ==<br />
<br />
{{TodoItem<br />
|Add NCHAR (as distinguished from ordinary varchar)}}<br />
<br />
{{TodoItem<br />
|Allow more fine-grained collation selection; add CREATE COLLATION.<br />
|Right now the collation is fixed at database creation time.<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-03/msg00932.php <nowiki>Re: Patch for collation using ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2005-08/msg00039.php <nowiki>FW: Win32 unicode vs ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2005-08/msg00309.php <nowiki>Re: FW: Win32 unicode vs ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00110.php <nowiki>Proof of concept COLLATE support with patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2005-09/msg00020.php <nowiki>For review: Initial support for COLLATE</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg01121.php <nowiki>Proposed COLLATE implementation</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-01/msg00767.php <nowiki>TODO item: locale per database patch (new iteration)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-03/msg00233.php <nowiki>Re: FW: Win32 unicode vs ICU</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-09/msg00662.php <nowiki>Re: Fixed length data types issue</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00557.php <nowiki>[WIP] collation support revisited (phase 1)</nowiki>]<br />
* [[Todo:Collate]]<br />
* [[Todo:ICU]]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg01362.php <nowiki>WIP patch: Collation support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg00012.php <nowiki>Re: WIP patch: Collation support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00868.php <nowiki>PGDay.it collation discussion notes</nowiki>]<br />
* [http://www.unicode.org/unicode/reports/tr10/ Unicode collation algorithm]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a LOCALE option to CREATE DATABASE, as a shorthand<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00119.php <nowiki> Re: 8.4 open items list</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support multiple simultaneous character sets, per SQL:2008}}<br />
<br />
{{TodoItem<br />
|Improve UTF8 combined character handling?}}<br />
<br />
{{TodoItem<br />
|Add octet_length_server() and octet_length_client()}}<br />
<br />
{{TodoItem<br />
|Make octet_length_client() the same as octet_length()?}}<br />
<br />
{{TodoItem<br />
|Fix problems with wrong runtime encoding conversion for NLS message files}}<br />
<br />
{{TodoItem<br />
|Add URL to more complete multi-byte regression tests<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-07/msg00272.php <nowiki>Multi-byte and client side character encoding tests for copy command..</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix contrib/fuzzystrmatch to work with multibyte encodings<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-04/msg00047.php <nowiki> soundex function returns UTF-16 characters</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00138.php <nowiki> dmetaphone woes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Set client encoding based on the client operating system encoding<br />
|Currently client_encoding is set in postgresql.conf, which defaults to the server encoding. <br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg01696.php <nowiki>Re: [GENERAL] invalid byte sequence ?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Change memory allocation for multi-byte functions so memory is allocated inside conversion functions<br />
|Currently we preallocate memory based on worst-case usage.}}<br />
<br />
{{TodoItem<br />
|Add ability to use case-insensitive regular expressions on multi-byte characters<br />
|ILIKE already works with multi-byte characters<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00433.php <nowiki>Regexps vs. locale</nowiki>]<br />
* {{MessageLink|20091201210024.B1393753FB7@cvs.postgresql.org|A partial solution for UTF-8}}<br />
}}<br />
<br />
{{TodoItem<br />
|Improve encoding of connection startup messages sent to the client<br />
|Currently some authentication error messages are sent in the server encoding<br />
* [http://archives.postgresql.org/pgsql-general/2008-12/msg00801.php <nowiki>encoding of PostgreSQL messages</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-general/2009-01/msg00005.php <nowiki>Re: encoding of PostgreSQL messages</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have pg_stat_activity display query strings in the correct client encoding<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00131.php <nowiki>pg_stats queries versus per-database encodings</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|More sensible support for Unicode combining characters, normal forms<br />
* http://archives.postgresql.org/message-id/200904141532.44618.peter_e@gmx.net<br />
}}<br />
<br />
== Views / Rules ==<br />
<br />
{{TodoItem<br />
|Automatically create rules on views so they are updateable, per SQL:2008<br />
|We can only auto-create rules for simple views. For more complex cases users will still have to write rules manually.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00586.php <nowiki>Proposal for updatable views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-08/msg00255.php <nowiki>Updatable views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg01746.php <nowiki>Re: [COMMITTERS] pgsql: Automatic view update rules Bernd Helmle</nowiki>]<br />
* http://wiki.postgresql.org/wiki/Updatable_views<br />
}}<br />
<br />
{{TodoItem<br />
|Add functionality for the WITH CHECK OPTION clause of CREATE VIEW}}<br />
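A sketch of what the standard syntax would look like (view and table names are hypothetical):<br />

```sql
CREATE VIEW active_users AS
    SELECT * FROM users WHERE active
    WITH CHECK OPTION;
-- With the option enforced, an INSERT or UPDATE through the view that
-- produces a row not visible in the view would be rejected, e.g.:
--   INSERT INTO active_users (name, active) VALUES ('bob', false);
```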
<br />
{{TodoItem<br />
|Allow VIEW/RULE recompilation when the underlying tables change<br />
|This is both difficult and controversial.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01723.php Re: About "Allow VIEW/RULE recompilation when the underlying tables change"]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg01724.php Re: About "Allow VIEW/RULE recompilation when the underlying tables change"]<br />
}}<br />
<br />
{{TodoItem<br />
|Make it possible to use RETURNING together with conditional DO INSTEAD rules, such as for partitioning setups<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00577.php <nowiki>RETURNING and DO INSTEAD ... Intentional or not?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add the ability to automatically create materialized views<br />
|Right now materialized views require the user to create triggers on the main table to keep the summary table current. SQL syntax should be able to manage the triggers and summary table automatically. A more sophisticated implementation would automatically retrieve from the summary table when the main table is referenced, if possible. See [[Materialized Views]] for implementation details.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00479.php <nowiki>GSoC - proposal - Materialized Views in PostgreSQL</nowiki>] <br />
}}<br />
<br />
{{TodoItem<br />
|Improve ability to modify views via ALTER TABLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00691.php <nowiki>Re: idea: storing view source in system catalogs</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg01410.php <nowiki>modifying views</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg00300.php <nowiki>Re: patch: Add columns via CREATE OR REPLACE VIEW</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent low-cost functions from seeing unauthorized view rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01346.php <nowiki>Using views for row-level access control is leaky</nowiki>]<br />
}}<br />
<br />
== SQL Commands ==<br />
<br />
{{TodoItem<br />
|Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT}}<br />
<br />
{{TodoItem<br />
|Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php <nowiki>WIP: grouping sets support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00466.php <nowiki>Implementation of GROUPING SETS (T431: Extended grouping capabilities)</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Fix TRUNCATE ... RESTART IDENTITY so its effect on sequences is rolled back on transaction abort<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00550.php <nowiki>Re: [PATCHES] TRUNCATE TABLE with IDENTITY</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow PREPARE of cursors}}<br />
<br />
{{TodoItem<br />
|Allow finer control over the caching of prepared query plans<br />
|Currently anonymous (un-named) queries prepared via the wire protocol are replanned every time bind parameters are supplied --- allow SQL PREPARE to do the same. Also, allow control over replanning prepared queries either manually or automatically when statistics for execute parameters differ dramatically from those used during planning.<br />
* http://archives.postgresql.org/message-id/201002151911.o1FJBYh22763@momjian.us<br />
}}<br />
<br />
{{TodoItem<br />
|Improve logging of prepared transactions recovered during startup<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg00092.php <nowiki>&quot;recovering prepared transaction&quot; after server restart message</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow prepared transactions with temporary tables created and dropped in the same transaction, and when an ON COMMIT DELETE ROWS temporary table is accessed<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00047.php <nowiki>Re: &quot;could not open relation 1663/16384/16584: No such file or directory&quot; in a specific combination of transactions with temp tables</nowiki>]<br />
* [http://archives.postgresql.org/message-id/492543D5.9050904@enterprisedb.com A suggestion on how to implement this]<br />
}}<br />
<br />
{{TodoItem<br />
|Add a GUC variable to warn about non-standard SQL usage in queries}}<br />
<br />
{{TodoItem<br />
|Add SQL-standard MERGE/REPLACE/UPSERT command<br />
|MERGE is typically used to merge two tables. A REPLACE or UPSERT command does an UPDATE, or on failure, an INSERT. See [[SQL MERGE]] for notes on the implementation details.<br />
}}<br />
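A sketch of SQL-standard MERGE syntax as it might look (table and column names are hypothetical):<br />

```sql
MERGE INTO account a
USING (VALUES (42, 100.00)) AS v(id, amount)
    ON a.id = v.id
WHEN MATCHED THEN
    UPDATE SET balance = a.balance + v.amount
WHEN NOT MATCHED THEN
    INSERT (id, balance) VALUES (v.id, v.amount);
```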
<br />
{{TodoItem<br />
|Add NOVICE output level for helpful messages like automatic sequence/index creation}}<br />
<br />
{{TodoItem<br />
|Add GUC to issue notice about statements that use unjoined tables}}<br />
<br />
{{TodoItem<br />
|Allow EXPLAIN to identify tables that were skipped because of constraint_exclusion}}<br />
<br />
{{TodoItemDone<br />
|Enable standard_conforming_strings by default<br />
|When this is done, backslash-quote should be prohibited in non-E<nowiki>''</nowiki> strings because of possible confusion over how such strings treat backslashes. Basically, <nowiki>''</nowiki> is always safe for a literal single quote, while \' might or might not be based on the backslash handling rules.}}<br />
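The quoting rules in question, sketched (behavior shown assumes the setting is on):<br />

```sql
SET standard_conforming_strings = on;
SELECT 'it''s';   -- doubled quote: always a literal single quote
SELECT E'it\'s';  -- explicit escape-string syntax: unambiguous
-- A backslash-quote inside a plain string, as in 'it\'s', changes
-- meaning with the setting, which is why it should be prohibited there.
```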
<br />
{{TodoItem<br />
|Simplify dropping roles that have objects in several databases}}<br />
<br />
{{TodoItem<br />
|Allow the count returned by SELECT, etc to be represented as an int64 to allow a higher range of values}}<br />
<br />
{{TodoItem<br />
|Add support for WITH RECURSIVE ... CYCLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00291.php <nowiki>WITH RECURSIVE ... CYCLE in vanilla SQL: issues with arrays of rows</nowiki>]}}<br />
<br />
{{TodoItem<br />
|Add DEFAULT .. AS OWNER so permission checks are done as the table owner<br />
|This would be useful for SERIAL nextval() calls and CHECK constraints.}}<br />
<br />
{{TodoItem<br />
|Allow DISTINCT to work in multiple-argument aggregate calls}}<br />
<br />
{{TodoItem<br />
|Add column to pg_stat_activity that shows the progress of long-running commands like CREATE INDEX and VACUUM<br />
* [http://archives.postgresql.org/pgsql-patches/2008-04/msg00203.php <nowiki>EXPLAIN progress info</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow INSERT/UPDATE/DELETE ... RETURNING inside a SELECT 'FROM' clause or target list<br />
|Actually it would be saner to allow this in WITH<br />
* [http://archives.postgresql.org/pgsql-general/2006-09/msg00803.php <nowiki>8.2: select from an INSERT returning?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00693.php <nowiki>Re: SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00124.php <nowiki>cannot use result of (insert..returning)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00979.php <nowiki>insert ... delete ... returning ... ?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-general/2009-06/msg00357.php Using results from DELETE ... RETURNING]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow INSERT/UPDATE/DELETE ... RETURNING in common table expressions<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00472.php <nowiki>Writeable CTEs and side effects</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add comments on system tables/columns using the information in catalogs.sgml<br />
|Ideally the information would be pulled from the SGML file automatically.}}<br />
<br />
{{TodoItem<br />
|Prevent the specification of conflicting transaction read/write options<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00684.php <nowiki>Re: SET TRANSACTION and SQL Standard</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support LATERAL subqueries<br />
|Lateral subqueries can reference columns of tables defined outside the subquery at the same level, i.e. ''laterally''.<br />
For example, a LATERAL subquery in a FROM clause could reference tables defined in the same FROM clause.<br />
Currently only the columns of tables defined ''above'' subqueries are recognized.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00292.php <nowiki>LATERAL</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00991.php <nowiki>Re: LATERAL</nowiki>]<br />
}}<br />
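For example (hypothetical tables), a LATERAL subquery could fetch the latest showing per film by referring to f.id, a column of an earlier item in the same FROM list:<br />
<br />
```sql
-- Proposed: the subquery references f.id from the same FROM clause,
-- which is exactly the cross-reference that is currently rejected.
SELECT f.id, latest.ts
FROM films f,
     LATERAL (SELECT s.ts
              FROM showings s
              WHERE s.film_id = f.id
              ORDER BY s.ts DESC
              LIMIT 1) AS latest;
```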
<br />
{{TodoItem<br />
|Add support for functional dependencies<br />
|This would allow omitting GROUP BY columns when grouping by the primary key.<br />
}}<br />
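With functional dependencies recognized, grouping by a table's primary key would let its other columns appear ungrouped (hypothetical schema):<br />
<br />
```sql
-- c.name is functionally dependent on the primary key c.id, so
-- listing it in GROUP BY would become unnecessary:
SELECT c.id, c.name, count(*)
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.id;          -- currently must be: GROUP BY c.id, c.name
```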
<br />
{{TodoItem<br />
|Optimize ON COMMIT DELETE ROWS<br />
|Currently deletions are performed even when they are not needed, e.g. when the transaction did not touch the table<br />
* http://archives.postgresql.org/pgsql-performance/2010-03/msg00392.php<br />
* http://archives.postgresql.org/pgsql-performance/2010-04/msg00046.php<br />
}}<br />
<br />
=== CREATE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow CREATE TABLE AS to determine column lengths for complex expressions like SELECT col1 || col2}}<br />
<br />
{{TodoItem<br />
|Have WITH CONSTRAINTS also create constraint indexes<br />
* [http://archives.postgresql.org/pgsql-patches/2007-04/msg00149.php <nowiki>Re: CREATE TABLE LIKE INCLUDING INDEXES support</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Move NOT NULL constraint information to pg_constraint<br />
|Currently NOT NULL constraints are stored in pg_attribute without any designation of their origins, e.g. primary keys. One manifest problem is that dropping a PRIMARY KEY constraint does not remove the NOT NULL constraint designation. Another issue is that we should probably force NOT NULL to be propagated from parent tables to children, just as CHECK constraints are. (But then does dropping PRIMARY KEY affect children?)<br />
* http://archives.postgresql.org/message-id/19768.1238680878@sss.pgh.pa.us<br />
* http://archives.postgresql.org/message-id/200909181005.n8IA5Ris061239@wwwmaster.postgresql.org<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent concurrent CREATE TABLE from sometimes returning a cryptic error message<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00169.php <nowiki>BUG #3692: Conflicting create table statements throw unexpected error</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Allow CREATE TABLE to optionally create a table if it does not already exist, without throwing an error<br />
|The fact that tables contain data makes this more complex than other CREATE OR REPLACE operations.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg01300.php <nowiki>Add column if not exists (CINE)</nowiki>]<br />
}}<br />
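With this in place, repeatedly-run schema scripts no longer fail on an existing table (table name is illustrative):<br />
<br />
```sql
-- Succeeds with a notice instead of an error if the table exists:
CREATE TABLE IF NOT EXISTS sessions (
    id   serial PRIMARY KEY,
    data text
);
```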
<br />
{{TodoItem<br />
|Add CREATE SCHEMA ... LIKE that copies a schema}}<br />
<br />
{{TodoItem<br />
|CREATE OR REPLACE FUNCTION might leave objects that depend on the function in an inconsistent state<br />
* [http://archives.postgresql.org/pgsql-general/2008-08/msg00985.php indexes on functions and create or replace function]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow temporary tables to exist as empty by default in all sessions<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00006.php <nowiki>what is difference between LOCAL and GLOBAL TEMP TABLES in PostgreSQL</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg01329.php <nowiki>idea: global temp tables</nowiki>]<br />
* [http://archives.postgresql.org//pgsql-hackers/2009-05/msg00016.php <nowiki>Re: idea: global temp tables</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg01098.php <nowiki>global temporary tables</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow the creation of "distinct" types<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01647.php <nowiki>Distinct types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider analyzing temporary tables when they are first used in a query<br />
|Autovacuum cannot analyze or vacuum temporary tables.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-04/msg00416.php <nowiki>autovacuum and temp tables support</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== UPDATE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|<nowiki>Allow UPDATE tab SET ROW (col, ...) = (SELECT...)</nowiki><br />
* [http://archives.postgresql.org/pgsql-hackers/2006-07/msg01308.php <nowiki>Re: [PATCHES] extension for sql update</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00865.php <nowiki>UPDATE using sub selects</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-04/msg00315.php <nowiki>UPDATE using sub selects</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-03/msg00237.php <nowiki>Re: UPDATE using sub selects</nowiki>]<br />
}}<br />
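The syntax under discussion, sketched with hypothetical names:<br />
<br />
```sql
-- Proposed: assign several columns at once from a single subquery,
-- instead of repeating a correlated subquery for each column.
UPDATE employees e
SET (salary, grade) = (SELECT p.salary, p.grade
                       FROM payscale p
                       WHERE p.role = e.role);
```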
<br />
{{TodoItem<br />
|Research self-referential UPDATEs that see inconsistent row versions in read-committed mode<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00507.php <nowiki>Concurrently updating an updatable view</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00016.php <nowiki>Re: Do we need a TODO? (was Re: Concurrently updating anupdatable view)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve performance of EvalPlanQual mechanism that rechecks already-updated rows<br />
|This is related to the previous item, which questions whether it even has the right semantics<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-09/msg00045.php <nowiki>BUG #4401: concurrent updates to a table blocks one update indefinitely</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-07/msg00302.php <nowiki>BUG #4945: Parallel update(s) gone wild</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ALTER ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have ALTER TABLE RENAME rename SERIAL sequence names<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Allow ALTER TABLE ... ALTER CONSTRAINT ... RENAME<br />
* [http://archives.postgresql.org/pgsql-patches/2006-02/msg00168.php <nowiki>ALTER CONSTRAINT RENAME patch reverted</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ALTER TABLE RENAME CONSTRAINT}}<br />
<br />
{{TodoItem<br />
|Have ALTER SEQUENCE RENAME rename the sequence name stored in the sequence table<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-09/msg00092.php <nowiki>BUG #3619: Renaming sequence does not update its 'sequence_name' field</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-10/msg00007.php <nowiki>Re: BUG #3619: Renaming sequence does not update its 'sequence_name' field</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00008.php <nowiki>Re: newbie: renaming sequences task</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ALTER DOMAIN to modify the underlying data type}}<br />
<br />
{{TodoItemEasy<br />
|Allow ALTER TABLE to change constraint deferrability and actions}}<br />
<br />
{{TodoItem<br />
|Add missing object types for ALTER ... SET SCHEMA}}<br />
<br />
{{TodoItem<br />
|Allow ALTER TABLESPACE to move to different directories}}<br />
<br />
{{TodoItem<br />
|Allow moving system tables to other tablespaces, where possible<br />
|Currently non-global system tables must be in the default database tablespace. Global system tables can never be moved.}}<br />
<br />
{{TodoItem<br />
|Have ALTER INDEX update the name of a constraint using that index}}<br />
<br />
{{TodoItem<br />
|Allow column display reordering by recording a display, storage, and permanent id for every column?<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00782.php <nowiki>Re: column ordering, was Re: [PATCHES] Enums patch v2</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg01029.php <nowiki>Column reordering in pg_dump</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow an existing index to be marked as a table's primary key<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00500.php <nowiki>Setting a pre-existing index as a primary key</nowiki>]<br />
}}<br />
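One possible shape for this, with hypothetical names (the exact syntax is not settled):<br />
<br />
```sql
-- Build the unique index first (possibly CONCURRENTLY), then
-- promote it to be the table's primary key:
CREATE UNIQUE INDEX accounts_id_idx ON accounts (id);
ALTER TABLE accounts
    ADD CONSTRAINT accounts_pkey PRIMARY KEY USING INDEX accounts_id_idx;
```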
<br />
{{TodoItem<br />
|Allow ALTER TYPE on composite types to perform operations similar to ALTER TABLE<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00245.php <nowiki>ALTER composite type does not work, but ALTER TABLE which ROWTYPE is used as a type - works fine</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Don't require table rewrite on ALTER TABLE ... ALTER COLUMN TYPE, when the old and new data types are binary compatible<br />
* http://archives.postgresql.org/message-id/200903040137.n241bAUV035002@wwwmaster.postgresql.org<br />
* [http://archives.postgresql.org/pgsql-patches/2006-10/msg00154.php <nowiki>Eliminating phase 3 requirement for varlen increases via ALTER COLUMN</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Reduce locking required for ALTER commands<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg00533.php <nowiki>ALTER TABLE SET STATISTICS requires AccessExclusiveLock</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01083.php <nowiki>Re: ALTER TABLE SET STATISTICS requires AccessExclusiveLock</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2008-10/msg01248.php<br />
* http://archives.postgresql.org/pgsql-hackers/2008-10/msg00242.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== CLUSTER ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Automatically maintain clustering on a table<br />
|This might require some background daemon to maintain clustering during periods of low usage. It might also require tables to be only partially filled for easier reorganization. Another idea would be to create a merged heap/index data file so an index lookup would automatically access the heap data too. A third idea would be to store heap rows in hashed groups, perhaps using a user-supplied hash function.<br />
* [http://archives.postgresql.org/pgsql-performance/2004-08/msg00350.php <nowiki>Equivalent praxis to CLUSTERED INDEX?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00155.php <nowiki>Re: Grouped Index Tuples</nowiki>]<br />
* http://community.enterprisedb.com/git/<br />
* [http://archives.postgresql.org/pgsql-performance/2009-10/msg00346.php <nowiki>Re: maintain_cluster_order_v5.patch</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Add default clustering to system tables<br />
|To do this, determine the ideal cluster index for each system table and set the cluster setting during initdb.<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-05/msg00989.php <nowiki>Re: Clustering system catalog indexes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve CLUSTER performance by sorting to reduce random I/O<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php <nowiki>Our CLUSTER implementation is pessimal</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Make CLUSTER VERBOSE more verbose.<br />
|It is also used by the new VACUUM FULL VERBOSE.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== COPY ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow COPY to report error lines and continue<br />
|This requires the use of a savepoint before each COPY line is processed, with ROLLBACK on COPY failure. <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00572.php <nowiki>Re: VLDB Features</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY on a newly-created table to skip WAL logging<br />
|On crash recovery, the table involved in the COPY would be removed or have its heap and index files truncated. One issue is that no other backend should be able to add to the table at the same time, which is something that is currently allowed. This currently is done if the table is created inside the same transaction block as the COPY because no other backends can see the table.}}<br />
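The existing same-transaction case looks like this; because no other backend can see the new table, its COPY can already skip WAL logging (when WAL archiving is not enabled):<br />
<br />
```sql
BEGIN;
CREATE TABLE import_staging (id int, payload text);  -- new in this transaction
COPY import_staging FROM '/tmp/staging.dat';         -- WAL can be skipped
COMMIT;
```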
<br />
{{TodoItem<br />
|Allow COPY FROM to create index entries in bulk<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00811.php <nowiki>Batch update of indexes on data loading</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY in CSV mode to control whether a quoted zero-length string is treated as NULL<br />
|Currently this is always treated as a zero-length string, which generates an error when loading into an integer column <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00905.php <nowiki>Re: [PATCHES] allow CSV quote in NULL</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve COPY performance<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00954.php <nowiki>Re: 8.3 / 8.2.6 restore comparison</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY to report errors sooner<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01169.php <nowiki>Timely reporting of COPY errors</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow COPY to handle other number formats, e.g. German decimal notation. One possibility would be a syntax like WITH DECIMAL ','.<br />
}}<br />
<br />
{{TodoItem<br />
|Allow a stalled COPY to exit if the backend is terminated<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-04/msg00067.php <nowiki>Re: possible bug not in open items</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== GRANT/REVOKE ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow SERIAL sequences to inherit permissions from the base table?}}<br />
<br />
{{TodoItem<br />
|Allow dropping of a role that has connection rights<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00736.php <nowiki>DROP ROLE dependency tracking ...</nowiki>]<br />
}}<br />
{{TodoEndSubsection}}<br />
<br />
=== DECLARE CURSOR ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Prevent DROP TABLE from dropping a table referenced by its own open cursor?}}<br />
<br />
{{TodoItem<br />
|Provide some guarantees about the behavior of cursors that invoke volatile functions<br />
* [http://archives.postgresql.org/message-id/20997.1244563664@sss.pgh.pa.us Re: Cursor with hold emits the same row more than once across commits in 8.3.7]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== INSERT ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow INSERT/UPDATE of the system-generated oid value for a row}}<br />
<br />
{{TodoItem<br />
|In rules, allow VALUES() to contain a mixture of 'old' and 'new' references}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== SHOW/SET ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM ANALYZE, and CLUSTER}}<br />
<br />
{{TodoItem<br />
|Rationalize the discrepancy between settings that use values in bytes and SHOW that returns the object count<br />
* [http://archives.postgresql.org/pgsql-docs/2008-07/msg00007.php <nowiki>Re: [ADMIN] shared_buffers and shmmax</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== LISTEN/NOTIFY ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow NOTIFY in rules involving conditionals}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Window Functions ===<br />
See {{messageLink|357.1230492361@sss.pgh.pa.us|TODO items for window functions}}.<br />
{{TodoSubsection}}<br />
{{TodoItem<br />
|Support creation of user-defined window functions.<br />
|We have the ability to create new window functions written in C. Is it<br />
worth the effort to create an API that would let them be written in PL/pgsql, etc?}}<br />
<br />
{{TodoItem<br />
|Implement full support for window framing clauses.<br />
|In addition to the clauses described in the [http://developer.postgresql.org/pgdocs/postgres/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS latest documentation], the following clauses are not implemented yet:<br />
* RANGE BETWEEN ... PRECEDING/FOLLOWING<br />
* EXCLUDE<br />
}}<br />
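For example, the missing RANGE offset framing would allow sliding windows defined over values rather than row counts (hypothetical table):<br />
<br />
```sql
-- Not yet implemented: an offset-based RANGE frame.
SELECT ts,
       avg(reading) OVER (ORDER BY ts
                          RANGE BETWEEN INTERVAL '1 hour' PRECEDING
                                AND CURRENT ROW) AS trailing_hour_avg
FROM measurements;
```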
<br />
{{TodoItem<br />
|Look at tuplestore performance issues.<br />
|The tuplestore_in_memory() thing is just a band-aid; we ought to solve it properly. tuplestore_advance seems like a weak spot as well.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00152.php <nowiki>tuplestore potential performance problem</nowiki>]<br />
}}<br />
<br />
{{TodoItem|Do we really need so much duplicated code between Agg and WindowAgg?}}<br />
<br />
{{TodoItem<br />
|Teach planner to evaluate multiple windows in the optimal order.<br />
|Currently windows are always evaluated in the query-specified order.<br />
* http://archives.postgresql.org/message-id/3CDAD71E9D70417290FCF66F0178D1E1@amd64<br />
}}<br />
<br />
{{TodoItem<br />
|Implement DISTINCT clause in window aggregates.<br />
|Some proprietary RDBMSs have implemented it already, so it helps with porting from those.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Integrity Constraints ==<br />
=== Keys ===<br />
<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Improve deferrable unique constraints for cases with many conflicts<br />
|The current implementation fires a trigger for each potentially conflicting row. This might not scale well for an update that changes many key values at once.<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Referential Integrity ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add MATCH PARTIAL referential integrity}}<br />
<br />
{{TodoItem<br />
|Change foreign key constraint for array -&gt; element to mean element in array?}}<br />
<br />
{{TodoItem<br />
|Fix problem when cascading referential triggers make changes on cascaded tables, seeing the tables in an intermediate state<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-09/msg00174.php <nowiki>Re: [PATCHES] Work-in-progress referential action trigger timing</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Optimize referential integrity checks<br />
* [http://archives.postgresql.org/pgsql-performance/2005-10/msg00458.php <nowiki>Re: Effects of cascading references in foreign keys</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00744.php <nowiki>Can't ri_KeysEqual() consider two nulls as equal?</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Server-Side Languages ==<br />
<br />
{{TodoItem<br />
|Add support for polymorphic arguments and return types to languages other than PL/PgSQL}}<br />
<br />
{{TodoItem<br />
|Add support for OUT and INOUT parameters to languages other than PL/PgSQL}}<br />
<br />
{{TodoItem<br />
|Add more fine-grained specification of functions taking arbitrary data types<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00367.php <nowiki>RfD: more powerful &quot;any&quot; types</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Implement stored procedures<br />
|This might involve the control of transaction state and the return of multiple result sets<br />
* [http://archives.postgresql.org/pgsql-general/2008-10/msg00454.php <nowiki>PL/pgSQL stored procedure returning multiple result sets (SELECTs)?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01375.php <nowiki>Proposal: real procedures again (8.4)</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-09/msg00542.php<br />
}}<br />
<br />
=== PL/pgSQL ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]}}<br />
<br />
{{TodoItem<br />
|<nowiki>Allow listing of record column names, and access to record columns via variables, e.g. columns := r.(*), tval2 := r.(colname)</nowiki><br />
* [http://archives.postgresql.org/pgsql-patches/2005-07/msg00458.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-05/msg00302.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00031.php <nowiki>Re: PL/PGSQL: Dynamic Record Introspection</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add support for SCROLL cursors}}<br />
<br />
{{TodoItem<br />
|Add support for WITH HOLD cursors}}<br />
<br />
{{TodoItem<br />
|Allow row and record variables to be set to NULL constants, and allow NULL tests on such variables<br />
|Because a row is not scalar, do not allow assignment from NULL-valued scalars.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00070.php <nowiki>NULL and plpgsql rows</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider keeping separate cached copies when search_path changes<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg01009.php <nowiki>pl/pgsql Plan Invalidation and search_path</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve handling of NULL row values vs. NULL rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg01758.php <nowiki>Null row vs. row of nulls in plpgsql</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Perl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow regex operations in plperl using UTF8 characters in non-UTF8 encoded databases.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Python ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add table function support}}<br />
<br />
{{TodoItem<br />
|Add tracebacks<br />
* [http://archives.postgresql.org/pgsql-patches/2006-02/msg00288.php <nowiki>Re: plpython tracebacks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Develop a trusted variant of PL/Python.}}<br />
<br />
{{TodoItem<br />
|Create a new restricted execution class that will allow passing function arguments in as locals. Passing them as globals means functions cannot be called recursively.}}<br />
<br />
{{TodoItem<br />
|Functions cache the input and output functions for their arguments, so the following will make PostgreSQL unhappy:<br />
<br />
create table users (first_name text, last_name text);<br />
create function user_name(users) returns text as 'mycode' language plpython;<br />
select user_name(users) from users;<br />
alter table users add column user_id integer;<br />
select user_name(users) from users;<br />
<br />
You have to drop and re-create the function(s) each time their arguments<br />
are modified (not nice), or not cache the input and output functions<br />
(slower?), or check whether the structure of an argument has been<br />
altered (is this possible, easy, quick?) and recreate the cache.}}<br />
<br />
{{TodoItem<br />
|Better documentation}}<br />
<br />
{{TodoItem<br />
|Add a DB-API compliant interface on top of the SPI interface.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== PL/Tcl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add table function support}}<br />
<br />
{{TodoItem<br />
|Check encoding validity of values passed back to Postgres in function returns, trigger tuple changes, or SPI calls.}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Clients ==<br />
<br />
{{TodoItem<br />
|Add a function like pg_get_indexdef() that reports more detailed index information<br />
* [http://archives.postgresql.org/pgsql-bugs/2007-12/msg00166.php <nowiki>BUG #3829: Wrong index reporting from pgAdmin III (v1.8.0 rev 6766-6767)</nowiki>]<br />
}}<br />
<br />
=== pg_ctl ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow pg_ctl to work properly with configuration files located outside the PGDATA directory<br />
|pg_ctl cannot read the PID file because it is located in the PGDATA directory rather than the configuration directory. The solution is to allow pg_ctl to read and understand postgresql.conf to find the data_directory value.<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-10/msg00024.php <nowiki>BUG #5103: &quot;pg_ctl -w (re)start&quot; fails with custom unix_socket_directory</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have the postmaster write a random number to a file on startup that pg_ctl checks against the contents of a pg_ping response on its initial connection (without login)<br />
|This will protect against connecting to an old instance of the postmaster in a different or deleted subdirectory.<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-10/msg00110.php <nowiki>Re: BUG #5118: start-status-insert-fatal</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-10/msg00156.php <nowiki>Re: BUG #5118: start-status-insert-fatal</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Modify pg_ctl behavior and exit codes to make it easier to write an LSB conforming init script<br />
|It may be desirable to condition some of the changes on a command-line switch, to avoid breaking existing scripts. A Linux shell (sh) script is referenced which has been tested and seems to provide a high degree of conformance in multiple environments. Study of this script might suggest areas where pg_ctl could be modified to make writing an LSB conforming script easier; however, some aspects of that script would be unnecessary with other suggested changes to pg_ctl, and discussion on the lists did not reach consensus on support for all aspects of this script. Further discussion of particular changes is needed before beginning any work.<br />
* [[Lsb_conforming_init_script|LSB conforming init script]]<br />
These threads should be studied for other ideas on improvements:<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01390.php <nowiki>We should Axe /contrib/start-scripts</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01843.php <nowiki>Linux LSB init script</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00008.php <nowiki>Re: Linux LSB init script</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== psql ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Have psql \ds show all sequences and their settings<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00916.php <nowiki>Re: TODO item: Have psql show current values for a sequence</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00401.php <nowiki>Quick patch: Display sequence owner</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have \d on a sequence indicate if the sequence is owned by a table}}<br />
<br />
{{TodoItem<br />
|Move psql backslash database information into the backend, use mnemonic commands?<br />
|This would allow non-psql clients to pull the same information out of the database as psql. <br />
* [http://archives.postgresql.org/pgsql-hackers/2004-01/msg00191.php <nowiki>Re: psql \d option list overloaded</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Make psql's \d commands more consistent in their handling of schemas<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00014.php <nowiki>Re: psql and schemas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consistently display privilege information for all objects in psql}}<br />
<br />
{{TodoItem<br />
|Add auto-expanded mode so expanded output is used if the row length is wider than the screen width.<br />
|Consider using auto-expanded mode for backslash commands like \df+.}}<br />
<br />
{{TodoItem<br />
|Prevent tab completion of SET TRANSACTION from querying the database and therefore preventing the transaction isolation level from being set.<br />
|Currently SET &lt;tab&gt; causes a database lookup to check all supported session variables. This query causes problems because setting the transaction isolation level must be the first statement of a transaction.}}<br />
<br />
{{TodoItem<br />
|Add a \set variable to control whether \s displays line numbers<br />
|Another option is to add \# which lists line numbers, and allows command execution.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00255.php <nowiki>Re: psql possible TODO</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent escape string warnings when object names have backslashes<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00227.php <nowiki>Psql command-line completion bug</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have \d show child tables that inherit from the specified parent}}<br />
<br />
{{TodoItem<br />
|Include the symbolic SQLSTATE name in verbose error reports<br />
* [http://archives.postgresql.org/pgsql-general/2007-09/msg00438.php <nowiki>Re: Checking is TSearch2 query is valid</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add prompt escape to display the client and server versions<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00310.php <nowiki>WIP patch for TODO Item: Add prompt escape to display the client and server versions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add option to wrap column values at whitespace boundaries, rather than chopping them at a fixed width.<br />
|Currently, &quot;wrapped&quot; format chops values into fixed widths. Perhaps the word wrapping could use the same algorithm documented in the W3C specification. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00404.php <nowiki>Re: psql wrapped format default for backslash-d commands</nowiki>]<br />
* http://www.w3.org/TR/CSS21/tables.html#auto-table-layout}}<br />
<br />
{{TodoItem<br />
|Add &quot;auto&quot; expanded mode that outputs in expanded format if &quot;wrapped&quot; mode can't wrap the output to the screen width<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00417.php <nowiki>Re: psql wrapped format default for backslash-d commands</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support the ReST table output format<br />
|Details about the ReST format: http://docutils.sourceforge.net/rst.html#reference-documentation<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-08/msg01007.php <nowiki>Proposal: new border setting in psql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00518.php <nowiki>Re: Proposal: new border setting in psql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00609.php <nowiki>Re: Proposal: new border setting in psql</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add option to print advice for people familiar with other databases<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01845.php <nowiki>MySQL-ism help patch for psql</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider showing TOAST and index sizes in \dt+<br />
* [http://archives.postgresql.org/pgsql-general/2010-01/msg00912.php <nowiki>\dt+ sizes don't include TOAST data</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow \dd to show constraint comments<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00436.php <nowiki>Re: More robust pg_hba.conf parsing/error logging</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-general/2009-09/msg00199.php <nowiki>comment on constraint</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add ability to edit views with \ev<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg00023.php <nowiki>Adding \ev view editor?</nowiki>]<br />
}}<br />
{{TodoItem<br />
|Add \dL to show languages<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-07/msg00915.php <nowiki>Re: [PATCH] Psql List Languages</nowiki>]<br />
}}<br />
<br />
{{TodoItemDone<br />
|Distinguish between unique indexes and unique constraints in \d+<br />
* http://archives.postgresql.org/message-id/8780.1271187360@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Fix FETCH_COUNT to handle SELECT ... INTO and WITH queries<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01565.php<br />
* http://archives.postgresql.org/pgsql-bugs/2010-05/msg00192.php<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent psql from sending remaining single-line multi-statement queries after reconnecting<br />
* http://archives.postgresql.org/pgsql-bugs/2010-05/msg00159.php<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01283.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== pg_dump / pg_restore ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|<nowiki>Add full object name to the tag field. eg. for operators we need '=(integer, integer)', instead of just '='.</nowiki>}}<br />
<br />
{{TodoItem<br />
|Add pg_dumpall custom format dumps?<br />
* [http://archives.postgresql.org/pgsql-general/2010-05/msg00509.php pg_dumpall custom format]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid using platform-dependent locale names in pg_dumpall output<br />
|Using native locale names puts roadblocks in the way of porting a dump to another platform. One possible solution is to get<br />
CREATE DATABASE to accept some agreed-on set of locale names and fix them up to meet the platform's requirements.<br />
* http://archives.postgresql.org/message-id/21396.1241716688@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Allow selection of individual object(s) of all types, not just tables}}<br />
<br />
{{TodoItem<br />
|In a selective dump, allow dumping of an object and all its dependencies}}<br />
<br />
{{TodoItem<br />
|Add options like pg_restore -l and -L to pg_dump}}<br />
<br />
{{TodoItem<br />
|Add support for multiple pg_restore -t options, like pg_dump<br />
|pg_restore's -t switch is less useful than pg_dump's in quite a few ways: no multiple switches, no pattern matching, no ability to pick up indexes and other dependent items for a selected table. It should be made to handle this switch just like pg_dump does.}}<br />
<br />
{{TodoItem<br />
|Stop dumping CASCADE on DROP TYPE commands in clean mode}}<br />
<br />
{{TodoItem<br />
|Allow pg_dump --clean to drop roles that own objects or have privileges<br />
|tgl says: if this is about pg_dumpall, it's done as of 8.4. If it's really about pg_dump, what does it mean? pg_dump has no business dropping roles.}}<br />
<br />
{{TodoItem<br />
|Remove unnecessary function pointer abstractions in pg_dump source code}}<br />
<br />
{{TodoItem<br />
|Allow pg_dump to utilize multiple CPUs and I/O channels by dumping multiple objects simultaneously<br />
|The difficulty with this is getting multiple dump processes to produce a single dump output file. It also would require several sessions to share the same snapshot. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php <nowiki>pg_dump additional options for performance</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow pg_restore to load different parts of the COPY data for a single table simultaneously}}<br />
<br />
{{TodoItem<br />
|Remove support for dumping from pre-7.3 servers<br />
|In 7.3 and later, we can get accurate dependency information from the server. pg_dump still contains a lot of crufty code<br />
to try to deal with the lack of dependency information in older servers, but the usefulness of maintaining that code is diminishing.}}<br />
<br />
{{TodoItem<br />
|Allow pre/data/post files when schema and data are dumped separately, for performance reasons<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php <nowiki>pg_dump additional options for performance</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00185.php <nowiki>Re: pg_dump additional options for performance</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Refactor handling of database attributes between pg_dump and pg_dumpall<br />
|Currently only pg_dumpall emits database attributes, such as ALTER DATABASE SET commands and database-level GRANTs.<br />
Many people wish that pg_dump would do that. One proposal is to let pg_dump issue such commands if the -C switch was used,<br />
but it's unclear whether that will satisfy the demand.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg01031.php <nowiki>ALTER DATABASE vs pg_dump</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-05/msg00010.php summary of the issues]<br />
}}<br />
<br />
{{TodoItem<br />
|Change pg_dump so that a comment on the dumped database is applied to the loaded database, even if the database has a different name.<br />
|This will require new backend syntax, perhaps COMMENT ON CURRENT DATABASE. This is related to the previous item.}}<br />
<br />
{{TodoItem<br />
|Allow parallel restore of tar dumps<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-02/msg01154.php <nowiki>Re: parallel restore</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== ecpg ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Docs<br />
|Document differences between ecpg and the SQL standard and information about the Informix-compatibility module.}}<br />
<br />
{{TodoItem<br />
|Solve cardinality &gt; 1 for input descriptors / variables?}}<br />
<br />
{{TodoItem<br />
|Add a semantic check level, e.g. check if a table really exists}}<br />
<br />
{{TodoItem<br />
|fix handling of DB attributes that are arrays}}<br />
<br />
{{TodoItem<br />
|Fix nested C comments}}<br />
<br />
{{TodoItemEasy<br />
|sqlwarn[6] should be 'W' if a PRECISION or SCALE value is specified}}<br />
<br />
{{TodoItem<br />
|Make SET CONNECTION thread-aware, non-standard?}}<br />
<br />
{{TodoItem<br />
|Allow multidimensional arrays}}<br />
<br />
{{TodoItem<br />
|Implement COPY FROM STDIN}} <br />
<br />
{{TodoItem<br />
|Provide a way to specify size of a bytea parameter<br />
* [http://archives.postgresql.org/message-id/200906192131.n5JLVoMo044178@wwwmaster.postgresql.org <nowiki>BUG #4866: ECPG and BYTEA</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
|Fix small memory leaks in ecpg<br />
|Memory leaks in a short-running application like ecpg are not really a problem, but they make debugging more complicated}} <br />
<br />
{{TodoItem<br />
|Allow re-use of cursor name variables<br />
* [http://archives.postgresql.org/message-id/20100329113435.GA3430@feivel.credativ.lan <nowiki>Problems with variable cursorname in ecpg</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== libpq ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Prevent PQfnumber() from lowercasing unquoted column names<br />
|PQfnumber() should never have been doing lowercasing, but historically it has, so we need a way to prevent it}}<br />
<br />
{{TodoItem<br />
|Allow statement results to be automatically batched to the client<br />
|Currently all statement results are transferred to the libpq client before libpq makes the results available to the application. This feature would allow the application to make use of the first result rows while the rest are transferred, or held on the server waiting for them to be requested by libpq. One complexity is that a statement like SELECT 1/col could error out mid-way through the result set.}}<br />
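The mid-stream error hazard can be sketched outside libpq. This is a hypothetical Python illustration (not libpq code): a generator simulates a server streaming rows one at a time, and the client must cope with a query error arriving after some rows were already delivered.<br />

```python
def stream_rows(values):
    """Simulate a server streaming result rows one at a time.

    Each yielded row is available to the application before the
    result set is complete, mimicking the proposed batched mode.
    """
    for v in values:
        # Like "SELECT 1/col": a division error can occur mid-result.
        yield 1 // v

def consume(values):
    """Client loop: collect rows, but note how many had already
    arrived before any mid-stream error."""
    rows = []
    try:
        for row in stream_rows(values):
            rows.append(row)
    except ZeroDivisionError:
        return rows, "error after %d rows" % len(rows)
    return rows, "ok"
```

With batching, the application may already have acted on the early rows before the error arrives; today libpq buffers the complete result first, so the application never sees a partial result set.<br />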
<br />
{{TodoItem<br />
|Consider disallowing multiple queries in PQexec() as an additional barrier to SQL injection attacks<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg00184.php <nowiki>Re: InitPostgres and flatfiles question</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add PQexecf() that allows complex parameter substitution<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01803.php <nowiki>Last minute mini-proposal (I know, know) for PQexecf()</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add SQLSTATE and severity to errors generated within libpq itself<br />
* [http://archives.postgresql.org/pgsql-interfaces/2007-11/msg00015.php <nowiki>v8.1: Error severity on libpq PGconn*</nowiki>]<br />
* http://archives.postgresql.org/pgsql-hackers/2010-08/msg01425.php<br />
}}<br />
<br />
{{TodoItem<br />
|Add code to detect client encoding and locale from the operating system environment<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg01040.php <nowiki>Determining client_encoding from client locale</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add support for interface/ipaddress binding to libpq<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01811.php <nowiki>SR/libpq - outbound interface/ipaddress binding</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Triggers ==<br />
<br />
{{TodoItem<br />
|Improve storage of deferred trigger queue<br />
|Right now all deferred trigger information is stored in backend memory, which could be exhausted by very large trigger queues. This item involves dumping large queues into files, or processing the triggers in bulk via some kind of join or a bitmap. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00876.php <nowiki>Re: BUG #4204: COPY to table with FK has memory leak</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg00464.php <nowiki>Scaling up deferred unique checks and the after trigger queue</nowiki>]<br />
}}<br />
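A minimal sketch of the spill-to-file idea (a toy Python illustration, not backend code): keep the queue in memory up to a limit, then dump further entries to a temporary file and replay everything in order at the end.<br />

```python
import pickle
import tempfile

class DeferredQueue:
    """Toy deferred-event queue that spills to disk past a limit,
    mimicking the proposal to avoid exhausting backend memory."""
    def __init__(self, mem_limit=1000):
        self.mem_limit = mem_limit
        self.in_memory = []
        self.spill = None  # temp file holding overflow entries

    def add(self, event):
        if len(self.in_memory) < self.mem_limit:
            self.in_memory.append(event)
        else:
            if self.spill is None:
                self.spill = tempfile.TemporaryFile()
            pickle.dump(event, self.spill)

    def drain(self):
        """Yield all queued events: in-memory first, then spilled,
        preserving the order in which they were queued."""
        yield from self.in_memory
        if self.spill is not None:
            self.spill.seek(0)
            while True:
                try:
                    yield pickle.load(self.spill)
                except EOFError:
                    break
```
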
<br />
{{TodoItem<br />
|Allow triggers to be disabled in only the current session.<br />
|This is currently possible by starting a multi-statement transaction, modifying the system tables, performing the desired SQL, restoring the system tables, and committing the transaction. ALTER TABLE ... TRIGGER requires a table lock so it is not ideal for this usage.}}<br />
<br />
{{TodoItem<br />
|With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY<br />
|If the dump is known to be valid, allow foreign keys to be added without revalidating the data.}}<br />
<br />
{{TodoItem<br />
|Allow statement-level triggers to access modified rows}}<br />
<br />
{{TodoItem<br />
|When statement-level triggers are defined on a parent table, have them fire only on the parent table, and fire child table triggers only where appropriate<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg01883.php <nowiki>Statement-level triggers and inheritance</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow AFTER triggers on system tables<br />
|System tables are modified in many places in the backend without going through the executor and therefore not causing triggers to fire. To complete this item, the functions that modify system tables will have to fire triggers.}}<br />
<br />
{{TodoItem<br />
|Tighten trigger permission checks<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00564.php <nowiki>Security leak with trigger functions?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow BEFORE INSERT triggers on views<br />
* [http://archives.postgresql.org/pgsql-general/2007-02/msg01466.php <nowiki>Re: Why can't I put a BEFORE EACH ROW trigger on a view?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add database and transaction-level triggers<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00451.php <nowiki>Proposal for db level triggers</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00620.php <nowiki>triggers on prepare, commit, rollback... ?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce locking requirements for creating a trigger<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg00635.php <nowiki>Re: Change lock requirements for adding a trigger</nowiki>]<br />
}}<br />
<br />
== Inheritance ==<br />
<br />
{{TodoItem<br />
|Allow inherited tables to inherit indexes, UNIQUE constraints, and primary/foreign keys<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-05/msg00285.php <nowiki>Partitioning/inherited tables vs FKs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Honor UNIQUE INDEX on base column in INSERTs/UPDATEs on inherited table, e.g. INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail<br />
|The main difficulty with this item is the problem of creating an index that can span multiple tables.}}<br />
<br />
{{TodoItem<br />
|Determine whether ALTER TABLE / SET SCHEMA should work on inheritance hierarchies (and thus support ONLY). If yes, implement it.}}<br />
<br />
{{TodoItem<br />
|ALTER TABLE variants sometimes support recursion and sometimes not, but this is poorly/not documented, and the ONLY marker would then be silently ignored. Clarify the documentation, and reject ONLY if it is not supported.}}<br />
<br />
== Indexes ==<br />
<br />
{{TodoItem<br />
|Prevent index uniqueness checks when UPDATE does not modify the column<br />
|Uniqueness (index) checks are done when updating a column even if the column is not modified by the UPDATE.<br />
However, HOT already short-circuits this in common cases, so more work might not be helpful.}}<br />
<br />
{{TodoItem<br />
|Allow the creation of on-disk bitmap indexes which can be quickly combined with other bitmap indexes<br />
|Such indexes could be more compact if there are only a few distinct values. Such indexes can also be compressed. Keeping such indexes updated can be costly.<br />
* [http://archives.postgresql.org/pgsql-patches/2005-07/msg00512.php <nowiki>Re: Bitmap index AM</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg01107.php <nowiki>Bitmap index thoughts</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00265.php <nowiki>Stream bitmaps</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01214.php <nowiki>Re: Bitmapscan changes - Requesting further feedback</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00013.php <nowiki>Updated bitmap index patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00741.php <nowiki>Reviewing new index types (was Re: [PATCHES] Updated bitmap indexpatch)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01023.php <nowiki>Bitmap Indexes: request for feedback</nowiki>]<br />
* http://archives.postgresql.org/message-id/800923.27831.qm@web29010.mail.ird.yahoo.com <br />
}}<br />
<br />
{{TodoItem<br />
|Allow accurate statistics to be collected on indexes with more than one column or expression indexes, perhaps using per-index statistics<br />
* [http://archives.postgresql.org/pgsql-performance/2006-10/msg00222.php <nowiki>Re: Simple join optimized badly?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01131.php <nowiki>Stats for multi-column indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00741.php <nowiki>Cross-column statistics revisited</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg01431.php <nowiki>Multi-Dimensional Histograms</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having a larger statistics target for indexed columns and expression indexes. <br />
}}<br />
<br />
{{TodoItem<br />
|Consider smaller indexes that record a range of values per heap page, rather than having one index entry for every heap row<br />
|This is useful if the heap is clustered by the indexed values. <br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00341.php <nowiki>Grouped Index Tuples</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg01264.php <nowiki>Grouped Index Tuples</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00465.php <nowiki>Grouped Index Tuples / Clustered Indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-03/msg00163.php <nowiki>Bitmapscan changes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00014.php <nowiki>Re: GIT patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00487.php <nowiki>Re: Index Tuple Compression Approach?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01589.php <nowiki>Re: Index AM change proposals, redux</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add REINDEX CONCURRENTLY, like CREATE INDEX CONCURRENTLY<br />
|This is difficult because you must upgrade to an exclusive table lock to replace the existing index file. CREATE INDEX CONCURRENTLY does not have this complication. This would allow index compaction without downtime. <br />
* [http://archives.postgresql.org/pgsql-performance/2007-08/msg00289.php <nowiki>Re: When/if to Reindex</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow multiple indexes to be created concurrently, ideally via a single heap scan<br />
|pg_restore allows parallel index builds, but it is done via subprocesses, and there is no SQL interface for this.<br />
}}<br />
<br />
{{TodoItem<br />
|Consider sorting entries before inserting into btree index<br />
* [http://archives.postgresql.org/pgsql-general/2008-01/msg01010.php <nowiki>Re: ATTN: Clodaldo was Performance problem. Could it be related to 8.3-beta4?</nowiki>]<br />
}}<br />
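The benefit can be illustrated with a toy sorted container (a sketch, not nbtree code): presorting a batch makes every insertion land at the rightmost position, which in a real btree keeps the working set on the rightmost leaf page instead of touching leaves all over the index.<br />

```python
import bisect

def insert_batch(index, batch, presort=False):
    """Insert a batch of keys into a sorted list, counting how many
    insertions were 'rightmost' (plain appends) -- a proxy for
    btree rightmost-leaf locality."""
    if presort:
        batch = sorted(batch)
    rightmost = 0
    for key in batch:
        pos = bisect.bisect_right(index, key)
        if pos == len(index):
            rightmost += 1
        index.insert(pos, key)
    return rightmost
```
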
<br />
{{TodoItem<br />
|Allow index scans to return matching index keys, not just the matching heap locations<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01657.php <nowiki>Re: Is this TODO item done?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01477.php <nowiki>Index-only quals</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow creation of an index that can do comparisons to test if a value is between two column values<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00757.php <nowiki>Proposal: temporal extension &quot;period&quot; data type</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider using "effective_io_concurrency" for index scans<br />
* Currently only bitmap scans use this, which might be fine because most multi-row index scans use bitmap scans.<br />
}}<br />
<br />
=== GIST ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add more GIST index support for geometric data types}}<br />
<br />
{{TodoItem<br />
|Allow GIST indexes to create certain complex index types, like digital trees (see Aoki)}}<br />
<br />
{{TodoItem<br />
|Fix performance issues in contrib/seg and contrib/cube GiST support<br />
* [http://archives.postgresql.org/message-id/alpine.DEB.2.00.0904161633160.4053@aragorn.flymine.org GiST index performance]<br />
* [http://archives.postgresql.org/message-id/alpine.DEB.2.00.0904221704470.22330@aragorn.flymine.org draft patch]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-05/msg00069.php <nowiki>Re: GiST index performance</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-06/msg00068.php <nowiki>GiST index performance</nowiki>]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== GIN ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Support empty indexed values (such as zero-element arrays) properly<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00237.php contrib/intarray vs empty arrays]<br />
* [http://archives.postgresql.org/pgsql-bugs/2009-05/msg00118.php BUG #4806: Bug with GiST index and empty integer array]<br />
}}<br />
<br />
{{TodoItem<br />
|Behave correctly for cases where some elements of an indexed value are NULL<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg01003.php <nowiki>GIN versus zero-key queries</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support queries that require a full scan<br />
* [http://archives.postgresql.org/pgsql-general/2009-05/msg00402.php Issue report]<br />
* [http://archives.postgresql.org/pgsql-general/2007-06/msg01132.php Older issue report]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg01581.php Original discussion of issue and proposed resolution]<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Hash ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Add UNIQUE capability to hash indexes}}<br />
<br />
{{TodoItem<br />
|Add hash WAL logging for crash recovery}}<br />
<br />
{{TodoItem<br />
|Allow multi-column hash indexes}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Sorting ==<br />
<br />
{{TodoItem<br />
|Consider whether duplicate keys should be sorted by block/offset<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00558.php <nowiki>Remove hacks for old bad qsort() implementations?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider being smarter about memory and external files used during sorts<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg01101.php <nowiki>Sorting Improvements for 8.4</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00045.php <nowiki>Re: Sorting Improvements for 8.4</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider detoasting keys before sorting}}<br />
<br />
== Fsync ==<br />
<br />
{{TodoItem<br />
|Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options<br />
|Ideally this requires a separate test program that can be run at initdb time or optionally later. Consider O_SYNC when O_DIRECT exists.}}<br />
<br />
{{TodoItem<br />
|Add program to test if fsync has a delay compared to non-fsync}}<br />
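A rough sketch of such a test program (in Python and assuming a POSIX-like platform; a production version run at initdb time would presumably be written in C): time a batch of writes with and without an fsync after each one.<br />

```python
import os
import tempfile
import time

def time_writes(n, block, do_fsync):
    """Time n block-sized writes to a temp file, optionally with an
    fsync after each write. Returns elapsed seconds."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.monotonic()
        for _ in range(n):
            os.write(fd, block)
            if do_fsync:
                os.fsync(fd)
        return time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)

if __name__ == "__main__":
    blk = b"\x00" * 8192
    print("no fsync: %.4fs" % time_writes(100, blk, False))
    print("fsync:    %.4fs" % time_writes(100, blk, True))
```

On storage with a write cache the gap between the two timings shows the per-fsync delay this item wants measured.<br />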
<br />
{{TodoItem<br />
|Consider sorting writes during checkpoint<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg00541.php <nowiki>Sorted writes in checkpoint</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00050.php <nowiki>Re: Sorting writes during checkpoint</nowiki>]<br />
}}<br />
<br />
== Cache Usage ==<br />
<br />
{{TodoItem<br />
|Speed up COUNT(*)<br />
|We could use a fixed row count and a +/- count to follow MVCC visibility rules, or a single cached value could be used and invalidated if anyone modifies the table. Another idea is to get a count directly from a unique index, but for this to be faster than a sequential scan it must avoid access to the heap to obtain tuple visibility information.}}<br />
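The "single cached value, invalidated if anyone modifies the table" variant can be sketched as follows (a hypothetical Python illustration, not backend code):<br />

```python
class CountCache:
    """Toy table wrapper caching COUNT(*): any modification
    invalidates the cache, so the next count must rescan."""
    def __init__(self):
        self.rows = []
        self._count = None   # cached value; None means invalid
        self.scans = 0       # how many full counts were needed

    def insert(self, row):
        self.rows.append(row)
        self._count = None   # any writer invalidates the cache

    def count(self):
        if self._count is None:
            self.scans += 1            # simulate the full scan
            self._count = len(self.rows)
        return self._count
```

Repeated counts on an unmodified table are then free; the hard part in PostgreSQL is making such a cache obey MVCC visibility, which is what the +/- delta idea addresses.<br />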
<br />
{{TodoItem<br />
|Provide a way to calculate an &quot;estimated COUNT(*)&quot;<br />
|Perhaps by using the optimizer's cardinality estimates or random sampling.<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php <nowiki>Re: Improving count(*)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow data to be pulled directly from indexes<br />
|Currently indexes do not have enough tuple visibility information to allow data to be pulled from the index without also accessing the heap. One way to allow this is to set a bit on index tuples to indicate if a tuple is currently visible to all transactions when the first valid heap lookup happens. This bit would have to be cleared when a heap tuple is expired.<br />
Another idea is to maintain a bitmap of heap pages where all rows are visible to all backends, and allow index lookups to reference that bitmap to avoid heap lookups, perhaps the same bitmap we might add someday to determine which heap pages need vacuuming. Frequently accessed bitmaps would have to be stored in shared memory. One 8k page of bitmaps could track 512MB of heap pages.<br />
A third idea would be for a heap scan to check if all rows are visible and if so set a per-table flag which can be checked by index scans. Any change to the table would have to clear the flag. To detect changes during the heap scan a counter could be set at the start and checked at the end --- if it is the same, the table has not been modified --- any table change would increment the counter.<br />
* [http://archives.postgresql.org/pgsql-patches/2007-10/msg00166.php <nowiki>Re: [HACKERS] Including Snapshot Info with Indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-01/msg00049.php <nowiki>Re: [HACKERS] Including Snapshot Info with Indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg01094.php <nowiki>TODO item: Allow data to be pulled directly from indexes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg00003.php <nowiki>Re: [PATCHES] VACUUM Improvements - WIP Patch</nowiki>]<br />
}}<br />
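The arithmetic for the bitmap idea checks out: one 8 kB bitmap page holds 8192 &times; 8 = 65536 bits, and 65536 heap pages of 8 kB each is 512 MB. A toy all-visible bitmap (a sketch, not the backend's actual visibility-map code):<br />

```python
PAGE_SIZE = 8192  # bytes per page, as in a default PostgreSQL build

class VisibilityBitmap:
    """One bit per heap page: set = all rows on that page are
    visible to every backend, so index lookups may skip the heap."""
    def __init__(self, heap_pages):
        self.bits = bytearray((heap_pages + 7) // 8)

    def set_all_visible(self, page):
        self.bits[page // 8] |= 1 << (page % 8)

    def clear(self, page):
        # Any insert/update/delete on the page must clear its bit.
        self.bits[page // 8] &= ~(1 << (page % 8))

    def all_visible(self, page):
        return bool(self.bits[page // 8] & (1 << (page % 8)))

# One 8 kB page of bitmap bits tracks this much heap: 512 MB.
HEAP_TRACKED = PAGE_SIZE * 8 * PAGE_SIZE
```
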
<br />
{{TodoItem<br />
|Consider automatic caching of statements at various levels:<br />
* Parsed query tree<br />
* Query execute plan<br />
* Query results <br />
<br />
:<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00823.php <nowiki>Cached Query Plans (was: global prepared statements)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider increasing internal areas (NUM_CLOG_BUFFERS) when shared buffers is increased<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-10/msg01419.php <nowiki>Re: slru.c race condition (was Re: TRAP: FailedAssertion(&quot;!((itemid)-&gt;lp_flags &amp; 0x01)&quot;,)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00030.php <nowiki>clog_buffers to 64 in 8.3?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2007-08/msg00024.php <nowiki>CLOG Patch</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider decreasing the amount of memory used by PrivateRefCount<br />
|<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg00797.php <nowiki>PrivateRefCount (for 8.3)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-01/msg00752.php <nowiki>Re: PrivateRefCount (for 8.3)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider allowing higher priority queries to have referenced buffer cache pages stay in memory longer<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00562.php <nowiki>Re: How to keep a table in memory?</nowiki>]<br />
}}<br />
<br />
== Vacuum ==<br />
<br />
{{TodoItem<br />
|Auto-fill the free space map by scanning the buffer cache or by checking pages written by the background writer<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-02/msg01125.php <nowiki>Dead Space Map</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-03/msg00011.php <nowiki>Re: Automatic free space map filling</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider having single-page pruning update the visibility map<br />
* <nowiki>https://commitfest.postgresql.org/action/patch_view?id=75</nowiki><br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg02344.php <nowiki>Re: visibility maps and heap_prune</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve tracking of total relation tuple counts now that vacuum doesn't always scan the whole heap<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00531.php Partial vacuum versus pg_class.reltuples]<br />
}}<br />
<br />
{{TodoItem<br />
|Bias FSM towards returning free space near the beginning of the heap file, in hopes that empty pages at the end can be truncated by VACUUM}}<br />
<br />
{{TodoItem<br />
|Make FSM return free space based on table clustering, to assist in maintaining clustering?}}<br />
<br />
{{TodoItem<br />
|Consider a more compact data representation for dead tuple locations within VACUUM<br />
* [http://archives.postgresql.org/pgsql-patches/2007-05/msg00143.php <nowiki>Re: Have vacuum emit a warning when it runs out of maintenance_work_mem</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide more information in order to improve user-side estimates of dead space bloat in relations<br />
* [http://archives.postgresql.org/pgsql-general/2009-05/msg01039.php <nowiki>Re: Bloated Table</nowiki>]<br />
}}<br />
<br />
=== Auto-vacuum ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItemEasy<br />
|Issue log message to suggest VACUUM FULL if a table is nearly empty?}}<br />
<br />
{{TodoItem<br />
|Prevent long-lived temporary tables from causing frozen-xid advancement starvation<br />
|The problem is that autovacuum cannot vacuum them to set frozen xids; only the session that created them can do that. <br />
* [http://archives.postgresql.org/pgsql-general/2007-06/msg01645.php <nowiki>Re: AutoVacuum Behaviour Question</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Prevent autovacuum from running if an old transaction is still running from the last vacuum<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00899.php <nowiki>Re: Autovacuum and OldestXmin</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have free space allocation bias away from using trailing table pages<br />
|This improves the chances of truncating the table during vacuum<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-09/msg01124.php <nowiki>FSM search modes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have autoanalyze of parent tables occur when child tables are modified<br />
* http://archives.postgresql.org/message-id/AANLkTinx8lLTEKWcyEQ1rxVz6WMJVKNezfXW5TKnNAU6@mail.gmail.com<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Locking ==<br />
<br />
{{TodoItem<br />
|Fix priority ordering of read and write light-weight locks<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00893.php <nowiki>lwlocks and starvation</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-11/msg00905.php <nowiki>Re: lwlocks and starvation</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix problem when multiple subtransactions of the same outer transaction hold different types of locks, and one subtransaction aborts<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-11/msg01011.php <nowiki>FOR SHARE vs FOR UPDATE locks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg00001.php <nowiki>Re: FOR SHARE vs FOR UPDATE locks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00435.php <nowiki>Re: [PATCHES] [pgsql-patches] Phantom Command IDs, updated patch</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00773.php <nowiki>Re: savepoints and upgrading locks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow UPDATEs on only non-referential integrity columns not to conflict with referential integrity locks<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00073.php <nowiki>Referential Integrity and SHARE locks</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add idle_in_transaction_timeout GUC so locks are not held for long periods of time}}<br />
<br />
{{TodoItem<br />
|Improve deadlock detection when a page cleaning lock conflicts with a shared buffer that is pinned<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-01/msg00138.php <nowiki>BUG #3883: Autovacuum deadlock with truncate?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00873.php <nowiki>Thoughts about bug #3883</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-committers/2008-01/msg00365.php <nowiki>Re: pgsql: Add checks to TRUNCATE, CLUSTER, and REINDEX to prevent</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Detect deadlocks involving LockBufferForCleanup()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00873.php <nowiki>Thoughts about bug #3883</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider a lock timeout parameter<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00485.php <nowiki>SELECT ... FOR UPDATE [WAIT integer | NOWAIT] for 8.5</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider improving serialized transaction behavior to avoid anomalies<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00217.php <nowiki>Serializable Isolation without blocking</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg01136.php <nowiki>User-facing aspects of serializable transactions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00035.php <nowiki>Re: User-facing aspects of serializable transactions</nowiki>]<br />
}}<br />
<br />
== Startup Time Improvements ==<br />
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for backend creation<br />
|This would prevent the overhead associated with process creation. Most operating systems have trivial process creation time compared to database startup overhead, but a few operating systems (Win32, Solaris) might benefit from threading. Also explore the idea of a single session using multiple threads to execute a statement faster.}}<br />
<br />
== Write-Ahead Log ==<br />
<br />
{{TodoItem<br />
|Eliminate need to write full pages to WAL before page modification<br />
|Currently, to protect against partial disk page writes, we write full page images to WAL before they are modified so we can correct any partial page writes during recovery. These pages can also be eliminated from point-in-time archive files. <br />
* [http://archives.postgresql.org/pgsql-hackers/2002-06/msg00655.php <nowiki>Re: Index Scans become Seq Scans after VACUUM ANALYSE</nowiki>]<br />
}}<br />
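The torn-page hazard that these full-page images guard against can be sketched in a few lines of Python. This is purely illustrative (toy "sectors", not backend code), but it shows why replaying a full-page image sidesteps partial writes entirely:<br />

```python
PAGE_SIZE = 8  # toy page of 8 "sectors"; real pages are 8 kB

def crash_write(old_page, new_page, sectors_written):
    # Simulate a torn write: only the first `sectors_written`
    # sectors reach disk before the crash.
    return new_page[:sectors_written] + old_page[sectors_written:]

def recover(disk_page, full_page_image):
    # Redo from the full-page image logged in WAL: the possibly
    # torn on-disk page is replaced wholesale, so partial writes
    # never need to be detected or repaired piecemeal.
    return full_page_image

old, new = [0] * PAGE_SIZE, [1] * PAGE_SIZE
torn = crash_write(old, new, sectors_written=3)
assert torn != old and torn != new          # inconsistent mix of versions
assert recover(torn, new) == new            # whole again after redo
```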
<br />
{{TodoItem<br />
|When full page writes are off, write CRC to WAL and check file system blocks on recovery<br />
|If CRC check fails during recovery, remember the page in case a later CRC for that page properly matches.}}<br />
<br />
{{TodoItem<br />
|Write full pages during file system write and not when the page is modified in the buffer cache<br />
|This allows most full page writes to happen in the background writer. It might cause problems for applying WAL on recovery into a partially-written page, but later the full page will be replaced from WAL.}}<br />
<br />
{{TodoItem<br />
|Reduce WAL traffic so only modified values are written rather than entire rows<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg01589.php <nowiki>Reduction in WAL for UPDATEs</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow WAL information to recover corrupted pg_controldata<br />
* [http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php <nowiki>Re: [HACKERS] pg_resetxlog -r flag</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Find a way to reduce rotational delay when repeatedly writing last WAL page<br />
|Currently, an fsync of WAL requires the disk platter to complete a full rotation before the same page can be fsynced again. One idea is to write the WAL to different offsets, which might reduce the rotational delay. <br />
* [http://archives.postgresql.org/pgsql-hackers/2002-11/msg00483.php <nowiki>500 tpsQL + WAL log implementation</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow WAL logging to be turned off for a table, but the table might be dropped or truncated during crash recovery<br />
|Allow tables to bypass WAL writes and just fsync() dirty pages on commit. This should be implemented using ALTER TABLE, e.g. <nowiki>ALTER TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ]</nowiki>. Tables using non-default logging should not use referential integrity with default-logging tables. A table without dirty buffers during a crash could perhaps avoid the drop/truncate. <br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg01016.php <nowiki>Re: [Bizgres-general] WAL bypass for INSERT, UPDATE and</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow WAL logging to be turned off for a table, but the table would avoid being truncated/dropped<br />
|To do this, only a single writer can modify the table, and writes must happen only on new pages so the new pages can be removed during crash recovery. Readers can continue accessing the table. Such tables probably cannot have indexes. One complexity is the handling of indexes on TOAST tables. <br />
* [http://archives.postgresql.org/pgsql-hackers/2005-12/msg01016.php <nowiki>Re: [Bizgres-general] WAL bypass for INSERT, UPDATE and</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Speed WAL recovery by allowing more than one page to be prefetched<br />
|This should be done utilizing the same infrastructure used for prefetching in general to avoid introducing complex error-prone code in WAL replay. <br />
* [http://archives.postgresql.org/pgsql-general/2007-12/msg00683.php <nowiki>Slow PITR restore</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00497.php <nowiki>Re: [GENERAL] Slow PITR restore</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg01279.php <nowiki>Read-ahead and parallelism in redo recovery</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve WAL concurrency by increasing lock granularity<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00556.php <nowiki>Reworking WAL locking</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Be more aggressive about creating WAL files<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01325.php <nowiki>Re: PANIC caused by open_sync on Linux</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-07/msg01075.php <nowiki>PreallocXlogFiles</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2005-04/msg00556.php <nowiki>WAL/PITR additional items</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Have resource managers report the duration of their status changes<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg01468.php <nowiki>Recovery of Multi-stage WAL actions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Move pgfoundry's xlogdump to /contrib and have it rely more closely on the WAL backend code<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00035.php <nowiki>xlogdump</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Close deleted WAL files held open in *nix by long-lived read-only backends<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg01754.php <nowiki>Deleted WAL files held open by backends in Linux</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00060.php <nowiki>Re: Deleted WAL files held open by backends in Linux</nowiki>]<br />
}}<br />
<br />
== Optimizer / Executor ==<br />
<br />
{{TodoItem<br />
|Improve selectivity functions for geometric operators}}<br />
<br />
{{TodoItem<br />
|Precompile SQL functions to avoid overhead}}<br />
<br />
{{TodoItem<br />
|Create utility to compute accurate random_page_cost value}}<br />
<br />
{{TodoItem<br />
|Consider increasing the default values of from_collapse_limit, join_collapse_limit, and/or geqo_threshold<br />
* [http://archives.postgresql.org/message-id/4136ffa0905210551u22eeb31bn5655dbe7c9a3aed5@mail.gmail.com from_collapse_limit vs. geqo_threshold]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve ability to display optimizer analysis using OPTIMIZER_DEBUG}}<br />
<br />
{{TodoItem<br />
|Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage}}<br />
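The deviation test this item proposes is simple; a Python sketch (the 50% default threshold is a made-up example, not anything PostgreSQL defines):<br />

```python
def estimate_deviates(estimated, actual, threshold_pct=50.0):
    # True when the planner's row estimate and the actual row count
    # differ by more than threshold_pct percent, relative to actual.
    if actual == 0:
        return estimated != 0
    return abs(estimated - actual) / actual * 100.0 > threshold_pct

assert not estimate_deviates(100, 120)   # ~16.7% off: stay quiet
assert estimate_deviates(100, 10_000)    # 99% off: emit a NOTICE
```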
<br />
{{TodoItem<br />
|Have EXPLAIN ANALYZE report rows as floating-point numbers<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg01363.php <nowiki>explain analyze rows=%.0f</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-06/msg00108.php <nowiki>Re: explain analyze rows=%.0f</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Log statements where the optimizer row estimates were dramatically different from the number of rows actually found?}}<br />
<br />
{{TodoItem<br />
|Improve how ANALYZE computes in-doubt tuples<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-11/msg00771.php <nowiki>VACUUM/ANALYZE counting of in-doubt tuples</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider compressed annealing to search for query plans<br />
|This might replace GEQO.<br />
* http://archives.postgresql.org/message-id/15658.1241278636%40sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Consider using a hash for joining to a large IN (VALUES ...) list<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-05/msg00450.php <nowiki>Planning large IN lists</nowiki>]<br />
}}<br />
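The gain from hashing a large IN list can be illustrated with Python's built-in hash table (a set): build once, probe per row, instead of rescanning the list for every row. A toy sketch, not planner code:<br />

```python
def filter_with_in_list(rows, key, in_values):
    # Build side: hash the VALUES list once. Probe side: one O(1)
    # lookup per row, turning O(rows * values) into O(rows + values).
    probe = set(in_values)
    return [r for r in rows if r[key] in probe]

rows = [{"id": i} for i in range(10)]
hits = filter_with_in_list(rows, "id", [2, 5, 7])
assert [r["id"] for r in hits] == [2, 5, 7]
```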
<br />
{{TodoItem<br />
|Allow single batch hash joins to preserve outer pathkeys<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg00806.php Re: Potential Join Performance Issue]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|"lazy" hash tables - look up only the tuples that are actually requested<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid building the same hash table more than once during the same query<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid hashing for distinct and then re-hashing for hash join<br />
* [http://archives.postgresql.org/message-id/4136ffa0902191346g62081081v8607f0b92c206f0a@mail.gmail.com Re: Fixing Grittner's planner issues]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-04/msg00153.php a few crazy ideas about hash joins]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow hashing to be used on arrays, if the element type is hashable<br />
* http://archives.postgresql.org/message-id/11087.1244905821@sss.pgh.pa.us<br />
}}<br />
<br />
{{TodoItem<br />
|Improve use of expression indexes for ORDER BY <br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01553.php <nowiki>Resjunk sort columns, Heikki's index-only quals patch, and bug #5000</nowiki>]<br />
}}<br />
<br />
== Background Writer ==<br />
<br />
{{TodoItem<br />
|Consider having the background writer update the transaction status hint bits before writing out the page<br />
|Implementing this requires the background writer to have access to system catalogs and the transaction status log.}}<br />
<br />
{{TodoItem<br />
|Consider adding buffers the background writer finds reusable to the free list <br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00781.php <nowiki>Background LRU Writer/free list</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Automatically tune bgwriter_delay based on activity rather than using a fixed interval<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00781.php <nowiki>Background LRU Writer/free list</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider whether increasing BM_MAX_USAGE_COUNT improves performance<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-06/msg01007.php <nowiki>Bgwriter LRU cleaning: we've been going at this all wrong</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Test to see if calling PreallocXlogFiles() from the background writer will help with WAL segment creation latency<br />
* [http://archives.postgresql.org/pgsql-patches/2007-06/msg00340.php <nowiki>Re: Load Distributed Checkpoints, final patch</nowiki>]<br />
}}<br />
<br />
== Concurrent Use of Resources ==<br />
<br />
{{TodoItem<br />
|Do async I/O for faster random read-ahead of data<br />
|Async I/O allows multiple I/O requests to be sent to the disk with results coming back asynchronously.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00820.php <nowiki>Asynchronous I/O Support</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2007-09/msg00255.php <nowiki>Re: random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00027.php <nowiki>There's random access and then there's random access</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-01/msg00170.php <nowiki>Bitmap index scan preread using posix_fadvise (Was: There's random access and then there's random access)</nowiki>]<br />
The above patch is already applied as of 8.4, but it still remains to figure out how to handle plain indexscans effectively.<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-01/msg00806.php Problems with the patch submitted for posix_fadvise in index scans]<br />
}}<br />
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for better I/O utilization<br />
|This would allow a single query to make use of multiple I/O channels simultaneously. One idea is to create a background reader that can pre-fetch sequential and index scan pages needed by other backends. This could be expanded to allow concurrent reads from multiple devices in a partitioned table.}}<br />
<br />
{{TodoItem<br />
|Experiment with multi-threaded backend for better CPU utilization<br />
|This would allow several CPUs to be used for a single query, such as for sorting or query execution.<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00945.php <nowiki>Multi CPU Queries - Feedback and/or suggestions wanted!</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|SMP scalability improvements<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00439.php <nowiki>Straightforward changes for increased SMP scalability</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00206.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00361.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
}}<br />
<br />
== TOAST ==<br />
<br />
{{TodoItem<br />
|Allow user configuration of TOAST thresholds<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-02/msg00213.php <nowiki>Re: Proposed adjustments in MaxTupleSize and toastthresholds</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00082.php <nowiki>pg_lzcompress strategy parameters</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce unnecessary cases of deTOASTing<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-09/msg00895.php <nowiki>Re: [PATCHES] Eliminate more detoast copies for packed varlenas</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce costs of repeat de-TOASTing of values<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg01096.php <nowiki>WIP patch: reducing overhead for repeat de-TOASTing</nowiki>]<br />
}}<br />
<br />
== Miscellaneous Performance ==<br />
<br />
{{TodoItem<br />
|Use mmap() rather than SYSV shared memory, or use mmap() to write WAL files?<br />
|This would remove the requirement for SYSV SHM but would introduce portability issues. Anonymous mmap (or mmap to /dev/zero) is required to prevent I/O overhead.}}<br />
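The anonymous mapping mentioned above can be demonstrated with Python's mmap module (fd -1 requests an anonymous mapping, roughly the "anonymous mmap or mmap to /dev/zero" the item refers to); a forked child would inherit and share the region, standing in for a SysV segment. Illustrative only:<br />

```python
import mmap

def shared_region(size=4096):
    # fd -1: anonymous mapping, no backing file, so no filesystem
    # I/O overhead -- the property the item says is required.
    return mmap.mmap(-1, size)

buf = shared_region()
buf[:5] = b"hello"
assert buf[:5] == b"hello"
buf.close()
```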
<br />
{{TodoItem<br />
|Consider mmap()'ing files into a backend?<br />
|Doing I/O to large tables would consume a lot of address space or require frequent mapping/unmapping. Extending the file also causes mapping problems that might require mapping only individual pages, leading to thousands of mappings. Another problem is that there is no way to _prevent_ I/O to disk from the dirty shared buffers so changes could hit disk before WAL is written.}}<br />
<br />
{{TodoItem<br />
|Add a script to ask system configuration questions and tune postgresql.conf}}<br />
<br />
{{TodoItem<br />
|Consider ways of storing rows more compactly on disk:<br />
* Reduce the row header size?<br />
* Consider reducing on-disk varlena length from four bytes to two because a heap row cannot be more than 64k in length}}<br />
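The two-byte varlena idea rests on simple arithmetic: an unsigned 16-bit length covers 0 to 65535 bytes, enough for any value inline in a heap row that cannot exceed 64k. A toy Python packing sketch (not the real varlena format):<br />

```python
import struct

def pack_varlena(data: bytes) -> bytes:
    # 2-byte little-endian length header instead of 4 bytes.
    assert len(data) <= 0xFFFF, "row values cannot exceed 64k"
    return struct.pack("<H", len(data)) + data

def unpack_varlena(buf: bytes) -> bytes:
    (length,) = struct.unpack_from("<H", buf)
    return buf[2:2 + length]

payload = b"x" * 300
packed = pack_varlena(payload)
assert len(packed) == 302                 # saves 2 bytes per value
assert unpack_varlena(packed) == payload
```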
<br />
{{TodoItem<br />
|Consider transaction start/end performance improvements<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00948.php <nowiki>Reducing Transaction Start/End Contention</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00361.php <nowiki>Re: Reducing Transaction Start/End Contention</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow configuration of backend priorities via the operating system<br />
|Though backend priorities make priority inversion during lock waits possible, research shows that this is not a huge problem.<br />
* [http://archives.postgresql.org/pgsql-general/2007-02/msg00493.php <nowiki>Priorities for users or queries?</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider increasing the minimum allowed number of shared buffers<br />
* [http://archives.postgresql.org/pgsql-bugs/2008-02/msg00157.php <nowiki>Re: [PATCH] Don't bail with legitimate -N/-B options</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider if CommandCounterIncrement() can avoid its AcceptInvalidationMessages() call<br />
* [http://archives.postgresql.org/pgsql-committers/2007-11/msg00585.php <nowiki>pgsql: Avoid incrementing the CommandCounter when</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider Cartesian joins when both relations are needed to form an indexscan qualification for a third relation<br />
* [http://archives.postgresql.org/pgsql-performance/2007-12/msg00090.php <nowiki>Re: TB-sized databases</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider not storing a NULL bitmap on disk if all the NULLs are trailing<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-12/msg00624.php <nowiki>Proposal for Null Bitmap Optimization(for Trailing NULLs)</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2007-12/msg00109.php <nowiki>Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)</nowiki>]<br />
}}<br />
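The condition behind this optimization is easy to state in code: a bitmap is only needed when some NULL precedes a non-NULL column, since trailing NULLs can be inferred from a stored column count. A Python sketch of the check:<br />

```python
def needs_null_bitmap(row):
    # True only if a NULL is followed by a non-NULL column;
    # rows whose NULLs are all trailing can omit the bitmap.
    seen_null = False
    for value in row:
        if value is None:
            seen_null = True
        elif seen_null:
            return True   # interior NULL: bitmap required
    return False

assert not needs_null_bitmap((1, "a", None, None))  # trailing only
assert needs_null_bitmap((1, None, "a"))            # interior NULL
```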
<br />
{{TodoItem<br />
|Sort large UPDATE/DELETEs so they are done in heap order<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg01119.php <nowiki>Possible future performance improvement: sort updates/deletes by ctid</nowiki>]<br />
}}<br />
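Since a ctid is a (block, offset) pair, sorting the target tuples lexicographically makes the operation touch heap pages sequentially instead of at random. A minimal sketch:<br />

```python
def heap_order(ctids):
    # Tuples sort by block first, then offset within the block,
    # so pages are visited in on-disk order.
    return sorted(ctids)

scattered = [(7, 2), (0, 5), (7, 1), (3, 9)]
assert heap_order(scattered) == [(0, 5), (3, 9), (7, 1), (7, 2)]
```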
<br />
{{TodoItem<br />
|Allow one transaction to see tuples using the snapshot of another transaction<br />
|This would assist multiple backends in working together. <br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00400.php <nowiki>Transaction Snapshot Cloning</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider decreasing the I/O caused by updating tuple hint bits<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-05/msg00847.php <nowiki>Hint Bits and Write I/O</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-07/msg00199.php <nowiki>Re: [HACKERS] Hint Bits and Write I/O</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Avoid the requirement of freezing pages that are infrequently modified <br />
|If all rows on a page are visible, it is possible to set a bit in the visibility map (once the visibility map is 100% reliable) and skip freezing the page, avoiding a page rewrite.<br />
* http://archives.postgresql.org/message-id/4BF701CF.2090205@agliodbs.com<br />
* http://archives.postgresql.org/pgsql-hackers/2010-06/msg00082.php<br />
}}<br />
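The one-bit-per-page structure the item relies on can be sketched as a small bitmap; this is a toy model, not the actual visibility map code:<br />

```python
class VisibilityMap:
    # One bit per heap page: set when every row on the page is
    # visible to all transactions, so freezing could skip the page.
    def __init__(self, n_pages):
        self.bits = bytearray((n_pages + 7) // 8)

    def set_all_visible(self, page):
        self.bits[page // 8] |= 1 << (page % 8)

    def can_skip_freeze(self, page):
        return bool(self.bits[page // 8] & (1 << (page % 8)))

vm = VisibilityMap(16)
vm.set_all_visible(3)
assert vm.can_skip_freeze(3)
assert not vm.can_skip_freeze(4)
```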
<br />
{{TodoItem<br />
|Avoid reading in b-tree pages when replaying vacuum records in hot standby mode<br />
* [http://archives.postgresql.org/message-id/1272571938.4161.14739.camel@ebony <nowiki>Hot Standby tuning for btree_xlog_vacuum()</nowiki>]<br />
}}<br />
<br />
== Miscellaneous Other ==<br />
<br />
{{TodoItem<br />
|Deal with encoding issues for filenames in the server filesystem<br />
* {{MessageLink|20090413184335.39BE.52131E4D@oss.ntt.co.jp|a proposed patch here}}<br />
* {{MessageLink|8484.1244655656@sss.pgh.pa.us|some issues about it here}}<br />
* {{MessageLink|20100107103740.97A5.52131E4D@oss.ntt.co.jp|Windows-specific patch here}}<br />
}}<br />
<br />
{{TodoItem<br />
|Deal with encoding issues in the output of localeconv()<br />
* [http://archives.postgresql.org/message-id/40c6d9160904210658y590377cfw6dbbecb53d2b8be0@mail.gmail.com bug report]<br />
* [http://archives.postgresql.org/message-id/49EF8DA0.90008@tpf.co.jp draft patch]<br />
* [http://archives.postgresql.org/message-id/21710.1243620986@sss.pgh.pa.us review of patch]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide schema name and other fields available from SQL GET DIAGNOSTICS in error reports<br />
* [http://archives.postgresql.org/message-id/dcc563d10810211907n3c59a920ia9eb7cd2a6d5ea58@mail.gmail.com <nowiki>How to get schema name which violates fk constraint</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-11/msg00846.php <nowiki>patch - Report the schema along table name in a referential failure error message</nowiki>]<br />
* {{MessageLink|3191.1263306359@sss.pgh.pa.us|Re: NOT NULL violation and error-message}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-08/msg00213.php <nowiki>the case for machine-readable error fields</nowiki>]<br />
}}<br />
<br />
{{TodoItemEasy<br />
| Provide [http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html#LIBPQ-CONNECT-FALLBACK-APPLICATION-NAME fallback_application_name] in contrib/pgbench, oid2name, and dblink.<br />
* {{MessageLink|w2g9837222c1004070216u3bc46b3ahbddfdffdbfb46212@mail.gmail.com|fallback_application_name and pgbench}}<br />
}}<br />
<br />
== Source Code ==<br />
<br />
{{TodoItem<br />
|Add use of 'const' for variables in source tree}}<br />
<br />
{{TodoItemEasy<br />
|Remove warnings created by -Wcast-align}}<br />
<br />
{{TodoItem<br />
|Move platform-specific ps status display info from ps_status.c to ports}}<br />
<br />
{{TodoItem<br />
|Add optional CRC checksum to heap and index pages<br />
|One difficulty is how to prevent hint bit changes from affecting the computed CRC checksum.<br />
* http://archives.postgresql.org/message-id/19934.1226601952%40sss.pgh.pa.us<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00002.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01028.php <nowiki>double-buffering page writes</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00524.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01101.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-12/msg00011.php <nowiki>Re: Block-level CRC checks</nowiki>]<br />
}}<br />
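One approach to the hint-bit difficulty noted above is to zero the hint bits before computing the checksum, so that a hint-bit-only change leaves the stored CRC valid. A Python sketch using zlib.crc32 (HINT_BIT_MASK and the flag offsets are made up for illustration, not the real page layout):<br />

```python
import zlib

HINT_BIT_MASK = 0b0000_1111   # hypothetical hint-bit positions

def page_crc(page: bytes, flag_offsets=()):
    # Mask hint bits out of a copy, then checksum: a hint-bit-only
    # update no longer invalidates the stored CRC.
    masked = bytearray(page)
    for off in flag_offsets:
        masked[off] &= ~HINT_BIT_MASK & 0xFF
    return zlib.crc32(bytes(masked))

page = bytearray(b"\x00" * 32)
before = page_crc(bytes(page), flag_offsets=[8])
page[8] |= 0b0000_0100          # flip a hint bit only
assert page_crc(bytes(page), flag_offsets=[8]) == before
page[9] = 0xFF                  # real data change
assert page_crc(bytes(page), flag_offsets=[8]) != before
```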
<br />
{{TodoItem<br />
|Consider a faster CRC32 algorithm<br />
* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01112.php<br />
}}<br />
<br />
{{TodoItem<br />
|Allow cross-compiling by generating the zic database on the target system}}<br />
<br />
{{TodoItem<br />
|Improve NLS maintenance of libpgport messages linked onto applications}}<br />
<br />
{{TodoItem<br />
|Improve the module installation experience (/contrib, etc)<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-04/msg00132.php <nowiki>modules</nowiki>]<br />
* {{messageLink|ca33c0a30807231640n6fb4035dod8121a18aa1fa29c@mail.gmail.com|Re: PostgreSQL extensions packaging}}<br />
* {{messageLink|ca33c0a30804061349s41b4d8fcsa9c579454b27ecd2@mail.gmail.com|Database owner installable modules patch}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg00855.php <nowiki>Re: contrib function naming, and upgrade issues</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-05/msg00912.php <nowiki>search_path vs extensions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Use UTF8 encoding for NLS messages so all server encodings can read them properly}}<br />
<br />
{{TodoItem<br />
|Allow creation of universal binaries for Darwin<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00884.php <nowiki>Getting to universal binaries for Darwin</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider GnuTLS if OpenSSL license becomes a problem<br />
* [http://archives.postgresql.org/pgsql-patches/2006-05/msg00040.php <nowiki>[PATCH] Add support for GnuTLS</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-12/msg01213.php <nowiki>TODO: GNU TLS</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider making NAMEDATALEN more configurable in future releases}}<br />
<br />
{{TodoItem<br />
|Research use of signals and sleep wake ups<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-07/msg00003.php <nowiki>Restartable signals 'n all that</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Allow C++ code to more easily access backend code<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg00302.php <nowiki>Mostly Harmless: Welcoming our C++ friends</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider simplifying how memory context resets handle child contexts<br />
* [http://archives.postgresql.org/pgsql-patches/2007-08/msg00067.php <nowiki>Re: Memory leak in nodeAgg</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Create three versions of libpgport to simplify client code<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-10/msg00154.php <nowiki>8.4 TODO item: make src/port support libpq and ecpg directly</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve detection of shared memory segments being used by others by checking the SysV shared memory field 'nattch'<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00656.php <nowiki>postgresql in FreeBSD jails: proposal</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00673.php <nowiki>Re: postgresql in FreeBSD jails: proposal</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Implement the non-threaded Avahi service discovery protocol<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00939.php <nowiki>Re: [PATCHES] Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-02/msg00097.php <nowiki>Re: Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg01211.php <nowiki>Re: [PATCHES] Avahi support for Postgresql</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-04/msg00001.php <nowiki>Re: [HACKERS] Avahi support for Postgresql</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Reduce data row alignment requirements on some 64-bit systems<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg00369.php <nowiki>[WIP] Reduce alignment requirements on 64-bit systems.</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Restructure TOAST internal storage format for greater flexibility<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-11/msg00049.php <nowiki>Re: PG_PAGE_LAYOUT_VERSION 5 - time for change</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
| Add regression tests for pg_dump/restore<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01967.php <nowiki>"make install-check-pg_dump" target in src/regress</nowiki>]<br />
}}<br />
<br />
=== Documentation ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Convert single quotes to apostrophes in the PDF documentation<br />
* [http://archives.postgresql.org/pgsql-docs/2007-12/msg00059.php <nowiki>SGML docs and pdf single-quotes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Provide a manpage for postgresql.conf<br />
* {{messageLink|20080819194311.GH4428@alvh.no-ip.org|A smaller default postgresql.conf}}<br />
* {{messageLink|200808211910.37524.peter_e@gmx.net|A smaller default postgresql.conf}}<br />
}}<br />
<br />
{{TodoItem<br />
|Change the manpage-generating toolchain to use the new XML-based docbook2x tools<br />
* {{messageLink|200808211910.37524.peter_e@gmx.net|A smaller default postgresql.conf}}<br />
}}<br />
<br />
{{TodoItem<br />
|Consider changing documentation format from SGML to XML<br />
* [http://archives.postgresql.org/pgsql-docs/2006-12/msg00152.php <nowiki>Re: Authoring Tools WAS: Switching to XML</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Document support for N<nowiki>' '</nowiki> national character string literals, if it matches the SQL standard<br />
* http://archives.postgresql.org/message-id/1275895438.1849.1.camel@fsopti579.F-Secure.com<br />
}}<br />
<br />
{{TodoItem<br />
|Add diagrams to the documentation<br />
* http://archives.postgresql.org/pgsql-docs/2010-07/msg00001.php<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== /contrib/pg_upgrade ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Remove copy_dir() code, or use it<br />
}}<br />
<br />
{{TodoItem<br />
|Handle large object comments<br />
|This is difficult to do because the large object doesn't exist when --schema-only is loaded.<br />
}}<br />
<br />
{{TodoItem<br />
|Consider using pg_depend for checking object usage in version.c<br />
}}<br />
<br />
{{TodoItem<br />
|If reindex is necessary, allow it to be done in parallel with pg_dump custom format<br />
}}<br />
<br />
{{TodoItem<br />
|Migrate pg_statistic by dumping it out as a flat file, so analyze is not necessary<br />
|pg_class.oid is not preserved, so schema.tablename must be used.<br />
}}<br />
<br />
{{TodoItem<br />
|Improve testing, perhaps using the buildfarm<br />
|The buildfarm has access to multiple versions of PostgreSQL.<br />
}}<br />
<br />
{{TodoItem<br />
|Create machine-readable output of pg_controldata<br />
|This would avoid parsing its output.<br />
}}<br />
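Today callers must screen-scrape pg_controldata's "Key: value" lines; the item asks for output a program can consume directly. A sketch of the parsing that would become unnecessary (the sample lines are illustrative):<br />

```python
import json

def controldata_to_json(text: str) -> str:
    # Convert "Key:  value" lines into JSON -- the kind of
    # machine-readable form the item asks the tool itself to emit.
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return json.dumps(fields)

sample = ("pg_control version number:            903\n"
          "Database cluster state:               in production")
parsed = json.loads(controldata_to_json(sample))
assert parsed["Database cluster state"] == "in production"
```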
<br />
{{TodoEndSubsection}}<br />
<br />
=== Windows ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Remove configure.in check for link failure when cause is found}}<br />
<br />
{{TodoItem<br />
|Remove readdir() errno patch when runtime/mingwex/dirent.c rev 1.4 is released}}<br />
<br />
{{TodoItem<br />
|Allow psql to use readline once non-US code pages work with backslashes}}<br />
<br />
{{TodoItem<br />
|Fix problem with shared memory on the Win32 Terminal Server}}<br />
<br />
{{TodoItem<br />
|Improve signal handling<br />
* [http://archives.postgresql.org/pgsql-patches/2005-06/msg00027.php <nowiki>Simplify Win32 Signaling code</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Convert MSVC build system to remove most batch files<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-08/msg00961.php <nowiki>MSVC build system</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Support pgxs when using MSVC}}<br />
<br />
{{TodoItem<br />
|Fix MSVC NLS support, like for to_char()<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-02/msg00485.php <nowiki>NLS on MSVC strikes back!</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-patches/2008-02/msg00038.php <nowiki>Fix for 8.3 MSVC locale (Was [HACKERS] NLS on MSVC strikes back!)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Find a correct rint() substitute on Windows<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00808.php <nowiki>Minor bug in src/port/rint.c</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Fix global namespace issues when using multiple terminal server sessions<br />
* [http://archives.postgresql.org/message-id/48F3BFCC.8030107@dunslane.net problems with Windows global namespace]}}<br />
<br />
{{TodoItem<br />
|Change from the current autoconf/gmake build system to cmake<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-12/msg01869.php <nowiki>About CMake (was Re: [COMMITTERS] pgsql: Append major version number and for libraries soname major)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Improve consistency of path separator usage<br />
* http://archives.postgresql.org/message-id/49C0BDC5.4010002@hagander.net<br />
}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
=== Wire Protocol Changes ===<br />
{{TodoSubsection}}<br />
<br />
{{TodoItem<br />
|Allow dynamic character set handling}}<br />
<br />
{{TodoItem<br />
|Add decoded type, length, precision}}<br />
<br />
{{TodoItem<br />
|Use compression?}}<br />
<br />
{{TodoItem<br />
|Update clients to use data types, typmod, schema.table.column names of result sets using new statement protocol}}<br />
<br />
{{TodoEndSubsection}}<br />
<br />
== Exotic Features ==<br />
<br />
{{TodoItem<br />
|Add pre-parsing phase that converts non-ISO syntax to supported syntax<br />
|This could allow SQL written for other databases to run without modification.}}<br />
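The shape of such a pre-parsing phase can be sketched with a rewrite table; real emulation would need a proper parser rather than regexes, and the NVL/SYSDATE rewrites below are hypothetical examples, not an actual compatibility layer:<br />

```python
import re

# Hypothetical rewrite table for a couple of Oracle-isms.
REWRITES = [
    (re.compile(r"\bNVL\s*\(", re.IGNORECASE), "COALESCE("),
    (re.compile(r"\bSYSDATE\b", re.IGNORECASE), "CURRENT_TIMESTAMP"),
]

def preparse(sql: str) -> str:
    # Run each rewrite over the statement before the real parser.
    for pattern, replacement in REWRITES:
        sql = pattern.sub(replacement, sql)
    return sql

assert preparse("SELECT NVL(a, 0), SYSDATE FROM t") == \
       "SELECT COALESCE(a, 0), CURRENT_TIMESTAMP FROM t"
```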
<br />
{{TodoItem<br />
|Allow plug-in modules to emulate features from other databases}}<br />
<br />
{{TodoItem<br />
|Add features of Oracle-style packages<br />
|A package would be a schema with session-local variables, public/private functions, and initialization functions. It is also possible to implement these capabilities in any schema and not use a separate &quot;packages&quot; syntax at all.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg00384.php <nowiki>proposal for PL packages for 8.3.</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Consider allowing control of upper/lower case folding of unquoted identifiers<br />
* [http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php <nowiki>Bringing PostgreSQL torwards the standard regarding case folding</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg01527.php <nowiki>Re: [SQL] Case Preservation disregarding case sensitivity?</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-03/msg00849.php <nowiki>TODO Item: Consider allowing control of upper/lower case folding of unquoted, identifiers</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-07/msg00415.php <nowiki>Identifier case folding notes</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Add autonomous transactions<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php <nowiki>autonomous transactions</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Give query progress indication<br />
* [[Query progress indication]]<br />
}}<br />
<br />
{{TodoItem<br />
|Rethink our type system<br />
* [[Rethinking datatypes]]<br />
}}<br />
<br />
== Features We Do ''Not'' Want ==<br />
<br />
{{TodoItem<br />
|All backends running as threads in a single process (not wanted)<br />
|This eliminates the process protection we get from the current setup. Thread creation is usually the same overhead as process creation on modern systems, so it seems unwise to use a pure threaded model, and MySQL and DB2 have demonstrated that threads introduce as many issues as they solve. Threading specific operations such as I/O, seq scans, and connection management has been discussed and will probably be implemented to enable specific performance features. Moving to a threaded engine would also require halting all other work on PostgreSQL for one to two years.}}<br />
<br />
{{TodoItem<br />
|Optimizer hints (not wanted)<br />
|Optimizer hints are used to work around problems in the optimizer and introduce upgrade and maintenance issues. We would rather have the problems reported and fixed. We have discussed a more sophisticated system of per-class cost adjustment instead, but a specification remains to be developed.<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-08/msg00506.php <nowiki>Re: An Idea for planner hints</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00517.php <nowiki>Index Tuning Features</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-10/msg00663.php <nowiki>Re: [PERFORM] Hints proposal</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Embedded server (not wanted)<br />
|While PostgreSQL clients run fine in limited-resource environments, the server requires multiple processes and a stable pool of resources to run reliably and efficiently. Stripping down the PostgreSQL server to run in the same process address space as the client application would add too much complexity and too many failure cases. Besides, there are several very mature embedded SQL databases already available.}}<br />
<br />
{{TodoItem<br />
|Obfuscated function source code (not wanted)<br />
|Obfuscating function source code has minimal protective benefits because anyone with super-user access can find a way to view the code. At the same time, it would greatly complicate backups and other administrative tasks. To prevent non-super-users from viewing function source code, remove SELECT permission on pg_proc.<br />
* [http://archives.postgresql.org/pgsql-general/2008-09/msg00668.php <nowiki>Obfuscated stored procedures (was Re: Oracle and Postgresql)</nowiki>]<br />
}}<br />
<br />
{{TodoItem<br />
|Indeterminate behavior for the GROUP BY clause (not wanted)<br />
|At least one other database product allows specification of a subset of the result columns which GROUP BY would need to be able to provide predictable results; the server is free to return any value from the group. This is not viewed as a desirable feature. PostgreSQL 9.1 will allow result columns that are not referenced by GROUP BY if a primary key for the same table is referenced in GROUP BY.<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-03/msg00297.php <nowiki>Re: SQL compatibility reminder: MySQL vs PostgreSQL</nowiki>]<br />
}}<br />
<br />
</div><br />
<br />
[[Category:Todo]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Submitting_a_Patch&diff=12046Submitting a Patch2010-09-24T00:59:24Z<p>Schmiddy: /* Patch submission */ replace info about diff generation with cvs to git</p>
<hr />
<div>== Initial patch design ==<br />
<br />
If you have a trivial patch that serves an obvious need, you may be able to write the patch and submit it directly to the [http://archives.postgresql.org/pgsql-hackers/ pgsql-hackers] mailing list without having its design reviewed first. But in general, a non-trivial change should be discussed (potentially before the code is even written) on the [http://archives.postgresql.org/pgsql-hackers/ pgsql-hackers] list before being submitted as a patch.<br />
<br />
For general coding style guidelines, see the [[Developer FAQ]] and the [http://developer.postgresql.org/pgdocs/postgres/source.html PostgreSQL Coding Conventions].<br />
<br />
=== Design your interface first ===<br />
<br />
Ask yourself these questions: <br />
<br />
* Will the user interact with this new feature? If so, how? <br />
* What is the syntax? Have ideas, and the ability to defend technical decisions you believe strongly in.<br />
* What are the exact semantics/behaviors?<br />
* Are there any backward compatibility issues? <br />
* Get community buy-in at this level of detail before you start coding. But not necessarily consensus.<br />
* Write an opening paragraph to your email to the -hackers list that answers these questions:<br />
**This is the kind of problem I'm trying to solve<br />
**This is what it is doing right now<br />
**This is what it will do.<br />
<br />
Mostly, get someone from the community involved in your ideas as early as possible so that you can even get half-baked ideas vetted early, rather than creating something in a vacuum. Similarly, it's easier to make progress and keep patches focused if you concentrate on the smallest portion of the idea you can execute perfectly. Resist the temptation to build a giant patch all at once, as those are much less likely to be reviewed usefully and therefore committed. You should take a look at how your patch will eventually be [[Reviewing a Patch|reviewed]], so you can make sure that review is likely to succeed.<br />
<br />
=== Save us the trouble of reformatting your code ===<br />
<br />
Please read [http://developer.postgresql.org/pgdocs/postgres/source.html PostgreSQL Coding Conventions]. Also, follow the style of the adjacent code! [http://archives.postgresql.org/pgsql-hackers/2009-08/msg01001.php Suggestions from Tom] clarify some of the trickier situations you might run into.<br />
<br />
Naming things is important, and when in doubt, ask someone else to help you with names. We tend to use [http://en.wikipedia.org/wiki/CamelCase CamelCase] or underscores: thisStyleIsOkay or this_is_okay_too.<br />
<br />
Generally, try to blend in with the surrounding code. Do not use #ifdef's to enable your changes. Comments are for clarification, not for delineating your code from the surroundings.<br />
<br />
Please remove any spurious whitespace; "git diff --color" makes it stand out like a sore thumb, in red.<br />
<br />
== Patch submission==<br />
<br />
Once you believe your patch is complete, you can submit it via e-mail to the [http://archives.postgresql.org/pgsql-hackers/ pgsql-hackers] mailing list. At that time, or after you wait for initial feedback, you should also add it to the page for the next [https://commitfest.postgresql.org/ CommitFest]. <br />
<br />
Normally changes should be submitted as a single patch that includes every file touched. If a large patch can be logically separated into distinct, separately committable sections for easier review, with a clear order of application described when applicable, that can be more straightforward for reviewers to work with. Patches in [http://en.wikipedia.org/wiki/Diff#Context_format Context Diff] format are preferred. See [[Working_with_Git#Context_diffs_with_Git|Working with git]] for ways to do this, and [http://petereisentraut.blogspot.com/2009/09/how-to-submit-patch-by-email.html How to submit a patch by email] for more details about mailing in patches. If you're a new submitter, however, the suggestion there about using your judgment on patch formatting is not a recommended practice--you should be using the standard context diff format.<br />
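As a hedged sketch, the mechanics of producing a patch file from git look roughly like this (the repository here is a throwaway created just for illustration; a real submission would diff your feature branch against the upstream master branch):<br />
<br />
```shell
# Self-contained sketch: make a tiny repo, commit a change, emit a patch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "You"
echo "int x = 1;" > file.c
git add file.c && git commit -qm "base"
echo "int y = 2;" >> file.c
git commit -qam "my feature"
git diff HEAD^ HEAD > /tmp/my-feature-v1.patch   # unified diff
# To convert to the preferred context format (needs patchutils):
#   filterdiff --format=context /tmp/my-feature-v1.patch
```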
<br />
Please note that PostgreSQL is licensed under a BSD license. By posting a patch to the public PostgreSQL mailing lists, you are giving the PostgreSQL Global Development Group the non-revocable right to distribute your patch under the BSD license.<br />
<br />
Please include all of the following information with each patch submitted:<br />
<br />
* Project name.<br />
* A uniquely identifiable file name, so we can tell the difference between your v1 and your v24.<br />
* What the patch does in a short paragraph.<br />
* Whether the patch is for discussion or for application (see WIP notes below)<br />
* Which CVS branch the patch is against (ordinarily this will be HEAD). For more on branches in PostgreSQL, see [[CVS Branch Management]].<br />
* Whether it compiles and tests successfully, so we know nothing obvious is broken.<br />
* Whether it contains any platform-specific items and if so, has it been tested on other platforms.<br />
* Confirm that the patch includes [[Regression test authoring|regression tests]] to check the new feature actually works as described.<br />
* Include some docs on how to use the new feature, including examples.<br />
* Describe the effect your patch has on performance, if any. If the patch is intended to improve performance, it's a good idea to include some reproducible tests to demonstrate the improvement. If a reviewer cannot duplicate your claimed performance improvement in a short period of time, it's very likely your patch will be bounced. Do not expect that a reviewer is going to find your performance feature so interesting that they will build an entire test suite to prove it works. You should have done that as part of patch validation, and included the necessary framework for testing with the submission.<br />
* Try to include a few lines about why you chose to do things particular ways, rather than let your reviewer guess what was happening. This can be done as code comments, but it might also be an additional reviewers' guide, or additions to a README file in one of the code directories.<br />
* If your patch addresses a [[Todo]] item, please give the name of the Todo item in your email. This is so that the reviewers will know that the item needs to be marked as done if your patch is applied.<br />
<br />
The objective of all of these suggested items is to ensure that the reviewer's time is not wasted. You spent time writing the code, but that does '''not''' mean you can demand time, energy and interest from a reviewer. Make it easy on yourself and others so that your patch is accepted quickly, easily and with good humor on all sides.<br />
<br />
It is helpful for early patches, ones not intended to be of commit quality, to be labelled clearly as such so that the appropriate style of review is done. The abbreviation WIP ("Work in Progress") is the standard shorthand to attach to patches intended for review not as a commit candidate, but for design feedback. Labelling your patch as a WIP on your e-mail subject line and on the matching description in the CF application will advise reviewers to focus more on the general approach, rather than on things like coding style that can normally be ignored in the early portion of a patch's lifecycle.<br />
<br />
Submitting the patch is just the first step towards getting it committed. Very few patches are committed exactly as originally submitted, even those submitted by experienced professional developers. For any non-trivial patch you should plan for at least 3 versions before final acceptance.<br />
<br />
The easiest way to get your patch rejected is to make lots of unrelated changes, like reformatting lines, correcting comments you felt were poorly worded etc. Each patch should have the minimum set of changes required to fulfil the single stated objective.<br />
<br />
=== Submission timing ===<br />
<br />
You need to pay attention to what the community work cycle is. If you're sending in a brand new idea in the beta phase, don't be surprised if no one is paying attention because they are focused on release work. Come back when the beta is done, please!<br />
<br />
PostgreSQL development is also organized with periodic [[CommitFest|CommitFests]], which are periods where new development halts in order to focus on patch review and committing. It's best if you can avoid sending in a new patch during the occasional weeks when there is an active CommitFest; you can check the schedule via the [https://commitfest.postgresql.org/ CommitFest application]. If your schedule doesn't allow waiting until an active CommitFest is over, you should explicitly label your submission as intended for the next CommitFest, not the current one, so that it's clear it's not intended to be part of the active review process.<br />
<br />
== Patch review and commit ==<br />
<br />
There are basically three different workflows a patch can follow after it has been submitted that lead to it being committed:<br />
<br />
Workflow A:<br />
# You post patch to pgsql-hackers<br />
# A committer picks it up immediately and commits it.<br />
<br />
Workflow B:<br />
# You post a patch to pgsql-hackers<br />
# You add the patch to the [http://commitfest.postgresql.org/action/commitfest_view/open open commitfest] queue<br />
# A committer picks up the patch from the queue, and commits it<br />
<br />
Workflow C:<br />
# You post a patch to pgsql-hackers<br />
# Bruce adds the patch to a list of unapplied patches<br />
# At the beginning of the next commit fest, Alvaro (with the help from others, I hope) goes through the list, and adds the patch to the [http://commitfest.postgresql.org/action/commitfest_view/open open commitfest] queue<br />
# A [[Committers|committer]] picks up the patch from the queue, and commits it<br />
<br />
At any of these stages, your patch might instead be rejected for technical, style, or other reasons. These rejections will normally come with feedback on whether an improved version of that patch would be more acceptable. In those cases, you should consider updating your patch based on that feedback and re-submit.<br />
<br />
== Followup on submissions ==<br />
<br />
=== How do you get someone to respond to you? ===<br />
<br />
You've sent an email to -hackers and no one has responded. What do you do?<br />
<br />
* Make sure you've added your patch to the [http://commitfest.postgresql.org/action/commitfest_view/open open commitfest] queue.<br />
* Start out by reviewing a patch or responding to email on the lists, even if it is unrelated to what you're doing.<br />
* Start with submitting a patch that is small and uncontroversial to help them understand you, and to get you familiar with the overall process.<br />
* People are more willing to listen and work with someone who is already contributing.<br />
* Also, in our community -- if no one objects, then there is implicit approval. Within reason!<br />
<br />
Participating in community is a process, not a single event.<br />
<br />
=== Submitting patch updates ===<br />
<br />
When submitting a new version of a previously submitted patch, you should do a few additional things:<br />
<br />
* Uniquely identify the new version, usually done via an incremented suffix on the name of the patch<br />
* Make sure it's easy to find any earlier discussion of the patch. Don't expect that everyone will still be able to find previous submissions on their own. Either fully duplicate the information about the patch from your original messages, or provide a clear link to the earlier message via the [http://archives.postgresql.org/pgsql-hackers/ archives]. Note that you can link to your earlier post using the e-mail message ID of what you sent earlier, perhaps by checking your sent e-mail for it. That type of link is preferred because links to the archive by message number might sometimes get renumbered. See [[Template:MessageLink]] for more details.</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=PostgreSQL_for_Oracle_DBAs&diff=10673PostgreSQL for Oracle DBAs2010-05-03T17:39:45Z<p>Schmiddy: /* REDO and Archiving */</p>
<hr />
<div>= Introduction =<br />
<br />
The following article contains information to help an Oracle DBA understand<br />
some terms and the management of a PostgreSQL database. This article is<br />
intended to be an introduction to PostgreSQL, not a tutorial or a complete<br />
definition of how to administer a PostgreSQL database. For complete<br />
documentation refer to the [http://www.postgresql.org/docs/manuals/ PostgreSQL manuals].<br />
<br />
= Oracle =<br />
<br />
== Brief description: ==<br />
<br />
* An Oracle database server consists of an Oracle instance and an Oracle database.<br />
* An Oracle instance consists of the Oracle background processes and the allocated memory within the shared global area (SGA) and the program global area (PGA).<br />
* The Oracle background processes consist of the following:<br />
** Database Writer Process (DBWn)<br />
** Log Writer Process (LGWR)<br />
** Checkpoint Process (CKPT)<br />
** System Monitor Process (SMON)<br />
** Process Monitor Process (PMON)<br />
** Recoverer Process (RECO)<br />
** Archiver Processes (ARCn)<br />
* An Oracle database consists of the database datafiles, control files, redo log files, archive log files, and parameter file.<br />
* To remotely access an Oracle database, there exists a separate process referred to as the Oracle listener.<br />
* In the Dedicated Server configuration (versus the Shared Server configuration) every established database session has its own process executing on the server.<br />
<br />
To keep things simple any comparisons with an Oracle database will always refer to a single instance managing a single database, RAC and Data Guard will not be mentioned. Note: PostgreSQL also has the concept of a warm standby (since 8.2) with the shipping of archive logs (introduced in 8.0).<br />
<br />
= PostgreSQL =<br />
<br />
== Database Server Processes ==<br />
<br />
All of the server processes are instances of a single database server program, ''postgres''. There are no separately named processes as in Oracle for the different duties within the database environment. If you were to look at the process list (ps), the name of each process would be postgres. However, on most platforms, PostgreSQL modifies its command title so that individual server processes can readily be identified. You may need to adjust the parameters used for commands such as ps and top to show these updated titles in place of the process name ("postgres").<br />
<br />
The processes seen in a process list can be some of the following:<br />
<br />
* Master process - launches the other processes, background and session processes.<br />
* Writer process - background process that coordinates database writes, log writes and checkpoints.<br />
* Stats collector process - background process collecting information about server activity.<br />
* User session processes.<br />
<br />
The server processes communicate with each other using semaphores and shared memory to ensure data integrity throughout concurrent data access.<br />
<br />
== PostgreSQL Database Cluster ==<br />
<br />
Within a server, one or more Oracle instances can be built. The databases are separate from one another, usually sharing only the Oracle listener process. PostgreSQL has the concept of a ''database cluster''. A database cluster is a collection of databases that is stored at a common file system location (the "data area"). It is possible to have multiple database clusters, so long as they use different data areas and different communication ports.<br />
<br />
The processes along with the file system components are all shared within the database cluster. All the data needed for a database cluster is stored within the cluster's data directory, commonly referred to as ''PGDATA'' (after the name of the environment variable that can be used to define it). The PGDATA directory contains several subdirectories and configuration files.<br />
<br />
The following are some of the cluster configuration files:<br />
<br />
* postgresql.conf - Parameter or main server configuration file.<br />
* pg_hba.conf - Client authentication configuration file.<br />
* pg_ident.conf - Map from OS account to PostgreSQL account file.<br />
<br />
The cluster subdirectories:<br />
<br />
* base - Subdirectory containing per-database subdirectories<br />
* global - Subdirectory containing cluster-wide tables<br />
** pg_auth - Authorization file containing user and role definitions.<br />
** pg_control - Control file.<br />
** pg_database - Information of databases within the cluster.<br />
* pg_clog - Subdirectory containing transaction commit status data<br />
* pg_multixact - Subdirectory containing multitransaction status data (used for shared row locks)<br />
* pg_subtrans - Subdirectory containing subtransaction status data<br />
* pg_tblspc - Subdirectory containing symbolic links to tablespaces<br />
* pg_twophase - Subdirectory containing state files for prepared transactions<br />
* pg_xlog - Subdirectory containing WAL (Write Ahead Log) files<br />
<br />
By default, for each database in the cluster there is a subdirectory within PGDATA/base, named after the database's OID (object identifier) in pg_database. This subdirectory is the default location for the database's files; in particular, its system catalogs are stored there. Each table and index is stored in a separate file, named after the table or index's filenode number, which can be found in pg_class.relfilenode.<br />
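For instance, the mapping from a table to its file on disk can be followed with two catalog queries (the database and table names below are examples):<br />
<br />
```sql
-- Find the database's OID and the table's filenode:
SELECT oid FROM pg_database WHERE datname = 'mydb';          -- say, 16384
SELECT relfilenode FROM pg_class WHERE relname = 'mytable';  -- say, 16395
-- The table's first data file would then be PGDATA/base/16384/16395
```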
<br />
Several components that Oracle DBAs usually equate to one database are shared between databases within a PostgreSQL cluster, including the parameter file, control file, redo logs, tablespaces, accounts, roles, and background processes.<br />
<br />
== Tablespaces and Object Data Files ==<br />
<br />
PostgreSQL introduced tablespace management in version 8.0. The physical representation of a tablespace within PostgreSQL is simple: it is a directory on the file system, and the mapping is done via symbolic links.<br />
<br />
When a database is created, the default tablespace is where all of its objects are stored by default. In Oracle this would be similar to the System, User, and Temporary tablespaces. If no default tablespace is defined during creation, the data files will go into a subdirectory of PGDATA/base. Preferably, the system catalog information and the application data structures would reside in separately managed tablespaces, and PostgreSQL makes this possible.<br />
<br />
As in Oracle, the definition of a PostgreSQL table determines which tablespace the object resides in. However, there exists no size limitation other than the physical boundaries placed on the device by the OS.<br />
<br />
The individual table's data is stored within a file within the tablespace (or directory). The database software will split the table across multiple datafiles in the event the table's data surpasses 1 GB.<br />
<br />
Since version 8.1, it's possible to partition a table over separate (or the same) tablespaces. This is based on PostgreSQL's table inheritance feature, using a capability of the query planner referred to as constraint exclusion.<br />
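A minimal sketch of such an inheritance-based partition (all table and column names here are hypothetical) might look like this:<br />
<br />
```sql
-- Parent table plus one child per year, carved out via CHECK constraints:
CREATE TABLE measurements (logdate date, reading integer);
CREATE TABLE measurements_2010 (
    CHECK (logdate >= DATE '2010-01-01' AND logdate < DATE '2011-01-01')
) INHERITS (measurements);
-- With constraint exclusion enabled, the planner can skip child tables
-- whose CHECK constraints rule out the query's WHERE clause:
SET constraint_exclusion = on;
SELECT * FROM measurements WHERE logdate = DATE '2010-06-15';
```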
<br />
There exists no capacity for separating out specific columns (like LOBs) into separately defined tablespaces. However, in addition to the data files that represent the table (in multiples of 1 GB) there is a separation of data files for columns within a table that are TOASTed. The PostgreSQL storage system called TOAST (The Oversized-Attribute Storage Technique) automatically stores values larger than a single database page into a secondary storage area per table. The TOAST technique allows for data columns up to 1 GB in size.<br />
<br />
As in Oracle, the definition of an index determines which tablespace it resides within. It is therefore possible to gain a performance advantage by placing a table's data and its indexes on separate disks, relieving I/O contention during data manipulation.<br />
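Hedged example (the paths and object names are made up): separating a table from its index across disks is just a matter of two tablespaces:<br />
<br />
```sql
CREATE TABLESPACE datadisk  LOCATION '/mnt/disk1/pg_tblspc';
CREATE TABLESPACE indexdisk LOCATION '/mnt/disk2/pg_tblspc';
CREATE TABLE orders (id integer, placed timestamp) TABLESPACE datadisk;
CREATE INDEX orders_id_idx ON orders (id) TABLESPACE indexdisk;
```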
<br />
In Oracle there exist temporary tablespaces where sort information and the temporary evaluation space needed for DISTINCT statements and the like are kept. PostgreSQL does not have this concept of a temporary tablespace; however, it does require storage to perform these activities as well. Within the "default" tablespace of the database (defined at database creation) there is a directory called pgsql_tmp, which holds the temporary storage needed for such evaluation. The files created within this directory exist only while the SQL statement is executing. They grow very fast and are most likely designed for speed rather than space efficiency. Be aware that disk fragmentation could result from this, and there needs to be sufficient space on the disk to support the user queries. Since release 8.3, temporary tablespaces can be defined using the parameter ''temp_tablespaces''.<br />
<br />
== REDO and Archiving ==<br />
<br />
PostgreSQL uses ''Write-Ahead Logging'' (WAL) as its approach to transaction logging. WAL's central concept is that changes to data files (where tables and indexes reside) must be written only after those changes have been logged, that is, when log records describing the changes have been flushed to permanent storage. If we follow this procedure, we do not need to flush data pages to disk on every transaction commit, because we know that in the event of a crash we will be able to recover the database using the log: any changes that have not been applied to the data pages can be redone from the log records. (This is roll-forward recovery, also known as REDO.)<br />
<br />
PostgreSQL maintains its WAL in the ''pg_xlog'' subdirectory of the cluster's data directory.<br />
<br />
WAL was introduced into PostgreSQL in version 7.1. To maintain database consistency in case of a failure, previous releases forced all data modifications to disk before each transaction commit. With WAL, only one log file must be flushed to disk, greatly improving performance while adding capabilities like Point-In-Time Recovery and transaction archiving.<br />
<br />
A PostgreSQL system theoretically produces an indefinitely long sequence of WAL records. The system physically divides this sequence into WAL segment files, which are normally 16MB apiece. The system normally creates a few segment files and then "recycles" them by renaming no-longer-needed segment files to higher segment numbers. If you were to perform a listing of the pg_xlog directory there would always be a handful of files changing names over time.<br />
<br />
To add archiving of the WAL files, there exists a parameter within the parameter file where a command is specified to execute the archival process. Once this is done, operating-system "on-line" backups become available by bracketing a file-system copy of the data directory with the ''pg_start_backup'' and ''pg_stop_backup'' commands; transactions continue to be written to the WAL files (and archived) during the copy, so any changes made while the backup is in progress can be replayed at recovery time.<br />
<br />
Inclusion of WAL archiving and the on-line backup commands were added in version 8.0.<br />
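A rough sketch of the pieces involved (the archive and backup paths are examples, not recommendations, and this assumes a live cluster with $PGDATA set):<br />
<br />
```shell
# Assuming postgresql.conf already contains something like:
#   archive_command = 'cp %p /mnt/archive/%f'
psql -c "SELECT pg_start_backup('nightly');"
tar czf /mnt/backup/pgdata.tar.gz "$PGDATA"
psql -c "SELECT pg_stop_backup();"
```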
<br />
== Rollback or Undo ==<br />
<br />
It is interesting how dynamic allocation of disk space is used for the storage and processing of records within tables. The files that represent a table grow as the table grows; they also grow with the transactions that are performed against it. In Oracle there is a concept of rollback or undo segments that hold the information for rolling back a transaction. In PostgreSQL that data is stored within the file that represents the table, so when deletes and updates are performed on a table, the file that represents the object will still contain the previous data. This space gets reused, but to force recovery of used space, a maintenance process called ''vacuum'' must be executed.<br />
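For example, space left behind by UPDATEs and DELETEs can be reclaimed like this (the table name is illustrative):<br />
<br />
```sql
VACUUM VERBOSE mytable;   -- reclaim space, reporting what was done
VACUUM ANALYZE mytable;   -- reclaim space and refresh planner statistics
```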
<br />
== Server Log File ==<br />
<br />
Oracle has the alert log file; PostgreSQL has the server log file. A configuration option can even make the connection information we normally see in Oracle's listener.log appear in PostgreSQL's server log. The parameters within the server configuration file (postgresql.conf) determine the level, location, and name of the log file.<br />
<br />
To help with the maintenance of the server log file (it grows rapidly), there exists functionality for rotating the server log file. Parameters can be set to determine when to rotate the file based on the size or age of the file. Management of the old files is then left to the administrator.<br />
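A hedged postgresql.conf fragment for rotation might look like this (the values are examples only; before 8.3 the collector parameter was called ''redirect_stderr''):<br />
<br />
```
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_age = 1d
log_rotation_size = 100MB
log_connections = on      # listener.log-style connection lines
```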
<br />
== Applications ==<br />
<br />
The command ''initdb'' creates a new PostgreSQL database cluster.<br />
<br />
The command ''psql'' starts the terminal-based front-end to PostgreSQL, its SQL command prompt. Queries and commands can be executed interactively or from files. The psql command prompt has several attractive features:<br />
<br />
* Thorough on-line help for both the psql commands and the SQL syntax.<br />
* Command history and line editing.<br />
* SQL commands can span multiple lines and are executed only after the semi-colon (;).<br />
* Several SQL commands separated by semi-colons can be entered on a single line.<br />
* Flexible output formatting.<br />
* Multiple object description commands that are superior to Oracle's DESCRIBE.<br />
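A short illustrative session (the database and table names are invented; \h shows syntax help for a SQL command and \d describes a table):<br />
<br />
```
$ psql mydb
mydb=# \h CREATE INDEX
mydb=# \d mytable
mydb=# SELECT count(*)
mydb-# FROM mytable;
mydb=# \q
```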
<br />
Depending on the security configuration of the environment, connections can be established locally or remotely through TCP/IP, and passwords may or may not be required to connect.<br />
<br />
The command ''pg_ctl'' is a utility for displaying status, starting, stopping, or restarting the PostgreSQL database server (postgres). Although the server can be started through the postgres executable, pg_ctl encapsulates tasks such as redirecting log output, properly detaching from the terminal and process group, and providing options for controlled shutdown.<br />
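Typical invocations look like this (the data directory path is an example):<br />
<br />
```shell
pg_ctl -D /usr/local/pgsql/data start
pg_ctl -D /usr/local/pgsql/data status
pg_ctl -D /usr/local/pgsql/data restart -m fast
pg_ctl -D /usr/local/pgsql/data stop -m smart
```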
<br />
The commands ''pg_dump'' and ''pg_restore'' are utilities designed for exporting and importing the contents of a PostgreSQL database. Dumps can be output in either script or archive file formats. The script file format creates plain-text files containing the SQL commands required to reconstruct the database to the state it was at the time it was generated. The archive file format creates a file to be used with pg_restore to rebuild the database.<br />
<br />
The archive file formats are designed to be portable across architectures. Historically, any type of upgrade to the PostgreSQL software would require a pg_dump of the database prior to the upgrade, followed by a pg_restore afterwards. Now, for minor releases (i.e., the third decimal – 8.2.x) upgrades can be done in place. However, changing versions at the first or second decimal still requires a pg_dump/pg_restore.<br />
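For instance (the database and file names are examples):<br />
<br />
```shell
pg_dump mydb > mydb.sql          # plain-text SQL script format
pg_dump -Fc mydb -f mydb.dump    # custom archive format
psql -d newdb -f mydb.sql        # replay the script, or:
pg_restore -d newdb mydb.dump    # rebuild from the archive
```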
<br />
There exists a graphical tool called [http://www.pgadmin.org/ ''pgAdmin III''] developed separately. It is distributed with the Linux and Windows versions of PostgreSQL. Connection to a database server can be established remotely to perform administrative duties. Because the tool is designed to manage all aspects of the database environment, connection to the database must be through a super user account.<br />
<br />
pgAdmin III offers the following notable features:<br />
<br />
* Intuitive layout<br />
* Tree structure for creating and modifying database objects<br />
* Reviewing and saving of SQL when altering or creating objects<br />
<br />
[[Category:Oracle]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Warm_Standby&diff=10563Warm Standby2010-04-20T03:34:11Z<p>Schmiddy: copyediting: use 'standbyhost' consistently, 2ndQ section of "known bugs" no longer exists?</p>
<hr />
<div>__NOTOC__<br />
There are a couple of projects available to help you set up a warm standby system: <br />
<br />
* Use the walmgr.py portion of Skype's [https://developer.skype.com/SkypeGarage/DbProjects/SkyTools SkyTools] package which will handle PITR backups from a primary to a single slave<br />
* Utilize Command Prompt's [https://projects.commandprompt.com/public/pitrtools PITR tools] to set everything up<br />
<br />
Getting a warm standby up manually is actually a pretty simple process. The following are notes only, intended to aid your understanding. If you want to get this working correctly, please follow the manual, which is comprehensive and accurately maintained.<br />
<br />
[http://www.postgresql.org/docs/current/static/warm-standby.html Warm Standby Manual]<br />
<br />
== Pre-process recommendations ==<br />
*Use [http://www.postgresql.org/docs/current/static/pgstandby.html pg_standby] for your restore_command in the recovery.conf file on the standby. pg_standby is included in PostgreSQL 8.3, and you can copy the source from there to compile it for 8.2 yourself. It isn't compatible with 8.1. <br />
*Set up your standby host's environment and directory structure exactly the same as your primary. Otherwise you'll need to spend time changing any symlinks you've created on the primary for xlogs, tablespaces, or whatnot, which is really just an opportunity for error.<br />
*Pre-configure both the postgresql.conf and recovery.conf files for your standby. I usually keep all of my different config files for all of my different servers in a single, version-controlled directory that I can then check out and symlink to. Again, consistent environment & directory setups make symlinks your best friend.<br />
*Use ssh keys to transfer files between hosts simply and safely.<br />
*Follow all of the advice in the manual with respect to handling errors.<br />
<br />
== Outline of steps to get warm standby working ==<br />
*Set archive_command in your postgresql.conf; rsync is a popular choice, or you can just use one of the examples from the docs. I use: <br />
<code><pre><br />
rsync -a %p postgres@standbyhost:/path/to/wal_archive/%f<br />
</pre></code><br />
**You must use a command here that does atomic copies, meaning that the file will never appear under the destination filename until it has been completely copied over. This keeps the standby server from trying to read a partial file. rsync is known to work. A notable command that isn't atomic is scp. If you want to use scp for this purpose, you will need to transfer files into another directory on the secondary, then move them to where the restore command looks for them after the transfer is complete.<br />
***If you're using pg_standby, it will refuse to apply files unless they are the right length, which lowers the risk of non-atomic copies being applied. On Windows it even sleeps a bit after that to give time for things to settle. Performing the copy non-atomically is still a bad idea that you should avoid.<br />
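The staging-then-rename workaround described above can be sketched locally; temporary directories stand in for the staging and archive directories on the standby, and a plain cp stands in for the scp transfer:

```shell
# Simulate the scp workaround: copy into a staging dir, then mv into the
# directory the restore command watches. mv is atomic within one filesystem,
# so the archive never exposes a partially written file.
set -e
staging=$(mktemp -d)   # stand-in for the staging directory on the standby
archive=$(mktemp -d)   # stand-in for /path/to/wal_archive on the standby
src=$(mktemp)
echo "pretend WAL contents" > "$src"

wal=000000010000000000000001
cp "$src" "$staging/$wal"            # in reality: scp %p standby:staging/%f
mv "$staging/$wal" "$archive/$wal"   # the file appears in the archive complete
```

All paths and the segment name are illustrative; a real archive_command would substitute %p and %f.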
*Reload your config -- either: SELECT pg_reload_conf(); from psql or: pg_ctl reload -D data_dir/<br />
*Verify that the WALs are being shipped to their destination.<br />
*In psql, SELECT pg_start_backup('some_label');<br />
*Run your base backup. Again, rsync is good for this with something as simple as: <br />
<code><pre><br />
rsync -a --progress /path/to/data_dir/* postgres@standbyhost:/path/to/data_dir/ <br />
</pre></code><br />
*I'd suggest running this in a screen term window, the --progress flag will let you watch to see how far along the rsync is. The -a flag will preserve symlinks as well as all file permissions & ownership.<br />
*In psql, SELECT pg_stop_backup();<br />
**This drops a file to be archived that will have the same name as the first WAL shipped after the call to pg_start_backup() with a .backup suffix. Inside will be the start & stop WAL records defining the range of WAL files needed to be replayed before you can consider bringing the standby out of recovery.<br />
*Drop in, or symlink, your recovery.conf file in the standby's data_dir.<br />
**The restore command should use pg_standby (its help/README is simple and to the point). I'd recommend redirecting all output from pg_standby to a log file that you can then watch to verify that everything is working correctly once you've started things.<br />
*Drop in, or symlink, your standby's postgresql.conf file.<br />
**If you don't symlink your pg_xlog directory to write WALs to a separate drive, you can safely delete everything under data_dir/pg_xlog on the standby host.<br />
*Start the standby db server with a normal: pg_ctl start -D /path/to/data_dir/<br />
*Run a: tail -f on your standby log and watch to make sure that it's replaying logs. If everything's cool you'll see some info on each WAL file, in order, that the standby looks for, along with 'success' messages. If it can't find the files for some reason, you'll see repeated messages like: 'WAL file not present yet. Checking for trigger file...' (assuming you set up pg_standby to look for a trigger file in your restore_command).<br />
<br />
Execute this entire process at least a couple of times before setting it up for real and leaving it running indefinitely: bring the standby up into normal operations mode once it has played through all of the necessary WAL files (as noted in the .backup file), then connect to it and verify that everything looks good. Once you've done it a couple of times, it becomes dirt simple. <br />
<br />
== Adjusting frequency of WAL updates in 8.1 ==<br />
<br />
Often people want to know that their secondary is never more than some amount behind the primary. The archive_timeout feature introduced into 8.2 allows doing that. If you're using WAL replication with 8.1, you can force 16MB worth of WAL activity that doesn't leave any changes behind with a hack like this:<br />
<br />
<code><pre><br />
create table xlog_switch as<br />
select '0123456789ABCDE' from generate_series(1,1000000);<br />
drop table xlog_switch;<br />
</pre></code><br />
<br />
If you put that into cron to run via psql, you can make the window for log shipping as fine as you'd like even with no activity. <br />
Doing it too often increases the odds that it will interfere with real transactions, though, and it will use up more disk space; every couple of minutes is probably as often as you'd want to do this. Using archive_timeout doesn't have this issue; the manual suggests it can be set to only a few seconds if necessary.<br />
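For example, a crontab entry on the primary could run the switch hack every five minutes (the database name "mydb" and the reliance on default psql connection settings are illustrative assumptions):

```
# crontab on the primary (illustrative): force a WAL switch every 5 minutes
*/5 * * * * psql -d mydb -c "create table xlog_switch as select '0123456789ABCDE' from generate_series(1,1000000); drop table xlog_switch;" >/dev/null
```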
<br />
== Additional resources ==<br />
*[http://www.kennygorman.com/wordpress/?p=249 pg_standby lag monitoring]<br />
*[http://scale-out-blog.blogspot.com/2009/02/simple-ha-with-postgresql-point-in-time.html Simple HA with PITR]<br />
*[http://www.travishegner.com/2009/06/postgresql-83-warm-stand-by-replication.html PostgreSQL 8.3 Warm Stand-by Replication]: tutorial with Ubuntu specifics<br />
*[http://michsan.blogspot.com/2008/08/using-pgstandby-for-high-availability.html Using pg_standby for high availability of Postgresql]: tutorial that covers Debian, using 8.3 pg_standby on 8.2<br />
*Source material: <br />
** [http://archives.postgresql.org/pgsql-general/2008-01/msg01587.php warm standby examples] <br />
** [http://archives.postgresql.org/sydpug/2006-10/msg00001.php Creating an 8.2 warm-standby demo system]<br />
** [http://archives.postgresql.org/pgsql-general/2007-06/msg00015.php PITR Base Backup on an idle 8.1 server]<br />
<br />
[[Category:Replication]][[Category:Backup]]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Query_progress_indication&diff=9709Query progress indication2010-01-29T20:18:44Z<p>Schmiddy: simple pg_stat_activity check for whether query is waiting on lock?</p>
<hr />
<div>Postgres currently doesn't give any meaningful feedback about the query execution process to the user. This would be valuable for:<br />
# reporting whether the execution of a query is blocked on a lock; many users in #postgresql are confused about why their query takes so long to execute, when in fact it is blocked on a lock -- ''Isn't this easily gleaned from the "waiting" boolean in pg_stat_activity?''<br />
# reporting the progress of a long-running analysis query. When interactively executing complex, long-running queries, providing feedback about how long they will take to execute (and approximate answers in the meantime) would be cool<br />
# reporting the progress of long-running utility queries. For example, it can be difficult to predict whether the runtime of a large CREATE INDEX will be minutes or hours; similarly, the progress of manual VACUUMs can be difficult to estimate accurately<br />
<br />
This has been discussed in the DBMS literature (see below). Offhand, I think implementing #1 and #3 would be pretty doable (perhaps with FE/BE changes), but #2 would take some Thought.<br />
<br />
== References ==<br />
Relevant papers:<br />
* [http://www.cs.toronto.edu/~cmishra/ProgressBars.pdf A Lightweight Online Framework For Query Progress Indicators]<br />
* [http://www.cs.wisc.edu/~gangluo/workload_final.pdf Multi-query SQL Progress Indicators]<br />
* [http://www.cs.wisc.edu/~gangluo/interface.pdf Toward a Progress Indicator for Database Queries]<br />
* [http://www.cs.wisc.edu/~gangluo/PI.pdf Increasing the Accuracy and Coverage of SQL Progress Indicators]<br />
<br />
Relevant pgsql-hackers threads:<br />
* [http://archives.postgresql.org/pgsql-hackers/2006-07/msg00719.php Progress bar updates]</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Installation_and_Administration_Best_practices&diff=8173Installation and Administration Best practices2009-09-25T18:23:06Z<p>Schmiddy: /* Directories Location Recommended */ cleaning up this whole section</p>
<hr />
<div>=== Proposal ===<br />
<br />
This page is under construction.<br />
<br />
This page will contain information about the best way to install and maintain a PostgreSQL database, including environment variables, paths and other relevant topics.<br />
<br />
Feel free to add your suggestions and knowledge to make it a 'hands on' resource.<br />
<br />
=== Self compiled vs. package distributed ===<br />
<br />
Before installing, consider whether to use packages distributed with your operating system or whether to roll your own. (Compiling PostgreSQL on Windows might be a difficult task. It's quite easy on Linux/Unix boxes.)<br />
<br />
Let's compare using a pre-built package and compiling PostgreSQL yourself. (Note: This is a rather Linux-centric view. For Windows, you'll likely want to use the binary packages provided for each release.)<br />
<br />
{| border="1" cellpadding="5" cellspacing="0"<br />
!Using pre-built package from distribution<br />
!Compiling yourself.<br />
|-<br />
|Very easy to install - just use your package manager.<br />
|You might need to install gcc and some development packages just for building PostgreSQL.<br />
|-<br />
|Installation is dependent on distribution (location of config files, initial tablespace).<br />
|You may install everything in one place, just where you want it.<br />
|-<br />
|Startup scripts are included and are supposed to work.<br />
|You need to provide your own system startup scripts.<br />
|-<br />
|The packages might be out of date or new minor versions might not become available frequently.<br />
|You are free to use the latest stable version and perform upgrades at your will.<br />
|-<br />
|The package management knows about the PostgreSQL installation and will update it.<br />
|Your package management doesn't know anything about the installation. Dependent libraries might get uninstalled or replaced by newer, incompatible versions. ('''Note:''' This is rather unlikely. I've never seen it happen. PostgreSQL doesn't depend on any strange or fast-evolving packages.)<br />
|}<br />
<br />
=== Compiling and installing in Solaris ===<br />
<br />
TODO: Add a workaround for the most common issues.<br />
<br />
<br />
== Compiling in Solaris using Sun Studio 12 ==<br />
<br />
<code><br />
./configure --prefix=/usr/local/pgsql84 CC=/opt/SUNWspro/bin/cc 'CFLAGS=-xO3 -xarch=native -xspace -W0,-Lt -W2,-Rcond_elim -Xa -xildoff -xc99=none -xCC' --datadir=/usr/local/pgsql84/data84 --enable-dtrace --enable-cassert --with-perl --with-python --with-libxml --with-libxslt --with-ossp-uuid --without-readline<br />
</code><br />
<br />
Notes: <br />
* OpenSolaris doesn't ship a readline library<br />
* You may not need the --enable-cassert option.<br />
<br />
== Issues ==<br />
<br />
UUID: There is a problem with the UUID library in Open Solaris 200911. If someone has a workaround for this, please post it.<br />
<br />
=== Multiple Versions on the same host ===<br />
<br />
If you have to install multiple PostgreSQL versions at the same host, compile from source and call configure like this:<br />
<br />
./configure --prefix=/opt/postgresql-8.2.11 --with-pgport=8200<br />
<br />
That way, you never need to worry about which version you are talking to - you just look at the port number.<br />
<br />
Another way is to change the port in postgresql.conf. Beware that if you have your own init script, you must remember to change the values of PGDATA and PGUSER.<br />
<br />
=== Making sure it starts up at system boot time ===<br />
<br />
TODO: Provide a default init script (if there's not already one in contrib/).<br />
<br />
<br />
=== Recommended values to be changed in big servers ===<br />
<br />
In Linux, the SHMMAX value can often be set rather low, especially in older 32-bit distributions. Depending on your PostgreSQL configuration, you might need to tweak the values of SHMMAX and/or SHMALL.<br />
<br />
Example of a high configuration:<br />
<code><br />
# /etc/sysctl.conf<br />
fs.file-max = 32768<br />
kernel.shmmax = 1073741824<br />
kernel.shmall = 536870912<br />
</code><br />
<br />
How do you calculate the required value? One rough formula, from the older PostgreSQL documentation, estimates the server's total shared memory request, which SHMMAX must be able to satisfy:<br />
<br />
<code><br />
250 kB + 8.2 kB * shared_buffers + 14.2 kB * max_connections<br />
</code><br />
<br />
The SHMMAX variable [http://ps-ax.com/shared-mem.html controls] the maximum amount of memory to be allocated for shared memory use. If you try to assign high values for e.g. the shared_buffers GUC in PostgreSQL without adjusting SHMMAX, you might see an [http://www.caktusgroup.com/blog/2009/08/13/setting-postgresqls-shmmax-in-mac-os-x-105-leopard/ error message] in Postgres' log like " ... Failed system call was shmget ... usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter", and you'll have to adjust SHMMAX upward accordingly.<br />
<br />
More information about SHMMAX [http://vista.intersystems.com/csp/docbook/DocBook.UI.Page.cls?KEY=GCI_unixparms can be found here].<br />
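As a sanity check, the formula above can be evaluated in a small shell snippet; the shared_buffers and max_connections values below are illustrative, not recommendations:

```shell
# Estimate PostgreSQL's shared memory request (in kB) from the rough formula
# above, then compare against the kernel limit. Values are illustrative.
shared_buffers=4096      # number of shared buffers (a count, not bytes)
max_connections=100

needed_kb=$(awk -v sb="$shared_buffers" -v mc="$max_connections" \
    'BEGIN { printf "%d", 250 + 8.2 * sb + 14.2 * mc }')
echo "estimated shared memory request: ${needed_kb} kB"
# Compare with the current Linux limit (in bytes): cat /proc/sys/kernel/shmmax
```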
<br />
=== Recommended Directory Locations ===<br />
==== WAL Directory ====<br />
Write Ahead Logging (WAL) is a critical part of Postgres' operation. Due to Postgres' [http://www.postgresql.org/docs/current/static/wal-intro.html use of the WAL] to ensure data consistency, the WAL will receive significant I/O activity. Especially if your PGDATA directory isn't located on a capable RAID array, you might consider relocating the WAL (pg_xlog directory) to a separate disk to ease I/O load on the rest of the database. The WAL should be [http://archives.postgresql.org/pgsql-novice/2003-09/msg00259.php written in perfectly sequential fashion], so the I/O savings from relocating this directory can be substantial. However, be aware of additional [http://archives.postgresql.org/pgsql-sql/2006-12/msg00039.php data consistency risks] you will take on by relocating the WAL. <br />
<br />
Configuration: by default, the configuration files live inside PGDATA. If your PGDATA is in an unusual location, you may want to keep configuration under /etc/postgresql/data<number>; if you run several versions for testing purposes, you may prefer a Debian-like tree (/etc/postgresql/<version>/<pgdata_number>).<br />
<br />
See [http://www.postgresql.org/files/documentation/books/aw_pgsql/hw_performance/node10.html here] for information on offloading various PostgreSQL data onto different drives.<br />
<br />
=== Versioning sql scripts and configuration files ===<br />
<br />
Just as you version your SQL scripts, another best practice is to version the configuration files. This means more than simple versioning: remember that you have several environments (Development, Test, Production).<br />
<br />
Another good practice is to keep DBA modifications versioned in a separate script (SET STORAGE modifications, special indexes and rules, etc.).<br />
<br />
TODO: Paste an example.<br />
<br />
<br />
=== Backup and Recovery strategies ===<br />
<br />
TODO: How to perform those tasks.<br />
<br />
<br />
=== User authentication ===<br />
<br />
For more information, read the [http://wiki.postgresql.org/wiki/Client_Authentication Client Authentication] article. <br />
<br />
One recommended setup for host-based access is the md5 method, which requires an encrypted password. A typical entry looks like this:<br />
<br />
<code><pre><br />
host all all 192.168.1.0/24 md5<br />
</pre></code><br />
<br />
This entry restricts access to the IP addresses included in 192.168.1.x, and requires the correct password for the user.<br />
In addition to this restriction, remember that you should set the [http://www.postgresql.org/docs/current/static/runtime-config-connection.html listen_addresses] variable in postgresql.conf.<br />
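For example, to accept connections on the loopback interface plus one LAN address, postgresql.conf would contain something like the following (the 192.168.1.10 address is illustrative):

```
# postgresql.conf (illustrative)
listen_addresses = 'localhost, 192.168.1.10'
port = 5432
```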
<br />
Remember that you can create users with the '''VALID UNTIL''' option. When you create a user you can calculate the timestamp with something like this:<br />
<br />
<code><br />
SELECT (CURRENT_DATE+1)::timestamp;<br />
</code><br />
<br />
Replace 1 with the number of days you want the user to remain enabled (7 = 1 week, 30 = a month, etc.), then copy the result into the ALTER or CREATE statement.<br />
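The same calculation can be done outside the database, e.g. with GNU date in the shell; the role name "appuser" below is a made-up example:

```shell
# Compute an expiry date N days out (GNU date) and emit the matching
# ALTER ROLE statement. "appuser" is a hypothetical role name.
days=7
expiry=$(date -d "+${days} days" +%F)
echo "ALTER ROLE appuser VALID UNTIL '${expiry}';"
```

The emitted statement can then be pasted into psql, just like the SELECT result above.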
<br />
Remember: before changing '''trust''' to a password-based method, you must first assign a password to the user. <br />
<br />
Changes to pg_hba.conf only require a '''reload''' signal, so no downtime is needed.<br />
<br />
You can create superusers for the databases, but remember that even if you want to restrict access to the databases, superusers still have permission to drop the other databases and perform other maintenance tasks. Try to keep the number of superusers to a minimum, ideally one.<br />
<br />
=== LDAP Auth ===<br />
<br />
TODO: explain the best ways to put this method to work<br />
[http://wiki.postgresql.org/wiki/Client_Authentication]<br />
<br />
=== Monitoring index and table access and use ===<br />
<br />
TODO: Explain how to monitor the use of tables and indexes in order to adjust how they are stored.</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Installation_and_Administration_Best_practices&diff=8168Installation and Administration Best practices2009-09-25T17:59:39Z<p>Schmiddy: /* Recommended values to be changed in big servers */ Adjusting awkward phrasing, improving links</p>
<hr />
<div>=== Proposal ===<br />
<br />
This page is under construction.<br />
<br />
This page will contain information about the best way to install and maintain a PostgreSQL database, including environment variables, paths and other relevant topics.<br />
<br />
Feel free to add your suggestions and knowledge to make it a 'hands on' resource.<br />
<br />
=== Self compiled vs. package distributed ===<br />
<br />
Before installing, consider whether to use packages distributed with your operating system or whether to roll your own. (Compiling PostgreSQL on Windows might be a difficult task. It's quite easy on Linux/Unix boxes.)<br />
<br />
Let's compare using a pre-built package and compiling PostgreSQL yourself. (Note: This is a rather Linux-centric view. For Windows, you'll likely want to use the binary packages provided for each release.)<br />
<br />
{| border="1" cellpadding="5" cellspacing="0"<br />
!Using pre-built package from distribution<br />
!Compiling yourself.<br />
|-<br />
|Very easy to install - just use your package manager.<br />
|You might need to install gcc and some development packages just for building PostgreSQL.<br />
|-<br />
|Installation is dependent on distribution (location of config files, initial tablespace).<br />
|You may install everything in one place, just where you want it.<br />
|-<br />
|Startup scripts are included and are supposed to work.<br />
|You need to provide your own system startup scripts.<br />
|-<br />
|The packages might be out of date or new minor versions might not become available frequently.<br />
|You are free to use the latest stable version and perform upgrades at your will.<br />
|-<br />
|The package management knows about the PostgreSQL installation and will update it.<br />
|Your package management doesn't know anything about the installation. Dependent libraries might get uninstalled or replaced by newer, incompatible versions. ('''Note:''' This is rather unlikely. I've never seen it happen. PostgreSQL doesn't depend on any strange or fast-evolving packages.)<br />
|}<br />
<br />
=== Compiling and installing in Solaris ===<br />
<br />
TODO: Add a workaround for the most common issues.<br />
<br />
<br />
== Compiling in Solaris using Sun Studio 12 ==<br />
<br />
<code><br />
./configure --prefix=/usr/local/pgsql84 CC=/opt/SUNWspro/bin/cc 'CFLAGS=-xO3 -xarch=native -xspace -W0,-Lt -W2,-Rcond_elim -Xa -xildoff -xc99=none -xCC' --datadir=/usr/local/pgsql84/data84 --enable-dtrace --enable-cassert --with-perl --with-python --with-libxml --with-libxslt --with-ossp-uuid --without-readline<br />
</code><br />
<br />
Notes: <br />
* OpenSolaris doesn't ship a readline library<br />
* You may not need the --enable-cassert option.<br />
<br />
== Issues ==<br />
<br />
UUID: There is a problem with the UUID library in Open Solaris 200911. If someone has a workaround for this, please post it.<br />
<br />
=== Multiple Versions on the same host ===<br />
<br />
If you have to install multiple PostgreSQL versions at the same host, compile from source and call configure like this:<br />
<br />
./configure --prefix=/opt/postgresql-8.2.11 --with-pgport=8200<br />
<br />
That way, you never need to worry about which version you are talking to - you just look at the port number.<br />
<br />
Another way is to change the port in postgresql.conf. Beware that if you have your own init script, you must remember to change the values of PGDATA and PGUSER.<br />
<br />
=== Making sure it starts up at system boot time ===<br />
<br />
TODO: Provide a default init script (if there's not already one in contrib/).<br />
<br />
<br />
=== Recommended values to be changed in big servers ===<br />
<br />
In Linux, the SHMMAX value can often be set rather low, especially in older 32-bit distributions. Depending on your PostgreSQL configuration, you might need to tweak the values of SHMMAX and/or SHMALL.<br />
<br />
Example of a high configuration:<br />
<code><br />
# /etc/sysctl.conf<br />
fs.file-max = 32768<br />
kernel.shmmax = 1073741824<br />
kernel.shmall = 536870912<br />
</code><br />
<br />
How do you calculate the required value? One rough formula, from the older PostgreSQL documentation, estimates the server's total shared memory request, which SHMMAX must be able to satisfy:<br />
<br />
<code><br />
250 kB + 8.2 kB * shared_buffers + 14.2 kB * max_connections<br />
</code><br />
<br />
The SHMMAX variable [http://ps-ax.com/shared-mem.html controls] the maximum amount of memory to be allocated for shared memory use. If you try to assign high values for e.g. the shared_buffers GUC in PostgreSQL without adjusting SHMMAX, you might see an [http://www.caktusgroup.com/blog/2009/08/13/setting-postgresqls-shmmax-in-mac-os-x-105-leopard/ error message] in Postgres' log like " ... Failed system call was shmget ... usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter", and you'll have to adjust SHMMAX upward accordingly.<br />
<br />
More information about SHMMAX [http://vista.intersystems.com/csp/docbook/DocBook.UI.Page.cls?KEY=GCI_unixparms can be found here].<br />
<br />
=== Recommended Directory Locations ===<br />
<br />
Logs: Sometimes logs are your best way to track down an issue, but they can also add overhead to your servers. If your RAID isn't powerful enough, you can put them on separate storage (a disk or something else). Usually, PostgreSQL logs live in the pg_log directory inside PGDATA. You can point to the separate storage with a soft link, or set an absolute path in postgresql.conf.<br />
<br />
WAL: The WAL is an important part of the database, so it needs the same level of attention. If you don't have RAID, the first thing to consider may be storing WAL files on separate disks. <br />
<br />
Configuration: by default, the configuration files live inside PGDATA. If your PGDATA is in an unusual location, you may want to keep configuration under /etc/postgresql/data<number>; if you run several versions for testing purposes, you may prefer a Debian-like tree (/etc/postgresql/<version>/<pgdata_number>).<br />
<br />
=== Versioning sql scripts and configuration files ===<br />
<br />
Just as you version your SQL scripts, another best practice is to version the configuration files. This means more than simple versioning: remember that you have several environments (Development, Test, Production).<br />
<br />
Another good practice is to keep DBA modifications versioned in a separate script (SET STORAGE modifications, special indexes and rules, etc.).<br />
<br />
TODO: Paste an example.<br />
<br />
<br />
=== Backup and Recovery strategies ===<br />
<br />
TODO: How to perform those tasks.<br />
<br />
<br />
=== User authentication ===<br />
<br />
For more information, read the [http://wiki.postgresql.org/wiki/Client_Authentication Client Authentication] article. <br />
<br />
One recommended setup for host-based access is the md5 method, which requires an encrypted password. A typical entry looks like this:<br />
<br />
<code><pre><br />
host all all 192.168.1.0/24 md5<br />
</pre></code><br />
<br />
This entry restricts access to the IP addresses included in 192.168.1.x, and requires the correct password for the user.<br />
In addition to this restriction, remember that you should set the [http://www.postgresql.org/docs/current/static/runtime-config-connection.html listen_addresses] variable in postgresql.conf.<br />
<br />
Remember that you can create users with the '''VALID UNTIL''' option. When you create a user you can calculate the timestamp with something like this:<br />
<br />
<code><br />
SELECT (CURRENT_DATE+1)::timestamp;<br />
</code><br />
<br />
Replace 1 with the number of days you want the user to remain enabled (7 = 1 week, 30 = a month, etc.), then copy the result into the ALTER or CREATE statement.<br />
<br />
Remember: before changing '''trust''' to a password-based method, you must first assign a password to the user. <br />
<br />
Changes to pg_hba.conf only require a '''reload''' signal, so no downtime is needed.<br />
<br />
You can create superusers for the databases, but remember that even if you want to restrict access to the databases, superusers still have permission to drop the other databases and perform other maintenance tasks. Try to keep the number of superusers to a minimum, ideally one.<br />
<br />
=== LDAP Auth ===<br />
<br />
TODO: explain the best ways to put this method to work<br />
[http://wiki.postgresql.org/wiki/Client_Authentication]<br />
<br />
=== Monitoring index and table access and use ===<br />
<br />
TODO: Explain how to monitor the use of tables and indexes in order to adjust how they are stored.</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Installation_and_Administration_Best_practices&diff=8167Installation and Administration Best practices2009-09-25T17:30:54Z<p>Schmiddy: /* Self compiled vs. package distributed */</p>
<hr />
<div>=== Proposal ===<br />
<br />
This page is under construction.<br />
<br />
This page will contain information about the best way to install and maintain a PostgreSQL database, including environment variables, paths and other relevant topics.<br />
<br />
Feel free to add your suggestions and knowledge to make it a 'hands on' resource.<br />
<br />
=== Self compiled vs. package distributed ===<br />
<br />
Before installing, consider whether to use packages distributed with your operating system or whether to roll your own. (Compiling PostgreSQL on Windows might be a difficult task. It's quite easy on Linux/Unix boxes.)<br />
<br />
Let's compare using a pre-built package and compiling PostgreSQL yourself. (Note: This is a rather Linux-centric view. For Windows, you'll likely want to use the binary packages provided for each release.)<br />
<br />
{| border="1" cellpadding="5" cellspacing="0"<br />
!Using pre-built package from distribution<br />
!Compiling yourself.<br />
|-<br />
|Very easy to install - just use your package manager.<br />
|You might need to install gcc and some development packages just for building PostgreSQL.<br />
|-<br />
|Installation is dependent on distribution (location of config files, initial tablespace).<br />
|You may install everything in one place, just where you want it.<br />
|-<br />
|Startup scripts are included and are supposed to work.<br />
|You need to provide your own system startup scripts.<br />
|-<br />
|The packages might be out of date or new minor versions might not become available frequently.<br />
|You are free to use the latest stable version and perform upgrades at your will.<br />
|-<br />
|The package management knows about the PostgreSQL installation and will update it.<br />
|Your package management doesn't know anything about the installation. Dependent libraries might get uninstalled or replaced by newer, incompatible versions. ('''Note:''' This is rather unlikely. I've never seen it happen. PostgreSQL doesn't depend on any strange or fast-evolving packages.)<br />
|}<br />
<br />
=== Compiling and installing in Solaris ===<br />
<br />
TODO: Add a workaround for the most common issues.<br />
<br />
<br />
== Compiling in Solaris using Sun Studio 12 ==<br />
<br />
<code><br />
./configure --prefix=/usr/local/pgsql84 CC=/opt/SUNWspro/bin/cc 'CFLAGS=-xO3 -xarch=native -xspace -W0,-Lt -W2,-Rcond_elim -Xa -xildoff -xc99=none -xCC' --datadir=/usr/local/pgsql84/data84 --enable-dtrace --enable-cassert --with-perl --with-python --with-libxml --with-libxslt --with-ossp-uuid --without-readline<br />
</code><br />
<br />
Notes: <br />
* OpenSolaris doesn't ship a readline library<br />
* You may not need the --enable-cassert option.<br />
<br />
== Issues ==<br />
<br />
UUID: There is a problem with the UUID library in Open Solaris 200911. If someone has a workaround for this, please post it.<br />
<br />
=== Multiple Versions on the same host ===<br />
<br />
If you have to install multiple PostgreSQL versions at the same host, compile from source and call configure like this:<br />
<br />
./configure --prefix=/opt/postgresql-8.2.11 --with-pgport=8200<br />
<br />
That way, you never need to worry about which version you are talking to - you just look at the port number.<br />
<br />
Another way is to change the port in postgresql.conf. Beware that if you have your own init script, you must remember to change the values of PGDATA and PGUSER.<br />
<br />
=== Making sure it starts up at system boot time ===<br />
<br />
TODO: Provide a default init script (if there's not already one in contrib/).<br />
<br />
<br />
=== Recommended values to be changed in big servers ===<br />
<br />
On Linux, SHMMAX defaults to a historically small value, so depending on your server, the first kernel parameters to change are SHMMAX and SHMALL.<br />
<br />
Example of a configuration for a large server:<br />
<code><br />
# /etc/sysctl.conf<br />
fs.file-max = 32768<br />
kernel.shmmax = 1073741824<br />
kernel.shmall = 536870912<br />
</code><br />
<br />
How do you calculate the required amount? One rough formula is:<br />
<code><br />
250 kB + 8.2 kB * shared_buffers + 14.2 kB * max_connections<br />
</code><br />
<br />
Why is this value important? Together with the related kernel parameters, it governs how much shared memory can be allocated. If you configure large values in PostgreSQL without raising these kernel settings, PostgreSQL may fail to start.<br />
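As a rough sketch, the formula above can be turned into a small shell calculator. The shared_buffers and max_connections values below are illustrative numbers from a postgresql.conf; the coefficients come straight from the formula (all sizes in kB):<br />

```shell
#!/bin/sh
# Rough shared-memory estimate using the formula above (all sizes in kB).
# shared_buffers and max_connections are illustrative postgresql.conf values.
shared_buffers=4096
max_connections=100
awk -v sb="$shared_buffers" -v mc="$max_connections" 'BEGIN {
    kb = 250 + 8.2 * sb + 14.2 * mc
    printf "approx. %.0f kB of shared memory needed\n", kb
}'
```

Compare the result against kernel.shmmax (which is expressed in bytes) before raising shared_buffers.<br />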
<br />
A page on configuring these UNIX kernel parameters can be found [http://vista.intersystems.com/csp/docbook/DocBook.UI.Page.cls?KEY=GCI_unixparms here.]<br />
<br />
=== Recommended directory locations ===<br />
<br />
Logs: Sometimes the logs are your best way to diagnose an issue, but writing them can add overhead to your server. If your RAID array is not fast enough, you can keep them on separate storage (a dedicated disk, for example). By default, PostgreSQL logs go to the pg_log directory inside PGDATA; you can replace that directory with a soft link to the separate storage, or configure an absolute path in postgresql.conf.<br />
<br />
WAL: The WAL is a critical part of the database, so it needs the same level of attention. If you don't have RAID, the first thing to consider is storing the WAL files on separate disks.<br />
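One common way to do this is to move the pg_xlog directory to the other disk and leave a soft link behind, with the server stopped. A sketch using throwaway directories (in practice PGDATA is your real data directory and the target is a mount point on the dedicated disk):<br />

```shell
#!/bin/sh
# Demo with temporary directories; substitute your real PGDATA and WAL disk,
# and stop the PostgreSQL server before moving anything.
PGDATA=$(mktemp -d)/data
WALDISK=$(mktemp -d)
mkdir -p "$PGDATA/pg_xlog"

# Relocate the WAL directory and point a symlink at the new location:
mv "$PGDATA/pg_xlog" "$WALDISK/pg_xlog"
ln -s "$WALDISK/pg_xlog" "$PGDATA/pg_xlog"

ls -ld "$PGDATA/pg_xlog"    # now a symlink to the dedicated disk
```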
<br />
Configuration: If your PGDATA directories live in very different locations, you may want an /etc/postgresql/data<number> scheme. Or, if you run several versions for test purposes, you may want a Debian-like tree (/etc/postgresql/<version>/<pgdata_number>). The default is to keep the configuration files inside PGDATA.<br />
<br />
=== Versioning sql scripts and configuration files ===<br />
<br />
Just as you version your SQL scripts, it is good practice to version the configuration files as well. Don't stop at simple versioning: remember that you have several environments (development, test, production).<br />
<br />
Another good practice is to keep the DBA's modifications in a separate versioned script (SET STORAGE modifications, special indexes and rules, etc.).<br />
<br />
TODO: Paste an example.<br />
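In the meantime, a minimal sketch of such a separate DBA script (the table, column, and index names are made up):<br />

```shell
#!/bin/sh
# Keep DBA-only changes in their own script, versioned next to the
# application's schema scripts. The SQL below uses hypothetical names.
cat > dba_changes.sql <<'EOF'
-- DBA tweaks, separate from application DDL
ALTER TABLE logs ALTER COLUMN payload SET STORAGE EXTERNAL;
CREATE INDEX logs_created_idx ON logs (created_at);
EOF
wc -l dba_changes.sql
```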
<br />
<br />
=== Backup and Recovery strategies ===<br />
<br />
TODO: How to perform those tasks.<br />
<br />
<br />
=== User authentication ===<br />
<br />
For more information, read the [http://wiki.postgresql.org/wiki/Client_Authentication Client Authentication] article. <br />
<br />
A recommended setup for host-based access is the md5 method, which uses encrypted passwords. An entry can look like this:<br />
<br />
<code><pre><br />
host all all 192.168.1.0/24 md5<br />
</pre></code><br />
<br />
This entry restricts access to the IP addresses in the 192.168.1.x range, and requires the correct password for the user.<br />
In addition to this restriction, remember that you must set the [http://www.postgresql.org/docs/current/static/runtime-config-connection.html listen_addresses] variable in postgresql.conf.<br />
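For example, a postgresql.conf fragment (the LAN address is illustrative; the default is 'localhost' only, and '*' would listen on every interface):<br />
<br />
<code><pre><br />
listen_addresses = 'localhost, 192.168.1.10'<br />
</pre></code><br />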
<br />
Remember that you can create users with the '''VALID UNTIL''' option. When you create a user, you can calculate the expiry timestamp with something like this:<br />
<br />
<code><br />
SELECT (CURRENT_DATE+1)::timestamp;<br />
</code><br />
<br />
You might replace 1 with the number of days you want the user to remain enabled (7 = one week, 30 = a month, etc.). Then copy the result into the ALTER or CREATE statement.<br />
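Since '''VALID UNTIL''' only accepts a literal timestamp, not an expression, the copy-paste step can also be scripted. A sketch assuming GNU date and a hypothetical role name:<br />

```shell
#!/bin/sh
# Compute an expiry date 7 days from now (GNU date syntax) and splice it
# into a CREATE USER statement. Role name and password are hypothetical.
EXPIRY=$(date -d '+7 days' '+%Y-%m-%d')
echo "CREATE USER temp_reporter PASSWORD 'changeme' VALID UNTIL '$EXPIRY';"
# To execute the generated statement, pipe it to psql, e.g.:
#   sh this_script.sh | psql -U postgres
```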
<br />
Remember: before changing '''trust''' to a password-based method, you must first assign a password to the user. <br />
<br />
Changes to pg_hba.conf only require a '''reload''' signal (for example, via pg_ctl reload), so no downtime is needed.<br />
<br />
You can create superusers for the databases, but remember that even if you restrict access to the databases, superusers still have permission to drop the other databases and perform other maintenance tasks. Try to keep the number of superusers as small as possible, ideally just one.<br />
<br />
=== LDAP Auth ===<br />
<br />
TODO: explain the best ways to put this method to work.<br />
[http://wiki.postgresql.org/wiki/Client_Authentication]<br />
<br />
=== Monitoring index and table access and use ===<br />
<br />
TODO: Explain how to monitor the use of tables and indexes in order to adjust how they are stored.</div>Schmiddyhttps://wiki.postgresql.org/index.php?title=Talk:Why_PostgreSQL_Instead_of_MySQL_2009&diff=8152Talk:Why PostgreSQL Instead of MySQL 20092009-09-23T20:19:08Z<p>Schmiddy: /* Notes for updates to include */ note on mysqldump</p>
<hr />
<div>==Notes for updates to include==<br />
<br />
* InnoDB doesn't even allow a consistent backup for free? http://krow.livejournal.com/594927.html<br />
** I haven't tried out the 3rd party "Hot Backup" tool mentioned on that page, but AIUI its main advantage (in addition to speed) over mysqldump seems to be in handling mixes of InnoDB + MyISAM tables, and dumping them together in a consistent state. Using the --single-transaction flag of [http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html#option_mysqldump_single-transaction mysqldump] should dump all '''InnoDB tables only''' in a single MySQL database in a consistent state, similar to pg_dump's operation.</div>Schmiddy