NewIn96

From PostgreSQL wiki
Jump to navigationJump to search

What's New in PostgreSQL 9.6

This page is a work in progress which will include details of PostgreSQL 9.6 features and changes. Many thanks to Thom Brown for assembling the original list.

Parallel Query

Parallel sequential scans

PostgreSQL can now execute a full table scan in multiple parallel processes, up to the limits set by the user.

Links:

parallel joins

Links:

parallel aggregates

Parallel Aggregates allow aggregate functions to be processed concurrently by multiple worker processes. This can significantly increase the response times for queries aggregating large amounts of tuples down to just a few aggregate groups, providing that the server has enough otherwise idle CPUs and would otherwise be bounded by the single CPU which the backend process is running on. Each worker process operates on a subset of tuples and creates a partially aggregated result to which it returns to the main backend process where each partial result will be combined with ones from the other worker processes and the final aggregate result produced.

Most internal aggregate functions support parallel mode. User defined aggregates will need to be changed to add a combine function, and possible a serialization and deserialization in the case of aggregate functions which return an INTERNAL state.

Links:

postgres_fdw

Sort pushdown

Links:

join pushdown

Links:

DML (UPDATE/DELETE) pushdown

Links:

Operator and function pushdown

Links:

Replication

New remote_apply replication mode which waits for confirmation that a standby has applied changes

Links:

Support for multiple synchronous standbys

Links:

pg_stat_wal_receiver

Links:

Replication slots can allocate WAL at creation

Links:

Text Search

Phrase full text search

Links:

Editable tsvector fields

Tsvector fields can now be modified in detail to fine-tune searchability.

Links:

Transactions, VACUUM and the Visibility Map

pg_visibility extension for examining visibility maps

Links:

Frozen page data in visibility map for skipping vacuum on already-frozen data

Links:

User-defined expiration of snapshots to control table bloat

Until now a long-running report or cursor displaying query results could block cleanup of dead rows, bloating all volatile tables in the database. The only options were to allow the bloat, suspend normal updates to other tables, or kill the long-running activity -- which might be completely unrelated to the bloating tables. Accumulation of bloat could cause performance problems and excessive use of storage space.

A new configuration option called old_snapshot_threshold allows the cluster to be configured to allow cleanup of a dead row when the transaction which updated or deleted it (causing it to be dead) has completed and all snapshots which can still see it have reached a certain age, without immediately terminating activities which are using such snapshots. If, for example, a large monthly report is running off of data in table etl_monthly (which is not currently changing), bloat in other tables will be limited, and the report can run to completion without error. If a snapshot has passed the threshold and a query uses that snapshot to attempt to read data from a page modified recently enough that it might not produce accurate results, then a "snapshot too old" error will be thrown. Normally the goal in configuring old_snapshot_threshold is to set it high enough that such an error is rarely seen, while preventing problems caused by bloat.

Links:

psql

\ev and \sv commands to edit views

Links:

Prompt variable to show PId of backend

Links:

Multiple -c and -f options

Crosstabs in psql

\crosstabview is a completely different way to display results from a query: instead of a vertical display of rows, the data values are placed in a grid where the column and row headers come from the data itself, similar to a spreadsheet.

SQL commands

ALTER TABLE ADD COLUMN IF NOT EXISTS

Links:

ALTER TABLE SET and its locks

Links:

COPY and DML statements (CTEs)

Links:


Performance and Monitoring

Detailed wait information in pg_stat_activity

Links:

Index-only scans for partial indexes

Links:

Performance improvements for external sort operations

Links:

System Views and Administration

New system view pg_config


pg_blocking_pids

Links:

Functions pg_get_* to return NULL on invalid objects

Up to 9.6, such functions could have easily bumped on "cache lookup" errors.

Links:

pg_notification_queue_usage to look at notify queue

Links:


Backups

pg_basebackup extended with replication slots

Links:


New API for hot physical backups

Prior to PostgreSQL 9.6, the only way to perform concurrent physical backups was through pg_basebackup, via the streaming replication protocol. Low-level file system copy was only available in an exclusive mode, by calling pg_start_backup(), initiating the copy of data files, then finally calling pg_stop_backup().

The limitation of taking one backup at a time has been overcome by a new native API which redefines the pg_start_backup() method and adds a variant to pg_stop_backup(). For back-compatibility, the older signatures have been kept.

The new pg_start_backup function now accepts a third optional parameter, called 'exclusive', which is set by default to 'true' for backward compatibility. Initiating a concurrent (or if you prefer, a non-exclusive) backup is rather simple:

SELECT pg_start_backup('my_label', false, false);

If you are not familiar with pg_start_backup(), the first argument is a label for the backup, while the second is a request for a fast checkpoint operation. The third parameter, when set to false, requests a concurrent backup:

postgres=# SELECT pg_start_backup('my_label', true, false);
-[ RECORD 1 ]---+----------
pg_start_backup | 0/F000028

The PostgreSQL connection requesting the concurrent backup needs to remain active for the whole duration of the physical copy of data files (performed with your favourite tools such as rsync, cp, tar, SAN or LVM snapshots, ...). Note: this operation follows the same procedure as with prior versions of PostgreSQL (for detailed information, please refer to the documentation about continuous archiving and Point-In-Time-Recovery).

When finished, we need to use the new version of pg_stop_backup() to specify that we are closing the current non-exclusive backup:

SELECT * FROM pg_stop_backup(false);

The different signature has a mandatory parameter, called 'exclusive', which needs to be set to 'false' for concurrent backup. The reason is that the function returns a different result, a row made up of three fields:

  1. LSN, for backup consistency, returned at pg_stop_backup() time: identical to the previous version
  2. the content of the label file: this needs to be saved as 'backup_label' in the main backup directory (new in 9.6)
  3. the list of tablespaces: if not empty, this needs to be saved as 'tablespace_map' in the main backup directory (new in 9.6)

For example:

postgres=# SELECT * FROM pg_stop_backup(false);
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
-[ RECORD 1 ]-------------------------------------------------------------
lsn        | 0/F000130
labelfile  | START WAL LOCATION: 0/F000028 (file 00000001000000000000000F)+
           | CHECKPOINT LOCATION: 0/F000060                               +
           | BACKUP METHOD: streamed                                      +
           | BACKUP FROM: master                                          +
           | START TIME: 2016-05-16 09:19:44 CEST                         +
           | LABEL: my_label                                              +
           | 
spcmapfile | 16386 /Users/gabriele/pg96/tbs1                              +
           | 16388 /Users/gabriele/pg96/tbs2                              +
           | 

As you can see, these two steps require users to change their customised backup scripts. However, users should expect common disaster recovery and backup external solutions such as Barman, OmniPITR, PgBackRest, pghoard, WAL-E to transparently manage this new behaviour introduced in 9.6.

This new in-core API replaces the pgespresso package that was previously used with Barman 1.3.1 and above. Barman 2.0 now supports both pgespresso for PG9.2-PG9.5 and the new 9.6 API.

IMPORTANT: this new API will become the only supported way in the future to coordinate low level backup operations with PostgreSQL.

Links:


Extensions and Contrib Modules

Extension cascade support to install dependencies

Links:

Cube extension kNN support

Links:


Other Features

Command progress reporting

Links:



Generic WAL facility

Links:


Trigonometric functions in degrees