What's new in PostgreSQL 9.5
(This page is currently under development ahead of the release of PostgreSQL 9.5)
This page contains an overview of PostgreSQL Version 9.5's features, including descriptions, testing and usage information, and links to blog posts containing further information. See also PostgreSQL 9.5 Open Items.
Major new features
IMPORT FOREIGN SCHEMA
Previously, creating a foreign table in PostgreSQL meant defining the table yourself, referencing the destination columns and data types. If you have a lot of tables, this quickly becomes tedious and error-prone, and when those tables change, you need to do it all over again...
CREATE FOREIGN TABLE remote.customers (
    id int NOT NULL,
    name text,
    company text,
    registered_date date,
    expiry_date date,
    active boolean,
    status text,
    account_level text)
SERVER dest_server OPTIONS (schema_name 'public');

CREATE FOREIGN TABLE remote.purchases (
    id int NOT NULL,
    purchase_time timestamptz,
    payment_time timestamptz,
    itemid int,
    volume int,
    invoice_sent boolean)
SERVER dest_server OPTIONS (schema_name 'public');
As of PostgreSQL 9.5, you can import tables en masse:
IMPORT FOREIGN SCHEMA public FROM SERVER dest_server INTO remote;
This would create foreign tables in the schema named "remote" for every table that appeared in the public schema on the foreign server labelled "dest_server".
You can also filter out any tables you don't want:
IMPORT FOREIGN SCHEMA public EXCEPT (reports, audit) FROM SERVER dest_server INTO remote;
Or limit it to just a specific set of tables:
IMPORT FOREIGN SCHEMA public LIMIT TO (customers, purchases) FROM SERVER dest_server INTO remote;
Links
Waiting for 9.5 – Implement IMPORT FOREIGN SCHEMA.
PostgreSQL 9.5: IMPORT FOREIGN SCHEMA
Row-Level Security Policies
Additional security can be added to tables to prevent users from accessing rows they shouldn't be able to see.
Say you had a table with log data, where the username column contained the database user name which created the log entry:
CREATE TABLE log (
    id serial primary key,
    username text,
    log_event text);
But we don't want users to see log entries from other users, so we create a policy that says a user is allowed to see a row only if the username column matches the user running the query:
CREATE POLICY policy_user_log ON log FOR ALL TO PUBLIC USING (username = current_user);
And then we enable Row Level Security on the table:
ALTER TABLE log ENABLE ROW LEVEL SECURITY;
As the user "report", we would then only see rows where the username column contained the value 'report':
# SELECT * FROM log;
 id | username |   log_event
----+----------+----------------
  1 | report   | DELETE issued
  4 | report   | Reset accounts
(2 rows)
As the user "messaging", we see a different set of rows:
 id | username  |      log_event
----+-----------+----------------------
  2 | messaging | Message queue purged
  3 | messaging | Reset accounts
(2 rows)
Whereas the "postgres" user, as the superuser, would get:
 id | username  |      log_event
----+-----------+----------------------
  1 | report    | DELETE issued
  2 | messaging | Message queue purged
  3 | messaging | Reset accounts
  4 | report    | Reset accounts
(4 rows)
That's because the superuser sees all rows due to the BYPASSRLS attribute on the superuser role by default.
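Non-superuser roles can also be allowed to bypass row-level security via the new BYPASSRLS role attribute. A minimal sketch (the "auditor" role here is hypothetical):

-- Hypothetical reporting role that can read every row regardless of policies
CREATE ROLE auditor LOGIN BYPASSRLS;

-- Or add the attribute to an existing role
ALTER ROLE auditor BYPASSRLS;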
If you have multiple policies, they are all OR'd together. For example, if you had the following 2 policies:
CREATE POLICY policy_user_log ON log FOR ALL TO PUBLIC USING (username = current_user);
CREATE POLICY policy_even_ids_only ON log FOR ALL TO PUBLIC USING (id % 2 = 0);
This would effectively result in the following being added to the WHERE clause of any statement:
WHERE ((username = current_user) OR (id % 2 = 0))
As long as a row matches at least one policy, it is returned, so as the report user mentioned above, we'd now see the following:
 id | username  |      log_event
----+-----------+----------------------
  1 | report    | DELETE issued
  2 | messaging | Message queue purged
  4 | report    | Reset accounts
(3 rows)
The additional policy now allows the row with id 2 because it matches the new policy. The first row doesn't match the new policy, but it's returned because it matches the first one. So policies are permissive rather than restrictive.
Links
Waiting for 9.5 – Row-Level Security Policies (RLS)
BRIN Indexes
BRIN stands for Block Range INdexes: a BRIN index stores metadata about ranges of pages. At the moment this means the minimum and maximum values per block range.
This results in an inexpensive index that occupies a very small amount of space and can speed up queries on extremely large tables. The index lets the planner determine which block ranges are worth checking, so all others can be skipped. So if a 10GB table of orders contained rows that were generally in order of order date, a BRIN index on the order_date column would allow the majority of the table to be skipped rather than performing a full sequential scan. This will still be slower than a regular BTREE index on the same column, but with the benefit of being far smaller and requiring less maintenance.
For example:
-- Create the table
CREATE TABLE orders (
    id int,
    order_date timestamptz,
    item text);
-- Insert lots of data into it
INSERT INTO orders (order_date, item)
SELECT x, 'dfiojdso'
FROM generate_series('2000-01-01 00:00:00'::timestamptz, '2015-03-01 00:00:00'::timestamptz, '2 seconds'::interval) a(x);
-- Let's look at how much space the table occupies
# \dt+ orders
                    List of relations
 Schema |  Name  | Type  | Owner | Size  | Description
--------+--------+-------+-------+-------+-------------
 public | orders | table | thom  | 13 GB |
(1 row)
-- What's involved in finding the orders between 2 dates
# EXPLAIN ANALYSE SELECT count(*) FROM orders WHERE order_date BETWEEN '2012-01-04 09:00:00' and '2014-01-04 14:30:00';
                                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=5425021.80..5425021.81 rows=1 width=0) (actual time=30172.428..30172.429 rows=1 loops=1)
   ->  Seq Scan on orders  (cost=0.00..5347754.00 rows=30907121 width=0) (actual time=6050.015..28552.976 rows=31589101 loops=1)
         Filter: ((order_date >= '2012-01-04 09:00:00+00'::timestamp with time zone) AND (order_date <= '2014-01-04 14:30:00+00'::timestamp with time zone))
         Rows Removed by Filter: 207652500
 Planning time: 0.140 ms
 Execution time: 30172.482 ms
(6 rows)
-- Now let's create a BRIN index on the order_date column
CREATE INDEX idx_order_date_brin ON orders USING BRIN (order_date);
-- And see how much space it takes up
# \di+ idx_order_date_brin
                                List of relations
 Schema |        Name         | Type  | Owner | Table  |  Size  | Description
--------+---------------------+-------+-------+--------+--------+-------------
 public | idx_order_date_brin | index | thom  | orders | 504 kB |
(1 row)
-- Now let's see how much faster the query is with this very small index
# EXPLAIN ANALYSE SELECT count(*) FROM orders WHERE order_date BETWEEN '2012-01-04 09:00:00' and '2014-01-04 14:30:00';
                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2616868.60..2616868.61 rows=1 width=0) (actual time=6347.651..6347.651 rows=1 loops=1)
   ->  Bitmap Heap Scan on orders  (cost=316863.99..2539600.80 rows=30907121 width=0) (actual time=36.366..4686.634 rows=31589101 loops=1)
         Recheck Cond: ((order_date >= '2012-01-04 09:00:00+00'::timestamp with time zone) AND (order_date <= '2014-01-04 14:30:00+00'::timestamp with time zone))
         Rows Removed by Index Recheck: 6419
         Heap Blocks: lossy=232320
         ->  Bitmap Index Scan on idx_order_date_brin  (cost=0.00..309137.21 rows=30907121 width=0) (actual time=35.567..35.567 rows=2323200 loops=1)
               Index Cond: ((order_date >= '2012-01-04 09:00:00+00'::timestamp with time zone) AND (order_date <= '2014-01-04 14:30:00+00'::timestamp with time zone))
 Planning time: 0.108 ms
 Execution time: 6347.701 ms
(9 rows)
This example is on an SSD drive, so the results would be even more pronounced on an HDD.
By default, each block range covers 128 pages. This resolution can be increased or decreased using the pages_per_range storage parameter:
-- Create an index with 32 pages per block
CREATE INDEX idx_order_date_brin_32 ON orders USING BRIN (order_date) WITH (pages_per_range = 32);

-- Create an index with 512 pages per block
CREATE INDEX idx_order_date_brin_512 ON orders USING BRIN (order_date) WITH (pages_per_range = 512);
The lower the pages_per_range setting, the more space the index will occupy, but the less lossy the index will be, i.e. it will need to discard fewer rows.
# \di+ idx_order_date_brin*
                                  List of relations
 Schema |          Name           | Type  | Owner | Table  |  Size   | Description
--------+-------------------------+-------+-------+--------+---------+-------------
 public | idx_order_date_brin     | index | thom  | orders | 504 kB  |
 public | idx_order_date_brin_32  | index | thom  | orders | 1872 kB |
 public | idx_order_date_brin_512 | index | thom  | orders | 152 kB  |
(3 rows)
Links
Waiting for 9.5 – BRIN: Block Range Indexes.
Foreign Table Inheritance
Foreign tables can now either inherit local tables or be inherited by them.
For example, a local table can inherit a foreign table:
-- Create a new table which inherits from the foreign table
# CREATE TABLE local_customers () INHERITS (remote.customers);

-- Insert some data into it
# INSERT INTO local_customers VALUES (16, 'Bruce', $$Jo's Cupcakes$$, '2015-01-15', '2017-01-14', true, 'running', 'basic');

-- And if we query the parent foreign table...
# SELECT tableoid::regclass, * FROM remote.customers;
     tableoid     | id | name  |    company    | registered_date | expiry_date | active | status  | account_level
------------------+----+-------+---------------+-----------------+-------------+--------+---------+---------------
 remote.customers |  1 | James | Hughson Corp  | 2014-05-03      | 2016-05-02  | t      | idle    | premium
 local_customers  | 16 | Bruce | Jo's Cupcakes | 2015-01-15      | 2017-01-14  | t      | running | basic
(2 rows)
Or a foreign table can be made to inherit from a local table:
-- Create a new table that the foreign table will be a child of
# CREATE TABLE master_customers (LIKE remote.customers);

-- Insert a new row into this table
# INSERT INTO master_customers VALUES (99, 'Jolly', $$Cineplanet$$, '2014-10-30', '2016-10-29', true, 'running', 'premium');

-- Have the foreign table inherit from the new table
# ALTER TABLE remote.customers INHERIT master_customers;

-- Let's have a look at the contents of the new table now
# SELECT tableoid::regclass, * FROM master_customers;
     tableoid     | id | name  |    company    | registered_date | expiry_date | active | status  | account_level
------------------+----+-------+---------------+-----------------+-------------+--------+---------+---------------
 master_customers | 99 | Jolly | Cineplanet    | 2014-10-30      | 2016-10-29  | t      | running | premium
 remote.customers |  1 | James | Hughson Corp  | 2014-05-03      | 2016-05-02  | t      | idle    | premium
 local_customers  | 16 | Bruce | Jo's Cupcakes | 2015-01-15      | 2017-01-14  | t      | running | basic
(3 rows)

-- And the query plan...
# EXPLAIN ANALYSE SELECT tableoid::regclass, * FROM master_customers;
                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..140.80 rows=1012 width=145) (actual time=0.014..0.595 rows=3 loops=1)
   ->  Append  (cost=0.00..140.80 rows=1012 width=145) (actual time=0.012..0.591 rows=3 loops=1)
         ->  Seq Scan on master_customers  (cost=0.00..1.48 rows=48 width=145) (actual time=0.012..0.013 rows=1 loops=1)
         ->  Foreign Scan on customers  (cost=100.00..124.52 rows=484 width=145) (actual time=0.567..0.567 rows=1 loops=1)
         ->  Seq Scan on local_customers  (cost=0.00..14.80 rows=480 width=145) (actual time=0.007..0.008 rows=1 loops=1)
 Planning time: 0.256 ms
 Execution time: 1.040 ms
(7 rows)
Links
Postgres 9.5 feature highlight: Scale-out with Foreign Tables now part of Inheritance Trees
GROUPING SETS, CUBE and ROLLUP
This set of features allows one to produce multiple levels of aggregation in a single query, rather than running a separate GROUP BY query for each summary.
For example, if we have this data:
# SELECT * FROM employees;
   name   |      role       | department | gender
----------+-----------------+------------+--------
 Tim      | Manager         | Sales      | Male
 Sarah    | Manager         | Finance    | Female
 Neil     | Accountant      | Finance    | Male
 Joe      | Project Manager | Sales      | Male
 Yvette   | Project Manager | Finance    | Female
 Craig    | Project Manager | IT         | Male
 Penelope | Manager         | IT         | Female
(7 rows)
If we wanted to see summaries for each department, role and gender, we can use GROUPING SETS:
# SELECT department, role, gender, count(*) FROM employees GROUP BY GROUPING SETS (department, role, gender, ());
 department |      role       | gender | count
------------+-----------------+--------+-------
 Finance    |                 |        |     3
 IT         |                 |        |     2
 Sales      |                 |        |     2
            |                 |        |     7
            |                 | Female |     3
            |                 | Male   |     4
            | Accountant      |        |     1
            | Manager         |        |     3
            | Project Manager |        |     3
(9 rows)
Here we can see the count of employees in each department, each role, and each gender. We also get a total where all columns except count are blank.
If we wanted a count for every combination of those 3 categories, we could use CUBE:
# SELECT department, role, gender, count(*) FROM employees GROUP BY CUBE (department, role, gender);
 department |      role       | gender | count
------------+-----------------+--------+-------
 Finance    | Accountant      | Male   |     1
 Finance    | Accountant      |        |     1
 Finance    | Manager         | Female |     1
 Finance    | Manager         |        |     1
 Finance    | Project Manager | Female |     1
 Finance    | Project Manager |        |     1
 Finance    |                 |        |     3
 IT         | Manager         | Female |     1
 IT         | Manager         |        |     1
 IT         | Project Manager | Male   |     1
 IT         | Project Manager |        |     1
 IT         |                 |        |     2
 Sales      | Manager         | Male   |     1
 Sales      | Manager         |        |     1
 Sales      | Project Manager | Male   |     1
 Sales      | Project Manager |        |     1
 Sales      |                 |        |     2
            |                 |        |     7
 Finance    |                 | Female |     2
 IT         |                 | Female |     1
            |                 | Female |     3
 Finance    |                 | Male   |     1
 IT         |                 | Male   |     1
 Sales      |                 | Male   |     2
            |                 | Male   |     4
            | Accountant      | Male   |     1
            | Accountant      |        |     1
            | Manager         | Female |     2
            | Manager         | Male   |     1
            | Manager         |        |     3
            | Project Manager | Female |     1
            | Project Manager | Male   |     2
            | Project Manager |        |     3
(33 rows)
So we get counts for every combination of all values. If we only wanted the columns summarised from left to right, so that each column is only summarised in combination with the columns before it, we'd use ROLLUP:
# SELECT department, role, gender, count(*) FROM employees GROUP BY ROLLUP (department, role, gender);
 department |      role       | gender | count
------------+-----------------+--------+-------
 Finance    | Accountant      | Male   |     1
 Finance    | Accountant      |        |     1
 Finance    | Manager         | Female |     1
 Finance    | Manager         |        |     1
 Finance    | Project Manager | Female |     1
 Finance    | Project Manager |        |     1
 Finance    |                 |        |     3
 IT         | Manager         | Female |     1
 IT         | Manager         |        |     1
 IT         | Project Manager | Male   |     1
 IT         | Project Manager |        |     1
 IT         |                 |        |     2
 Sales      | Manager         | Male   |     1
 Sales      | Manager         |        |     1
 Sales      | Project Manager | Male   |     1
 Sales      | Project Manager |        |     1
 Sales      |                 |        |     2
            |                 |        |     7
(18 rows)
So we don't get summaries per role or per gender except when used in combination with the previous columns.
These are just basic examples. Far more complicated configurations are possible.
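For instance, grouping sets can themselves contain multiple columns, so combined and individual groupings can be mixed in a single query. A minimal sketch against the same employees table:

-- Counts for each (department, role) pair, each department, and a grand total
SELECT department, role, count(*)
FROM employees
GROUP BY GROUPING SETS ((department, role), (department), ());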
JSONB-modifying operators and functions
In 9.3 (and to a greater extent in 9.4), JSON data could be extracted using various functions and operators, but there was nothing that could actually modify the data. As of 9.5, JSONB data can now be modified.
jsonb || jsonb (concatenate / overwrite)
The || operator allows us to combine 2 jsonb objects. If there's overlap, values are replaced on the highest level.
For example, if we want to add values to a jsonb object:
# SELECT '{"name": "Joe", "age": 30}'::jsonb || '{"town": "London"}'::jsonb;
                   ?column?
----------------------------------------------
 {"age": 30, "name": "Joe", "town": "London"}
(1 row)
Or we can overwrite existing values:
# SELECT '{"town": "Dataville", "population": 4096}'::jsonb || '{"population": 8192}'::jsonb;
                 ?column?
-------------------------------------------
 {"town": "Dataville", "population": 8192}
(1 row)
Note that this only works on the highest level, so nested objects are replaced from the top level. For example:
# SELECT '{"name": "Jane", "contact": {"phone": "01234 567890", "mobile": "07890 123456"}}'::jsonb || '{"contact": {"fax": "01987 654321"}}'::jsonb;
                       ?column?
------------------------------------------------------
 {"name": "Jane", "contact": {"fax": "01987 654321"}}
(1 row)
jsonb - text / int (remove key / array element)
We can remove keys from a jsonb object with the - operator:
# SELECT '{"name": "James", "email": "james@localhost"}'::jsonb - 'email';
     ?column?
-------------------
 {"name": "James"}
(1 row)
Or remove values from an array (base 0):
# SELECT '["red","green","blue"]'::jsonb - 1;
    ?column?
-----------------
 ["red", "blue"]
(1 row)
jsonb #- text[] / int (remove key / array element in path)
In the previous examples we could remove keys or array elements, but only at the top level. With the #- operator we can provide a path, as a text array, to the value we want to delete. Here we'll remove the fax number from within the contact value:
# SELECT '{"name": "James", "contact": {"phone": "01234 567890", "fax": "01987 543210"}}'::jsonb #- '{contact,fax}'::text[];
                        ?column?
---------------------------------------------------------
 {"name": "James", "contact": {"phone": "01234 567890"}}
(1 row)
Or we can remove an array value. Here we'll get rid of the array value at index 1 (the 2nd value):
# SELECT '{"name": "James", "aliases": ["Jamie","The Jamester","J Man"]}'::jsonb #- '{aliases,1}'::text[];
                     ?column?
--------------------------------------------------
 {"name": "James", "aliases": ["Jamie", "J Man"]}
(1 row)
jsonb_set function
The operators above let us delete values at a path, but not update them; the jsonb_set function is for that. We'll update the phone value within the contact value:
# SELECT jsonb_set('{"name": "James", "contact": {"phone": "01234 567890", "fax": "01987 543210"}}'::jsonb, '{contact,phone}', '"07900 112233"'::jsonb);
                                    jsonb_set
---------------------------------------------------------------------------------
 {"name": "James", "contact": {"fax": "01987 543210", "phone": "07900 112233"}}
(1 row)
The anatomy of this function is:
jsonb_set(
    target jsonb,           # The jsonb value you're amending.
    path text[],            # The path to the value you wish to add to or change, represented as a text array.
    new_value jsonb,        # The new object, key : value pair or array value(s) to add to or change.
    create_missing boolean  # An optional field that, if true (default), creates the value if the key doesn't already exist.
                            # If false, the path must exist for the update to happen, or the value won't be updated.
)
If the target key doesn't exist, and you specify true as a fourth parameter (the default), it will be appended at the location specified by the path:
# SELECT jsonb_set('{"name": "James", "contact": {"phone": "01234 567890", "fax": "01987 543210"}}'::jsonb, '{contact,skype}', '"myskypeid"'::jsonb, true);
                                               jsonb_set
------------------------------------------------------------------------------------------------------
 {"name": "James", "contact": {"fax": "01987 543210", "phone": "01234 567890", "skype": "myskypeid"}}
(1 row)
The same function call with false instead of true will not make any change in this case, because there's no pre-existing key to update:
# SELECT jsonb_set('{"name": "James", "contact": {"phone": "01234 567890", "fax": "01987 543210"}}'::jsonb, '{contact,skype}', '"myskypeid"'::jsonb, false);
                                    jsonb_set
---------------------------------------------------------------------------------
 {"name": "James", "contact": {"fax": "01987 543210", "phone": "01234 567890"}}
(1 row)
You can also set the value of array elements by specifying their position. Positions start from 0, so the 2nd element is 1. Negative values count backwards from the end, so the last element is -1 and the 2nd to last is -2.
# SELECT jsonb_set('{"name": "James", "skills": ["design","snowboarding","mechnaicalengineering"]}', '{skills,2}', '"mechanical engineering"'::jsonb, true);
                                     jsonb_set
-----------------------------------------------------------------------------------
 {"name": "James", "skills": ["design", "snowboarding", "mechanical engineering"]}
(1 row)
So the positions in this array can be referred to like so:
   0          1               2                        # positive values
  -3         -2              -1                        # negative values
["design", "snowboarding", "mechanical engineering"]
With this array, -4 or anything lower would append to the beginning. 3 or anything higher would append to the end.
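To illustrate that append behaviour, here is a sketch (the result shown in the comment follows from the rule described above):

-- Index 5 is beyond the end of the array, so with create_missing = true
-- the new value is appended to the end:
SELECT jsonb_set('{"name": "James", "skills": ["design", "snowboarding"]}'::jsonb,
                 '{skills,5}', '"karate"'::jsonb, true);
-- expected result: {"name": "James", "skills": ["design", "snowboarding", "karate"]}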
jsonb_pretty
Note that jsonb doesn't preserve white-space, so no matter how much effort you put in to make the object easier to read, it will end up as a long string. The jsonb_pretty function will format it for you. If we take the previous jsonb_set example and wrap it all in jsonb_pretty:
# SELECT jsonb_pretty(jsonb_set('{"name": "James", "contact": {"phone": "01234 567890", "fax": "01987 543210"}}'::jsonb, '{contact,phone}', '"07900 112233"'::jsonb));
          jsonb_pretty
---------------------------------
 {                              +
     "name": "James",           +
     "contact": {               +
         "fax": "01987 543210", +
         "phone": "07900 112233"+
     }                          +
 }
(1 row)
Much easier to read.
INSERT ... ON CONFLICT DO NOTHING/UPDATE ("UPSERT")
9.5 brings support for "UPSERT" operations.
INSERT is extended to accept an ON CONFLICT DO UPDATE / DO NOTHING clause. This clause specifies an alternative action to take in the event of a would-be duplicate violation.
For example, if we had a simple table of user logins, where we wanted to track the number of times each user had logged in:
# SELECT username, logins FROM user_logins;
 username | logins
----------+--------
 James    |      4
 Lois     |      2
(2 rows)
And we wanted to add 2 new logins, normally we'd have a problem if the primary key (or unique constraint) was violated:
# INSERT INTO user_logins (username, logins) VALUES ('Naomi',1),('James',1);
ERROR:  duplicate key value violates unique constraint "users_pkey"
DETAIL:  Key (username)=(James) already exists.
Unlike approaches using a Common Table Expression, the new command has no race conditions: it guarantees either an insert or an update for each row proposed for insertion (provided there is no independent error). Unlike SQL MERGE as implemented in other database systems, there is no implementation or semantic restriction that burdens users with having to consider that their UPSERT might fail due to a concurrent operation. In short, the new "UPSERT" feature is designed to be both as simple and as safe as possible.
With the new syntax, we can instead tell it what to do in the event of a conflict:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;
Now let's look at what happened:
# SELECT username, logins FROM user_logins;
 username | logins
----------+--------
 Lois     |      2
 Naomi    |      1
 James    |      5
(3 rows)
We have a new row for Naomi, which shows her having logged in once, but we also have James, whose logins value has been incremented by one as specified by the UPDATE part of the statement. The UPDATE knows which row it's updating based on the column or unique constraint being checked against.
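The DO UPDATE action can also take a WHERE clause if you only want the update to happen under certain conditions. A minimal sketch against the same user_logins table (the cap of 100 is purely illustrative):

INSERT INTO user_logins (username, logins)
VALUES ('James', 1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins
WHERE user_logins.logins < 100;  -- only bump counts that haven't reached the cap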
Of course there are scenarios where you might want to insert a value into a table, but only if it's not there already.
Say we had a list of countries which would be used to constrain values in other tables:
# SELECT * FROM countries;
  country
-----------
 Australia
 Italy
 Japan
 UK
 USA
(5 rows)
We want to add 2 more countries. If one or more of them already existed and violated the primary key (in this case the "country" column), we'd get an error:
# INSERT INTO countries (country) VALUES ('France'),('Japan');
ERROR:  duplicate key value violates unique constraint "countries_pkey"
DETAIL:  Key (country)=(Japan) already exists.
But now we can tell it that a conflict is fine, and just DO NOTHING in those scenarios:
# INSERT INTO countries (country) VALUES ('France'),('Japan') ON CONFLICT DO NOTHING;
INSERT 0 1
Now we should just have one additional country in our table:
# SELECT * FROM countries;
  country
-----------
 Australia
 Italy
 Japan
 UK
 USA
 France
(6 rows)
If there were additional columns that were also uniquely constrained, we could specify the constraint or column that we want to apply the condition to, so that a "legitimate" conflict on another column still produces an error.
So, to clarify our intent, we could have written the last example as:
# INSERT INTO countries (country) VALUES ('France'),('Japan') ON CONFLICT ON CONSTRAINT countries_pkey DO NOTHING;
or
# INSERT INTO countries (country) VALUES ('France'),('Japan') ON CONFLICT (country) DO NOTHING;
However, the former isn't fully equivalent to the latter: the "inference"-based syntax is more robust against things like schema migrations that result in a new name for what is essentially the same constraint/business rule. Users should prefer the latter unless it's unavoidable (as with exclusion constraints).
Note that providing multiple sets of conflict/update conditions isn't supported, so if a specific conflict target is specified but a different conflict occurs instead, it will produce an error just as a normal INSERT would.
Links
Upsert Lands in PostgreSQL 9.5 – a First Look
Youtube video - PostgreSQL 9.5's Upsert Feature Explained
pg_rewind
pg_rewind makes it possible to efficiently bring an old primary back in sync with a new primary without having to perform a full base backup. It works by looking in the Write-Ahead Log to see which pages have been modified, and copying across only those pages.
In this example, we have a primary (running on port 5530) and a standby subscribing to it (on port 5531):
# SELECT * FROM pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 11609
usesysid         | 16384
usename          | rep_user
application_name | standby1
client_addr      | 127.0.0.1
client_hostname  |
client_port      | 38434
backend_start    | 2015-03-29 00:11:55.243319+00
backend_xmin     |
state            | streaming
sent_location    | 0/C81BB40
write_location   | 0/C81BB40
flush_location   | 0/C81BB40
replay_location  | 0/C81BB40
sync_priority    | 0
sync_state       | async
Now we'll promote the standby:
$ pg_ctl promote -D standby1
server promoting
And we'll make some changes on this instance:
$ psql -p 5531 postgres

# CREATE TABLE x (content text);
CREATE TABLE

# INSERT INTO x SELECT 'test' FROM generate_series(1,1000);
INSERT 0 1000
Now we'll stop the old primary and use pg_rewind to re-synchronise it:
$ pg_ctl stop -D primary
waiting for server to shut down.... done
server stopped

$ pg_rewind -D primary --source-server='host=localhost port=5531' -P
connected to remote server
The servers diverged at WAL position 0/C81BB40 on timeline 1.
Rewinding from last common checkpoint at 0/2000060 on timeline 1
reading source file list
reading target file list
reading WAL in target
Need to copy 274 MB (total source directory size is 290 MB)
281142/281142 kB (100%) copied
creating backup label and updating control file
Done!
And we'll make some changes to get it to subscribe to the new primary:
$ cd primary
$ mv recovery.{done,conf}
$ vi recovery.conf    # edited to set host info to point to port 5531 in this case
$ vi postgresql.conf  # as our example instances are running on the same server, we'll just change the port so it doesn't conflict
Then start the new standby (old primary):
$ pg_ctl start -D primary
Let's see if it's successfully caught up:
$ psql -p 5531 postgres   # connect to the new primary

# SELECT * FROM pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 11837
usesysid         | 16384
usename          | rep_user
application_name | standby1
client_addr      | 127.0.0.1
client_hostname  |
client_port      | 49148
backend_start    | 2015-03-29 00:22:39.047657+00
backend_xmin     |
state            | streaming
sent_location    | 0/C8559B0
write_location   | 0/C8559B0
flush_location   | 0/C8559B0
replay_location  | 0/C855978
sync_priority    | 0
sync_state       | async
And see if the test data from the new primary is on the new standby:
$ psql -p 5530 postgres   # connect to the new standby

# SELECT COUNT(*) FROM x;
 count
-------
  1000
(1 row)
All synchronised.
Links
Waiting for 9.5 – Add pg_rewind, for re-synchronizing a master server after failback.
Other new features
ALTER TABLE ... SET LOGGED / UNLOGGED
PostgreSQL allows one to create tables which aren't written to the Write-Ahead Log, meaning they aren't replicated or crash-safe, but also don't have the associated overhead, so they are good for data that doesn't need the guarantees of regular tables. However, if you decided an unlogged table should now be replicated, or a regular table should no longer be logged, you'd previously have to create a new copy of the table and copy the data across. In 9.5, you can switch between logged and unlogged using a new command:
Set an unlogged table to logged:
ALTER TABLE <tablename> SET LOGGED;
Set a logged table to unlogged:
ALTER TABLE <tablename> SET UNLOGGED;
For example:
# CREATE UNLOGGED TABLE messages (id int PRIMARY KEY, message text);
# SELECT relname,
    CASE relpersistence
      WHEN 'u' THEN 'unlogged'
      WHEN 'p' THEN 'logged'
      ELSE 'unknown'
    END AS table_type
  FROM pg_class
  WHERE relname ~ 'messages*';
    relname    | table_type
---------------+------------
 messages      | unlogged
 messages_pkey | unlogged
(2 rows)
Note that setting an unlogged table to logged will generate WAL which will contain all data in the table, so this would cause a spike in replication traffic for large tables. And now we change it to a logged table:
# ALTER TABLE messages SET LOGGED;
And the result of the previous query is now:
    relname    | table_type
---------------+------------
 messages      | logged
 messages_pkey | logged
(2 rows)
SKIP LOCKED
If you need to SELECT from a table and lock rows so they can't be updated until your transaction has completed, you would specify FOR UPDATE. But if some of those rows are already locked by another session, you can now specify SKIP LOCKED to simply ignore them and perform the operation on whichever rows can be locked.
For example:
In session 1:
# BEGIN;
BEGIN
# SELECT * FROM colours;
 id | name
----+-------
  1 | red
  2 | green
  3 | blue
(3 rows)
# UPDATE colours SET name = 'scarlet' WHERE name = 'red';
UPDATE 1
In session 2:
# SELECT * FROM colours FOR UPDATE NOWAIT;
ERROR:  could not obtain lock on row in relation "colours"
# SELECT * FROM colours FOR UPDATE SKIP LOCKED;
 id | name
----+-------
  2 | green
  3 | blue
(2 rows)
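A common use for this is a simple job queue, where each worker locks a batch of rows that nobody else is working on and skips the rest. A sketch assuming a hypothetical jobs table:

-- Run inside a transaction by each worker; rows locked by other workers are skipped
SELECT id, payload
FROM jobs
WHERE processed = false
ORDER BY id
LIMIT 10
FOR UPDATE SKIP LOCKED;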
Links
Waiting for 9.5 – Implement SKIP LOCKED for row-level locks
Parallel VACUUMing
The vacuumdb utility now supports parallel jobs, specified with the -j option, just like when using pg_dump or pg_restore. This means vacuuming a database will complete a lot quicker, especially where tables are spread across multiple tablespaces. It will also start vacuuming the largest relations first.
For example:
vacuumdb -j4 productiondb
This would vacuum the database named "productiondb" by spawning 4 vacuum jobs to run simultaneously.
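The parallel option combines with vacuumdb's other switches; for example, a sketch that also analyses the database in the same run:

vacuumdb -j4 --analyze productiondb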
Links
Waiting for 9.5 – vacuumdb: enable parallel mode
Commit timestamp tracking
A boolean configuration parameter named track_commit_timestamp is now available; setting it to "on" enables commit timestamp tracking. This makes it possible to see when a transaction was committed (as a timestamp with time zone in server time).
To find out when rows were inserted or updated, call the pg_xact_commit_timestamp function, passing in the xmin system column. Note that you have to set the parameter and restart the server before it can start tracking timestamps, so nothing will show up until you do so:
# SELECT pg_xact_commit_timestamp(xmin), * FROM colours;
 pg_xact_commit_timestamp | id | name
--------------------------+----+-------
                          |  1 | red
                          |  2 | green
                          |  3 | blue
(3 rows)
Then we edit postgresql.conf so that track_commit_timestamp = on, then restart PostgreSQL:
# INSERT INTO colours (name) VALUES ('mauve'),('cyan'),('indigo');
INSERT 0 3

# SELECT pg_xact_commit_timestamp(xmin), * FROM colours;
   pg_xact_commit_timestamp    | id |  name
-------------------------------+----+--------
                               |  1 | red
                               |  2 | green
                               |  3 | blue
 2015-10-02 11:16:34.678267+01 |  4 | mauve
 2015-10-02 11:16:34.678267+01 |  5 | cyan
 2015-10-02 11:16:34.678267+01 |  6 | indigo
(6 rows)
You can see that the first 3 rows that were inserted before the parameter was enabled don't return the timestamp, so it only starts to keep track of timestamps for INSERTs/UPDATEs once the parameter has been enabled and the database cluster has been restarted.
There is also the pg_last_committed_xact function, which tells you the transaction ID and timestamp of the last committed transaction:
# SELECT * FROM pg_last_committed_xact();
 xid  |           timestamp
------+-------------------------------
 2039 | 2015-10-02 11:16:34.678267+01
(1 row)
cluster_name
There's a new parameter in postgresql.conf called cluster_name. This allows a string to be used to identify the cluster in the process list, which is particularly useful when multiple clusters are running on the same server:
$ ps -ef | grep checkpointer
postgres 12181 12178  0 11:12 ?        00:00:00 postgres: personnel: checkpointer process
postgres 12207 12204  0 11:12 ?        00:00:00 postgres: reportsdb: checkpointer process
postgres 12233 12230  0 11:12 ?        00:00:00 postgres: management: checkpointer process
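The setting itself is just a string in each cluster's postgresql.conf; for the first instance above it might look like this:

cluster_name = 'personnel'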
Sorting optimization (Abbreviated Keys)
The abbreviated keys optimization can be expected to greatly enhance the performance of sorts in PostgreSQL, including those used for CREATE INDEX. Reportedly, in some cases, CREATE INDEX on text columns can be as much as an entire order of magnitude faster (3x is a more typical improvement). Numeric sorts also support the optimization.
Abbreviated keys will complement grouping sets very nicely. Reporting queries will be very significantly faster with PostgreSQL 9.5.
Links
Abbreviated keys: exploiting locality to improve PostgreSQL's text sort performance
Abbreviated keys for numeric to accelerate numeric sorts
GiST Index-Only Scans
Previously, the only index access methods that supported index-only scans were B-tree and SP-GiST. In PostgreSQL 9.5, support has been added for GiST:
In this example, we'll be using the btree_gist extension:
# CREATE EXTENSION btree_gist;
We'll set up a simple table that stores meeting room reservations:
# CREATE TABLE meetings ( id serial primary key, room int, reservation tstzrange);
Then we add an exclusion constraint to ensure that no booking for any one room overlaps another booking for the same room, which creates an index to enforce the constraint:
# ALTER TABLE meetings ADD CONSTRAINT meeting_exclusion EXCLUDE USING GIST (room with =, reservation with &&);
And we'll populate it with lots of test data:
# WITH RECURSIVE ins AS (
    SELECT
      1 AS room,
      '2000-01-01 08:00:00'::timestamptz AS reservation_start,
      (ceil(random()*24)*5 || ' minutes')::interval AS duration
    UNION ALL
    SELECT
      CASE WHEN ins.reservation_start > now() THEN ins.room + 1 ELSE ins.room END AS room,
      CASE WHEN ins.reservation_start > now() THEN '2000-01-01 08:00:00'::timestamptz ELSE ins.reservation_start + ins.duration END AS reservation_start,
      (ceil(random()*16)*15 || ' minutes')::interval AS duration
    FROM ins
    WHERE reservation_start < now() + '1 day'::interval
      AND room <= 200
  )
  INSERT INTO meetings (room, reservation)
  SELECT room, tstzrange(reservation_start, reservation_start + duration)
  FROM ins
  WHERE (reservation_start + duration)::time BETWEEN '08:00' AND '20:00';
One run of this results in 6.4 million rows.
If we get the query plan for counting how many meetings occurred during May in 2014 for each room:
# EXPLAIN SELECT room, count(*) FROM meetings WHERE reservation && '[2014-05-01,2014-05-31]'::tstzrange GROUP BY room ORDER BY room;
                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Sort  (cost=1294.20..1294.21 rows=2 width=4)
   Sort Key: room
   ->  HashAggregate  (cost=1294.17..1294.19 rows=2 width=4)
         Group Key: room
         ->  Index Only Scan using meeting_exclusion on meetings  (cost=0.41..1113.98 rows=36038 width=4)
               Index Cond: (reservation && '["2014-05-01 00:00:00+01","2014-05-31 00:00:00+01"]'::tstzrange)
Prior to 9.5, we would get the following plan:
                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Sort  (cost=28570.15..28570.16 rows=2 width=4)
   Sort Key: room
   ->  HashAggregate  (cost=28570.12..28570.14 rows=2 width=4)
         Group Key: room
         ->  Bitmap Heap Scan on meetings  (cost=778.49..28386.07 rows=36810 width=4)
               Recheck Cond: (reservation && '["2014-05-01 00:00:00+01","2014-05-31 00:00:00+01"]'::tstzrange)
               ->  Bitmap Index Scan on meeting_exclusion  (cost=0.00..769.29 rows=36810 width=0)
                     Index Cond: (reservation && '["2014-05-01 00:00:00+01","2014-05-31 00:00:00+01"]'::tstzrange)
TABLESAMPLE
The TABLESAMPLE clause allows returning a randomly selected sample of the rows in a table. Two sampling algorithms are provided: SYSTEM and BERNOULLI. SYSTEM chooses random data pages and returns all of the rows on each page, and is thus quite fast. BERNOULLI, which is slower, considers each row individually and so returns a more evenly random set of rows.
Example:
SELECT * FROM profiles TABLESAMPLE SYSTEM ( 0.01 );
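BERNOULLI is used the same way, and both methods accept an optional REPEATABLE clause so that a given seed returns the same sample each time. For example:

-- Sample roughly 1% of rows, reproducibly, using seed 42
SELECT * FROM profiles TABLESAMPLE BERNOULLI (1) REPEATABLE (42);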
Lock scalability improvements
The locks acquired for most operations in Postgres didn't scale well to many concurrent shared acquisitions: despite only taking shared locks, backends were frequently blocked acquiring an exclusive spinlock, which often led to 100% CPU usage across a large number of sessions. Now shared lockers interfere much less with each other. This particularly helps on servers with more than one CPU socket.
Improved buffer replacement scalability
Before 9.5, every buffer replacement happened while holding a global lock. Now the hot path is lock-free, and the slow paths use a per-buffer spinlock.
Reduced per-backend memory usage
Before 9.5, every backend used an array containing an entry for each page in shared_buffers to manage page reference counts. With large shared_buffers, that could add tens to hundreds of megabytes to each session. Now a small, fixed-size array is used instead.
CustomScan interface and Join-pushdown infrastructure
The CustomScan interface allows extensions to implement their own logic for scanning a relation, and to replace the built-in logic when their estimated cost is more favourable. The related join-pushdown infrastructure also allows built-in join nodes to be replaced by a foreign scan (if both child relations are managed by the same foreign server) or by a custom scan, which then behaves like a scan over the remotely or previously joined relations.