= Compile and Install from source code =

A set of instructions for building PostgreSQL from a source archive.

= Acquire the source code =

=== From postgresql.org ===

The source code can be found at [https://git.postgresql.org git.postgresql.org] or in the [https://www.postgresql.org/ftp/source/ main file browser].

=== From GitHub ===

The [https://github.com/postgres/postgres mirror of Postgres on GitHub] provides a "Download ZIP" option.

= Build and install the source code =

Detailed instructions can be found in the [https://www.postgresql.org/docs/current/installation.html documentation].

=== Red Hat requirements ===

The following packages are needed for building Postgres with SSL support:

 sudo yum install -y bison-devel readline-devel zlib-devel openssl-devel wget ccache
 sudo yum groupinstall -y 'Development Tools'

=== Ubuntu requirements ===

The following packages are needed for building Postgres, including SSL support:

 sudo apt-get install build-essential libreadline-dev zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev libxml2-utils xsltproc ccache pkg-config

You might also consider installing the '''libipc-run-perl''' package if you want to run the TAP tests, and the '''meson''' package if you want to build with Meson.
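
Once the prerequisites are installed, the overall build sequence is short. A minimal sketch, assuming a hypothetical 16.1 source archive and the default install prefix of /usr/local/pgsql (the --with-openssl option enables SSL support):

 # Download and unpack a source archive (the version number is illustrative)
 wget https://ftp.postgresql.org/pub/source/v16.1/postgresql-16.1.tar.bz2
 tar xf postgresql-16.1.tar.bz2
 cd postgresql-16.1
 
 # Configure with SSL support, build, optionally run the regression tests, and install
 ./configure --with-openssl
 make
 make check
 sudo make install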

= Trademark issues =

The PostgreSQL community is currently engaged in a trademark dispute with Fundación PostgreSQL. This article details the timeline of the dispute as well as the findings of the various courts.

== Actors ==

The following parties are involved in this dispute:

* Core Team ([https://www.postgresql.org/developer/core/ Official Site])
* Fundación PostgreSQL ([https://postgresql.fund/ Official Site])
* PGCA: PostgreSQL Community Association ([https://www.postgres.ca/ Official Site])
* PGEU: PostgreSQL Europe ([https://www.postgresql.eu/ Official Site])

The following legal entities are presiding over this dispute:

* EUIPO: European Union Intellectual Property Office ([https://euipo.europa.eu/ Official Site])
* OEPM: <span lang="es">Oficina Española de Patentes y Marcas</span>, the Spanish Patent and Trademark Office ([https://oepm.es/ Official Site])
* USPTO: United States Patent and Trademark Office ([https://www.uspto.gov/ Official Site])

== Timeline ==

* 2003-07-17, PGCA's '''POSTGRESQL''' trademark is registered in Canada (originally by PostgreSQL Inc.).
* 2011-05-30, The ''PostgreSQL Community Association of Canada'' (PGCA) is registered as a non-profit organization in Canada, at the request of the PostgreSQL Core Team, to steward the PostgreSQL Project's assets (domain names, trademarks, etc.).
* 2018-04-17, PGEU's '''POSTGRESQL CONFERENCE''' trademark is registered in the EU/UK.
* 2018-04-20, PGEU's '''POSTGRES CONFERENCE''' trademark is registered in the EU/UK.
* 2018-08-15, PGCA's '''POSTGRES''' trademark is registered in the EU/UK.
* 2018-08-15, PGCA's '''POSTGRESQL''' trademark is registered in the EU/UK.
* 2018-08-15, PGCA's '''POSTGRES''' trademark is registered in the USA.
* 2018-08-15, PGCA's '''POSTGRESQL''' trademark is registered in the USA.
* 2020-04-27, [https://euipo.europa.eu/eSearch/#details/trademarks/W01534836 Fundación PostgreSQL registers an EU trademark for '''POSTGRESQL'''] (EUIPO).
* 2020-04-27, [https://euipo.europa.eu/eSearch/#details/trademarks/W01558723 Fundación PostgreSQL registers an EU trademark for '''POSTGRESQL COMMUNITY'''] (EUIPO).
* 2020-10-06, PGCA files an EUIPO opposition to the '''POSTGRESQL''' trademark.
* 2020-10-06, PGEU files an EUIPO opposition to the '''POSTGRESQL''' trademark.
* 2020-10-20, [http://consultas2.oepm.es/ceo/jsp/busqueda/consultaExterna.xhtml?numExp=M4089693# Fundación PostgreSQL registers a Spanish trademark for '''POSTGRES'''] (OEPM).
* 2020-11-20, ''Fundación PostgreSQL'' publishes blog post: [https://postgresql.fund/blog/is-it-time-to-modernize-postgresql-core/ Is it time to modernize the processes, structure and governance of the PostgreSQL Core Team?] ([https://web.archive.org/web/*/https://postgresql.fund/blog/is-it-time-to-modernize-postgresql-core/ Archive link, including changes]).
* 2020-12-15, ''Core Team'' publishes article: [https://www.postgresql.org/about/news/new-trademark-policy-and-recognised-user-group-guidelines-2133/ New Trademark Policy and Recognised User Group Guidelines] (PostgreSQL.org).
* 2021-02-16, [https://euipo.europa.eu/eSearch/#details/trademarks/W01598034 Fundación PostgreSQL registers an EU trademark for '''POSTGRES'''] (EUIPO).
* 2021-03-21, PGCA files an EUIPO opposition to the '''POSTGRESQL COMMUNITY''' trademark.
* 2021-03-21, PGEU files an EUIPO opposition to the '''POSTGRESQL COMMUNITY''' trademark.
* 2021-06-25, ''Fundación PostgreSQL'' registers a Spanish trademark for '''POSTGRES'''.
* 2021-09-13, ''Core Team'' and PGCA publish article: [https://www.postgresql.org/about/news/trademark-actions-against-the-postgresql-community-2302/ Trademark Actions Against the PostgreSQL Community] (PostgreSQL.org).
* 2021-09-14, ''Fundación PostgreSQL'' publishes blog post: [https://postgresql.fund/blog/respecting-majority-questioning-status-quo-as-a-minority/ Respecting the majority, questioning the status quo as a minority] ([https://web.archive.org/web/*/https://postgresql.fund/blog/respecting-majority-questioning-status-quo-as-a-minority/ Archive link, including changes]).
** This blog post includes the following statement: "we have informed the Core Team that effective immediately Fundación PostgreSQL has unanimously passed a resolution to start the process to transfer, permanently and irrevocably, all PostgreSQL-related trademarks and domain names to the PostgreSQL Association of Canada, with no conditions or costs attached" ([https://web.archive.org/web/20210914223639/https://postgresql.fund/blog/respecting-majority-questioning-status-quo-as-a-minority/ 2021-09-14]).
* 2021-09-21, ''Fundación PostgreSQL'' publishes blog post: [https://postgresql.fund/blog/postgres-core-team-attacks-postgres-community/ Postgres Core Team launches unprecedented attack against the Postgres Community] ([https://web.archive.org/web/*/https://postgresql.fund/blog/postgres-core-team-attacks-postgres-community/ Archive link, including changes]).
* 2021-10-20, PGCA files an EUIPO opposition to the '''POSTGRES''' trademark.
* 2021-10-20, PGEU files an EUIPO opposition to the '''POSTGRES''' trademark.
* 2022-04-12, PGCA publishes article: [https://www.postgresql.org/about/news/update-on-the-trademark-actions-against-the-postgresql-community-2437/ Update on the Trademark Actions Against the PostgreSQL Community] (PostgreSQL.org).
* 2022-06-10, ''Fundación PostgreSQL'' publishes blog post: [https://postgresql.fund/blog/re-update-on-the-trademark-actions/ Re: Update on the Trademark Actions] ([https://web.archive.org/web/*/https://postgresql.fund/blog/re-update-on-the-trademark-actions/ Archive link, including changes]).
* 2022-06-23, the Spanish courts [https://www.postgres.ca/#2022-06-23 invalidate] the infringing '''POSTGRESQL''' and '''POSTGRESQL COMMUNITY''' trademark registrations.
* 2022-10-05, ''Fundación PostgreSQL'' publishes blog post: [https://postgresql.fund/blog/postgres-trademarks-disagreement-proposing-a-solution/ PostgreSQL Trademarks Disagreement: Proposing a Solution] ([https://web.archive.org/web/*/https://postgresql.fund/blog/postgres-trademarks-disagreement-proposing-a-solution/ Archive link, including changes]).
* 2023-05-09, The USPTO issues a final rejection of ''Fundación PostgreSQL'''s '''POSTGRES''' trademark application in the USA.
* 2023-05-09, The USPTO issues a final rejection of ''Fundación PostgreSQL'''s '''POSTGRESQL''' trademark application in the USA.
* 2023-05-09, The USPTO issues a final rejection of ''Fundación PostgreSQL'''s '''POSTGRESQL COMMUNITY''' trademark application in the USA.
* 2023-07-11, PGCA publishes article: [https://www.postgresql.org/about/news/update-on-continued-trademark-actions-against-the-postgresql-community-2673/ Update on Continued Trademark Actions Against the PostgreSQL Community] (PostgreSQL.org).
* 2023-07-24, ''Fundación PostgreSQL'' publishes blog post claiming that the PostgreSQL ''Core Team'' is trying to [https://postgresql.fund/blog/the-postgres-core-team-tries-to-shut-down-a-postgres-community-conference/ "shut down"] the Ibiza conference ([https://web.archive.org/web/*/https://postgresql.fund/blog/the-postgres-core-team-tries-to-shut-down-a-postgres-community-conference/ Archive link, including changes]).
* 2023-07-27, PGCA publishes article: [https://www.postgresql.org/about/news/setting-the-record-straight-more-updates-on-a-trademark-dispute-2682/ Setting the record straight: More updates on a trademark dispute] (PostgreSQL.org).

== See Also ==

* [https://news.ycombinator.com/item?id=28512274 Hacker News discussion (2021)]
* [https://lwn.net/Articles/869108/ LWN discussion]
* [https://news.ycombinator.com/item?id=36891784 Hacker News discussion (2023)]

= Pgfoundry =

'''NOTE: pgFoundry has now been shut down permanently. Please see [[Project Hosting]] for alternatives.'''

'''pgfoundry.org''' was a website that hosted various Postgres-related software projects. It was the successor to [[gborg]], and ran on [http://gforge.org GForge], an open source collaborative software development tool.

The domain name now belongs to a different organization, but the old website can still be viewed at [https://web.archive.org/web/20191015045210/http://www.pgfoundry.org/ archive.org].

Archived code for most pgFoundry projects is available at https://www.postgresql.org/ftp/projects/pgFoundry/.

See [[Project Hosting]] for more information about what pgFoundry provided, and alternatives to it.

[[Category:Development]]

= GSoD 2019 =

This page is for collecting ideas for future Google Season of Docs projects.

== Regarding Project Ideas ==

Project ideas are to be added here by community members.

== Mentors (2019) ==

The following individuals have been listed as possible mentors on the projects below, and/or have offered to mentor technical writer-proposed projects:

=== Organization Administrators ===

* Stephen Frost
* Sarah Conway

=== PostgreSQL ===

* James Chanco
* Alan Youngblood
* Pavan Agrawal
* Evan Macbeth
* Amit Kumar Jaiswal
* Lætitia Avrot
* Tahir Ramzan
* Jeremy Schneider
* Darren Douglas
* Peter Eisentraut

=== pg_partman ===

* Keith Fiske

=== pgJDBC ===

* Dave Cramer

=== PL/R ===

* Dave Cramer

=== PostGIS ===

* Martin Davis

=== pgBackRest ===

* David Steele
* Cynthia Shang

== Documentation Project Ideas ==

=== General Changes ===

Suggested by Oleg Bartunov.

Currently, the documentation is written primarily as a reference manual. It should be updated, section by section, to read more like a user manual, supplemented by architectural diagrams and visual references, which would need to be created.

=== Introductory Resources ===

Suggested by Oleg Bartunov.

We currently don't have a community resource available that's suitable for absolute beginners to PostgreSQL. We want to ensure that new users of PostgreSQL have a positive experience from the start, and this begins with proper documentation that is simple and approachable. Existing external resources could be adapted for this purpose, including the following, which was suggested by Oleg:

https://edu.postgrespro.ru/introbook_v4_en.pdf

=== Introductory Tutorial ===

Suggested by Pavan Agrawal, James Chanco.

The introductory tutorial in the documentation (https://www.postgresql.org/docs/current/tutorial-install.html) is not ideal for absolute beginners or those new to databases. It is extremely jargon-heavy and not beginner-friendly. It should either be updated or expanded to better suit this purpose.

Additionally, there are several sections in the tutorial where prior knowledge is assumed, such as how to connect with psql. These sections are currently unclear, and should be rewritten for clarity and accuracy.

The original task from Google Code-in 2018:

The PostgreSQL documentation has a set of existing tutorials, one of which is "Getting Started". Attempt to follow this tutorial to install PostgreSQL, create a PostgreSQL database and then access that database. Make notes of issues encountered while following the tutorial, areas where the tutorial lacks the specific information needed to accomplish the task, places where the tutorial doesn't explain how to tell whether a given step was successful, and other items which could be improved.

Using these notes, make changes to the PostgreSQL tutorial in its source format to address the deficiencies, and submit these changes as patches to the PostgreSQL source tree so that they can be included in PostgreSQL in the future. The changes must apply cleanly to the PostgreSQL source code, and the result must build using the PostgreSQL build tool-chain.

=== Document the patch submitting process ===

PostgreSQL development revolves around CommitFest periods, but the process of submitting patches is tedious.

The documented steps for submitting and reviewing a patch need to be reviewed, and the documentation expanded with useful details and links.

More information about patch submission and review: https://wiki.postgresql.org/wiki/Reviewing_a_Patch
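
As a starting point, the mechanical steps are roughly as follows (a sketch; the branch and file names are illustrative, and patches are reviewed on the pgsql-hackers mailing list):

 git clone https://git.postgresql.org/git/postgresql.git
 cd postgresql
 git checkout -b my-docs-fix master
 # ...edit the sources, then commit your changes...
 git format-patch master                  # writes 0001-my-docs-fix.patch
 # Attach the patch file to an email to pgsql-hackers@lists.postgresql.org,
 # then register the thread in the CommitFest app: https://commitfest.postgresql.org/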

=== Document the Extension submitting process ===

PostgreSQL is known for being very extensible. Currently the extension documentation is sparse:

https://www.postgresql.org/docs/devel/static/external-extensions.html

The task is to extend this page with more information about developing extensions, useful resources, and pointers to the PostgreSQL Extension Network (PGXN).
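
For context, a minimal out-of-tree extension built with the standard PGXS infrastructure looks roughly like this (a sketch; the extension name "my_ext" and its function are hypothetical):

 # my_ext.control: metadata read by CREATE EXTENSION
 cat > my_ext.control <<'EOF'
 comment = 'example extension'
 default_version = '1.0'
 relocatable = true
 EOF
 
 # my_ext--1.0.sql: the objects the extension creates
 cat > my_ext--1.0.sql <<'EOF'
 CREATE FUNCTION my_add(int, int) RETURNS int
     AS 'SELECT $1 + $2' LANGUAGE SQL IMMUTABLE;
 EOF
 
 # Makefile: delegate the build to PGXS
 cat > Makefile <<'EOF'
 EXTENSION = my_ext
 DATA = my_ext--1.0.sql
 PG_CONFIG = pg_config
 PGXS := $(shell $(PG_CONFIG) --pgxs)
 include $(PGXS)
 EOF
 
 make install
 psql -c 'CREATE EXTENSION my_ext;'
 psql -c 'SELECT my_add(1, 2);'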

=== Compose cheat sheets for common tools/operations in PostgreSQL ===

Compose various cheat sheets for common tools/operations in PostgreSQL, like Julia Evans did for Linux commands (https://jvns.ca/blog/2017/12/27/a-perf-cheat-sheet/). A few sample entries are sketched after this topic list:

* Installing
* Managing clusters
* Basic settings
* Controlling a PostgreSQL server (start, stop, restart, ...)
* Security checks
* Performance checks
* Upgrading
* The psql tool
* The pg_dump tool
* The pg_restore tool
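
A cheat sheet entry might look like this (a sketch; the data directory path and database names are illustrative):

 pg_ctl -D /usr/local/pgsql/data start      # start a server
 pg_ctl -D /usr/local/pgsql/data restart    # restart after a settings change
 psql -d mydb -c '\dt'                      # list tables in a database
 pg_dump -Fc mydb > mydb.dump               # custom-format dump of one database
 pg_restore -d mydb_copy mydb.dump          # restore the dump into another database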

=== Upgrade pg_backrest tutorial with Postgres 11 ===

The pgBackRest tutorial was written for Postgres 9.4 (https://pgbackrest.org/user-guide.html). It should be updated for Postgres 11.

Steps needed (the core pgBackRest commands are sketched after this list):

* Understand the user guide
* Test it with a Postgres 11 installation
* Update the tutorial for Postgres 11
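
The commands the tutorial exercises are roughly these (a sketch, assuming a stanza named "demo" has already been configured in /etc/pgbackrest.conf):

 sudo -u postgres pgbackrest --stanza=demo stanza-create
 sudo -u postgres pgbackrest --stanza=demo check
 sudo -u postgres pgbackrest --stanza=demo --type=full backup
 sudo -u postgres pgbackrest --stanza=demo info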

=== Compose a cheat sheet for backing up and restoring a PostgreSQL instance using X backup tool (11 tasks) ===

The purpose of this task is to guide a user through backing up and restoring with different tools, including their pros and cons. Each tool corresponds to one task:

* Amanda: http://www.amanda.org/
* Bacula: https://www.bacula.org/
* Backup and Recovery Manager for PostgreSQL: https://www.2ndquadrant.com/
* Barman: https://www.pgbarman.org/
* Handy Backup: https://www.handybackup.net/
* Iperius Backup: https://www.iperiusbackup.com/
* NetVault Backup: https://www.quest.com/products/netvault-backup/
* pgBackRest: https://pgbackrest.org/
* PostgreSQL backups: https://www.pgbarman.org/
* Simpana: https://www.commvault.com/
* Spectrum Protect: https://www.ibm.com/us-en/marketplace/data-protection-and-recovery

Manuals: https://severalnines.com/blog/top-backup-tools-postgresql

=== Write a PostgreSQL technical mumbo-jumbo dictionary ===

The main goal of this task is to create an HTML website/page that explains technical terms used either in the database world generally or in the Postgres world specifically. Examples of words you could find in that dictionary:

* cluster
* vacuum
* WAL
* query

=== Document how to migrate a trigger-based partitioned table to a natively partitioned table (pg_partman) ===

PostgreSQL 10 introduced native partitioning, but previous versions offered a method of partitioning that used a combination of triggers, constraints and inheritance.

https://www.postgresql.org/docs/9.6/static/ddl-partitioning.html

Document a method that can be used to migrate to the new native partitioning scheme with minimal downtime for the end users of the table.

https://www.postgresql.org/docs/10/static/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE

The main objective of this task is to provide a working example. If there is any difficulty working out the commands or methods to do this, those answers can be provided. Documentation should be provided in Markdown format.

https://github.com/pgpartman/pg_partman
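
The core of such a migration might look like the following sketch (table and column names are hypothetical; each attached child's existing CHECK constraint lets PostgreSQL skip a validation scan on attach):

 psql -d mydb <<'SQL'
 BEGIN;
 -- Create a new natively partitioned parent alongside the old trigger-based one
 CREATE TABLE measurements_new (LIKE measurements) PARTITION BY RANGE (logdate);
 -- Detach a child from the old inheritance tree and attach it natively
 ALTER TABLE measurements_y2018 NO INHERIT measurements;
 ALTER TABLE measurements_new ATTACH PARTITION measurements_y2018
     FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');
 -- ...repeat for the remaining children, then swap the table names...
 ALTER TABLE measurements RENAME TO measurements_old;
 ALTER TABLE measurements_new RENAME TO measurements;
 COMMIT;
 SQL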

=== Document a working example of how to migrate a table to native partitioning in PG11 with minimal downtime (pg_partman) ===

The new DEFAULT partition feature in PG11 should make migrating an existing table to a natively partitioned one a lot easier. However, a new child table cannot be attached if its constraints would match data that already exists in the DEFAULT partition. Using PostgreSQL's transaction system, document a method to move the data and attach a child table with minimal downtime for users of the table in question.

The main objective of this task is to provide a working example. If there is any difficulty working out the commands or methods to do this, those answers can be provided. Documentation should be provided in Markdown format.

https://github.com/pgpartman/pg_partman
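
One possible transactional approach, sketched with hypothetical names: detach the default partition, carve out the new child, move the matching rows, then re-attach the default.

 psql -d mydb <<'SQL'
 BEGIN;
 ALTER TABLE events DETACH PARTITION events_default;
 CREATE TABLE events_y2019 PARTITION OF events
     FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');
 -- Move the rows that now belong in the new child
 INSERT INTO events_y2019
     SELECT * FROM events_default
     WHERE created_at >= '2019-01-01' AND created_at < '2020-01-01';
 DELETE FROM events_default
     WHERE created_at >= '2019-01-01' AND created_at < '2020-01-01';
 ALTER TABLE events ATTACH PARTITION events_default DEFAULT;
 COMMIT;
 SQL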

=== Review and improve the documentation for PG Partition Manager (pg_partman) ===

The pg_partman documentation has grown quite extensive over the years since it was released. This task calls for a basic review of the top-level README and the files contained in the docs folder for grammar and clarity. Any other recommendations to improve the documentation are welcome as well.

Using GitHub, fork the repository and submit documentation improvements back via a pull request.

https://github.com/pgpartman/pg_partman

=== Add documentation for JDBC driver CopyManager API (pgJDBC) ===

The PostgreSQL JDBC driver, pgjdbc, supports COPY commands to both STDIN and STDOUT via a non-JDBC extension, the CopyManager interface. While the driver has supported this functionality for many years, it is not covered in the official docs.

The purpose of this task is to add some basic documentation of how to use the CopyManager interface to the official pgjdbc docs. The test suite for the pgjdbc driver includes examples of how to use the CopyManager interface and would be a good resource for real-world usage.

The expected submission for this task is a pull request with the documentation patch on the pgjdbc GitHub page, https://github.com/pgjdbc/pgjdbc.

=== Add tutorial on logical replication to PostgreSQL Documentation ===

The PostgreSQL documentation has a section for tutorials, but lacks a tutorial which covers logical replication between PostgreSQL systems.

The tutorial should be in the same format as the rest of the PostgreSQL documentation, submitted as a patch to the PostgreSQL source code, and successfully built using the PostgreSQL build tool-chain. The simplest approach is to pull down the PostgreSQL source code, build all of PostgreSQL including the documentation, set up logical replication by following the existing documentation, and then copy an existing tutorial section and rewrite it as the logical replication tutorial.
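
The steps such a tutorial would walk through are roughly these (a sketch; the host, database, and object names are illustrative, and the publisher must run with wal_level = logical):

 # On the publisher:
 psql -d mydb -c "CREATE PUBLICATION my_pub FOR TABLE my_table;"
 
 # On the subscriber (the table definition must already exist there):
 psql -d mydb -c "CREATE SUBSCRIPTION my_sub
     CONNECTION 'host=publisher.example.com dbname=mydb user=repuser'
     PUBLICATION my_pub;"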

=== Add tutorial on physical replication to PostgreSQL Documentation ===

The PostgreSQL documentation has a section for tutorials, but lacks a tutorial which covers physical replication between instances.

The tutorial should be in the same format as the rest of the PostgreSQL documentation, submitted as a patch to the PostgreSQL source code, and successfully built using the PostgreSQL build tool-chain. The simplest approach is to pull down the PostgreSQL source code, build all of PostgreSQL including the documentation, set up physical replication by following the existing documentation, and then copy an existing tutorial section and rewrite it as the physical replication tutorial.
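
The core of a streaming-replication setup is short (a sketch; the host name, replication user, and data directory are illustrative, and the -R flag writes the standby configuration automatically):

 # On the standby host: clone the primary and write standby settings
 pg_basebackup -h primary.example.com -U repuser -D /var/lib/postgresql/standby -R -P
 # Start the standby on the new data directory; it streams WAL from the primary
 pg_ctl -D /var/lib/postgresql/standby start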

=== Add tutorial on partitioning to PostgreSQL Documentation ===

The PostgreSQL documentation has a section for tutorials, but lacks a tutorial which covers partitioning of a PostgreSQL table.

The tutorial should be in the same format as the rest of the PostgreSQL documentation, submitted as a patch to the PostgreSQL source code, and successfully built using the PostgreSQL build tool-chain. The simplest approach is to pull down the PostgreSQL source code, build all of PostgreSQL including the documentation, set up partitioning by following the existing documentation, and then copy an existing tutorial section and rewrite it as the partitioning tutorial.
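
A tutorial of this kind would likely begin with a minimal declarative-partitioning example such as this sketch (table and column names are hypothetical):

 psql -d mydb <<'SQL'
 CREATE TABLE measurement (
     city_id  int  NOT NULL,
     logdate  date NOT NULL,
     peaktemp int
 ) PARTITION BY RANGE (logdate);
 
 CREATE TABLE measurement_y2019 PARTITION OF measurement
     FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');
 
 -- Rows are routed to the matching partition automatically
 INSERT INTO measurement VALUES (1, '2019-06-01', 35);
 SQL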

=== Review and improve the "Advanced Features" tutorial in the PostgreSQL Documentation ===

The PostgreSQL documentation has a set of existing tutorials, one of which is "Advanced Features". After installing PostgreSQL, creating a database, and learning some SQL, attempt to follow this tutorial to work with some of the advanced features of PostgreSQL, including creating views, foreign keys, transactions, window functions, and inheritance. Make notes of issues encountered while following the tutorial, areas where the tutorial lacks the specific information needed to accomplish the task, places where the tutorial doesn't explain how to tell whether a given step was successful, and other items which could be improved.

Using these notes, make changes to the PostgreSQL tutorial in its source format to address the deficiencies, and submit these changes as patches to the PostgreSQL source tree so that they can be included in PostgreSQL in the future. The changes must apply cleanly to the PostgreSQL source code, and the result must build using the PostgreSQL build tool-chain.

=== Review and improve the "SQL Language" tutorial in the PostgreSQL Documentation ===

The PostgreSQL documentation has a set of existing tutorials, one of which is "SQL Language". After installing PostgreSQL and creating a database, attempt to follow this tutorial to use and learn SQL with PostgreSQL. Make notes of issues encountered while following the tutorial, areas where the tutorial lacks the specific information needed to accomplish the task, places where the tutorial doesn't explain how to tell whether a given step was successful, and other items which could be improved.

Using these notes, make changes to the PostgreSQL tutorial in its source format to address the deficiencies, and submit these changes as patches to the PostgreSQL source tree so that they can be included in PostgreSQL in the future. The changes must apply cleanly to the PostgreSQL source code, and the result must build using the PostgreSQL build tool-chain.

=== Improve VACUUM Progress Reporting documentation ===

Suggested by Jeremy Schneider and Jim Nasby.

The docs on VACUUM Progress Reporting list the phases of vacuum, but they don't mention that some of these phases can be repeated, why those phases would be repeated, or what factors (config settings, attributes of the data, num_dead_tuples/max_dead_tuples) influence this behavior. These topics are important for people administering PostgreSQL.

The documentation should also discuss the order in which PostgreSQL chooses tables to vacuum. It would be good to at least mention what an "aggressive" vacuum is and what it looks like (this may already be covered elsewhere), and to discuss block skipping, the freeze map, and the visibility map as they relate to vacuum.

It needs to be determined what information goes in the main vacuum documentation and what goes in the pg_stat_progress_vacuum docs, staying in line with existing conventions for how the docs are organized.
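
For reference, the progress view in question can be watched from a second session while a VACUUM runs (a sketch; the view exists in PostgreSQL 9.6 and later, and the database name is illustrative):

 psql -d mydb -c "SELECT pid, relid::regclass, phase,
                         heap_blks_scanned, heap_blks_total,
                         num_dead_tuples, max_dead_tuples
                  FROM pg_stat_progress_vacuum;"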

=== Add more images ===

Suggested by Peter Eisentraut.

As of PostgreSQL 12, the PostgreSQL documentation can contain images. Right now, we have a total of two images. Find places where images would be sensible and add them. This wouldn't necessarily require great artistry, only a sense of where some artistry would be useful.

"SQLite has a bubble generator tool that they use to generate syntax diagrams for their documentation": https://www.postgresql.org/message-id/flat/20190328222500.GA16520%40alvherre.pgsql#c1c100db6b90697674e6fa88573bc9a5

=== Web site hierarchy and navigation ===

Suggested by Peter Eisentraut.

Some people have difficulty finding things on the web site. For example, there are too many top-level categories: are mailing lists under "Community", "Developers", or "Support"? Find a more sensible and easier hierarchy. Note: The design of the web site would not be part of this project, only the hierarchy and relationships of the pages.
<hr />
<div>This page is for collecting ideas for future Season of Docs projects.<br />
<br />
== Regarding Project Ideas ==<br />
<br />
Project ideas are to be added here by community members.<br />
<br />
== Mentors (2019) ==<br />
<br />
The following individuals have been listed as possible mentors on the below projects, and/or offered to be mentors for technical writer-proposed projects:<br />
<br />
=== Organization Administrators ===<br />
<br />
* Stephen Frost<br />
* Sarah Conway<br />
<br />
=== PostgreSQL ===<br />
<br />
* James Chanco<br />
* Alan Youngblood<br />
* Pavan Agrawal<br />
* Evan Macbeth<br />
* Amit Kumar Jaiswal<br />
* Lætitia Avrot<br />
* Tahir Ramzan<br />
* Jeremy Schneider<br />
* Darren Douglas<br />
* Peter Eisentraut<br />
<br />
=== pg_partman ===<br />
<br />
* Keith Fiske<br />
<br />
=== pgJDBC ===<br />
<br />
* Dave Cramer<br />
<br />
=== PL/R ===<br />
<br />
* Dave Cramer<br />
<br />
=== PostGIS ===<br />
<br />
* Martin Davis<br />
<br />
=== pgBackRest ===<br />
<br />
* David Steele<br />
* Cynthia Shang<br />
<br />
== Documentation Project Ideas ==<br />
<br />
=== General Changes ===<br />
<br />
Suggested by Oleg Bartunov. <br />
<br />
Currently, our documentation follows the format of primarily being in a reference manual format. This should be updated, section-by-section, to be more of a user manual format. This should be supplemented by architectural diagrams and visual references, which would need to be created.<br />
<br />
=== Introductory Resources ===<br />
<br />
Suggested by Oleg Bartunov. <br />
<br />
We currently don't have a community resource available that's suitable for absolute beginners for PostgreSQL. We want to ensure that new users of PostgreSQL have a positive experience from the start, and this begins with proper documentation that is simple and approachable. Existing external resources could be adapted for this purpose, including the following that was suggested by Oleg:<br />
<br />
https://edu.postgrespro.ru/introbook_v4_en.pdf<br />
<br />
=== Introductory Tutorial ===<br />
<br />
Suggested by Pavan Agrawal, James Chanco.<br />
<br />
The introductory tutorial available (https://www.postgresql.org/docs/current/tutorial-install.html) in the documentation is not ideal for absolute beginners or those new to databases. It's extremely jargon-heavy and not beginner friendly. This should either be updated or expanded to better suit this purpose.<br />
<br />
Additionally, there are several sections in the tutorial where prior knowledge is assumed, such as how to connect to psql. These sections are currently unclear, and should be rewritten for clarity and accuracy.<br />
<br />
The original task from Google Code-in 2018:<br />
<br />
The PostgreSQL documentation has a set of existing tutorials, one of which is "Getting Started". Attempt to follow this tutorial to install PostgreSQL, create a PostgreSQL database and then access that database. Make notes of issues trying to follow the tutorial, areas where the tutorial lacks specific information to be able to accomplish the task, places where the tutorial doesn't provide information about how to tell if a given step was successful or not, and other items which could be improved.<br />
<br />
Using these notes, make changes to the PostgreSQL tutorial in its source format to address the deficiencies and submit these changes as patches to the PostgreSQL source tree so that they can be included in PostgreSQL in the future. These changes must be able to be patched to the PostgreSQL source code and the resulting changes built using the PostgreSQL build tool-chain.<br />
<br />
=== Document the patch submitting process ===<br />
<br />
PostgreSQL development revolves around CommitFest periods, but the process of submitting patches is tedious.<br />
<br />
The documented steps for submitting and reviewing a patch need to be reviewed, and the documentation expanded with useful details and links.<br />
<br />
More information about patch submission and review: https://wiki.postgresql.org/wiki/Reviewing_a_Patch<br />
<br />
=== Document the Extension submitting process ===<br />
<br />
PostgreSQL Database is known to be very extensible. Currently the Extension documentation is sparse:<br />
<br />
https://www.postgresql.org/docs/devel/static/external-extensions.html<br />
<br />
The task is to extend this page with more information about developing extensions, add useful resources, and pointers to the Extension Network (PGXN).<br />
<br />
=== Compose cheat sheets for common tools/operations in PostgreSQL === <br />
<br />
Compose various cheat sheets for common tools/operations in PostgreSQL like Julia Evans did for Linux commands (https://jvns.ca/blog/2017/12/27/a-perf-cheat-sheet/).<br />
<br />
Installing<br />
Managing clusters<br />
Basic settings<br />
Controlling a PostgreSQL server(start, stop, restart...)<br />
Security checks<br />
Performances checks<br />
Upgrading<br />
psql tool<br />
pg_dump tool<br />
pg_restore tool<br />
<br />
=== Upgrade pg_backrest tutorial with Postgres 11 === <br />
<br />
The pgbackrest tutorial was written for postgres 9.4 (https://pgbackrest.org/user-guide.html). Maybe it's time to upgrade it to postgres 11.<br />
<br />
Step needed:<br />
<br />
Understand the user guide<br />
Test it with a postgres 11 version<br />
Upgrade the tutorial for postgres 11<br />
<br />
=== Compose a cheat sheet for backing up and restoring a PostgreSQL instance using X backup tool (11 tasks) === <br />
<br />
The purpose of this task is to guide a user about how to backup and restore using different tools with their pros and cons. Each tool is equal to one task:<br />
<br />
Amanda:<br />
http://www.amanda.org/<br />
<br />
Bacula:<br />
https://www.bacula.org/<br />
<br />
Backup and Recovery Manager for PostgreSQL:<br />
https://www.2ndquadrant.com/<br />
<br />
Barman<br />
https://www.pgbarman.org/<br />
<br />
Handy Backup:<br />
https://www.handybackup.net/<br />
<br />
Iperius Backup:<br />
https://www.iperiusbackup.com/<br />
<br />
NetVault Backup:<br />
https://www.quest.com/products/netvault-backup/<br />
<br />
pgBackRest:<br />
https://pgbackrest.org/<br />
<br />
PostgreSQL backups:<br />
https://www.pgbarman.org/<br />
<br />
Simpana:<br />
https://www.commvault.com/<br />
<br />
Spectrum Protect:<br />
https://www.ibm.com/us-en/marketplace/data-protection-and-recovery<br />
<br />
Manuals:<br />
https://severalnines.com/blog/top-backup-tools-postgresql<br />
<br />
=== Write a PostgreSQL technical mumbo-jumbo dictionary === <br />
<br />
The main goal of this task is to create a HTML website/page to explain some technical words used either in the database world or specifically in Postgres world. Examples of words you could find in that dictionary :<br />
<br />
- cluster<br />
<br />
- vacuum<br />
<br />
- WAL<br />
<br />
- query<br />
<br />
=== Document how to migrate a trigger-based partitioned table to a natively partitioned table (pg_partman) === <br />
<br />
PostgreSQL 10 introduced native partitioning, but previous versions had a method of partitioning that used a combination of triggers, constraints & inheritance.<br />
<br />
https://www.postgresql.org/docs/9.6/static/ddl-partitioning.html<br />
<br />
Document a method that can be used to migrate to the new native partitioning scheme with minimal downtime to the end users of that table.<br />
<br />
https://www.postgresql.org/docs/10/static/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE<br />
<br />
The main objective of this task is to provide the working example. If there is any difficulty working out the commands or methods to do this, those answers can be provided. Documentation should be provide in markdown format.<br />
<br />
https://github.com/pgpartman/pg_partman<br />
<br />
=== Document a working example of how to migrate a table to native partitioning in PG11 with minimal downtime (pg_partman) === <br />
<br />
The new DEFAULT partition feature in PG11 should make migrating an existing table to a natively partitioned one a lot easier. However, a new child table with constraints that would match data that exists in that DEFAULT table are not allowed to be added. Using PostgreSQL's transaction system, document a method to move the data and attached a child table with minimal downtime to users of the table in question.<br />
<br />
The main objective of this task is to provide the working example. If there is any difficulty working out the commands or methods to do this, those answers can be provided. Documentation should be provide in markdown format.<br />
<br />
https://github.com/pgpartman/pg_partman<br />
<br />
=== Review and improve the documentation for PG Partition Manager (pg_partman) === <br />
<br />
The pg_partman documentation has grown quite extensive over the years since it was released. Looking for a basic review of the top level README and files contained in the docs folder for grammar and clarity. Any other recommendations to improve documentation are welcome as well.<br />
<br />
Using Github, fork the repository and submit documentation improvements back via a pull request.<br />
<br />
https://github.com/pgpartman/pg_partman<br />
<br />
=== Add documentation for JDBC driver CopyManager API (pgJDBC) === <br />
<br />
The PostgreSQL JDBC driver, pgjdbc, supports COPY commands to both STDIN and STDOUT via a non-JDBC extension, the CopyManager interface. While the driver has supported this functionality for many years, it's not documented in the official docs.<br />
<br />
The purpose of this task is to add some basic documentation of how to use the CopyManager interface to the official pgjdbc docs. The test suite for the pgjdbc driver includes examples of how to use the CopyManager interface and would a good resource for real world usage.<br />
<br />
The expected submission for this task will be a pull request on the pgjdbc GitHub page, https://github.com/pgjdbc/pgjdbc, for the documentation patch.<br />
<br />
=== Add tutorial on logical Replication to PostgreSQL Documentation === <br />
<br />
The PostgreSQL documentation has a section for tutorials, but lacks a tutorial which covers logical replication between PostgreSQL systems.<br />
<br />
The tutorial should be in the same format as the PostgreSQL documentation and should be submitted as a patch to the PostgreSQL source code and successfully built using the PostgreSQL build tool-chain. The simplest approach to writing this documentation is to pull down the PostgreSQL source code, build all of PostgreSQL, including the documentation, then work with PostgreSQL to set up logical replication following the existing documentation and ultimately make a copy of an existing tutorial section and then rewrite it to be the logical replication tutorial.<br />
<br />
=== Add tutorial on physical Replication to PostgreSQL Documentation === <br />
<br />
The PostgreSQL documentation has a section for tutorials, but lacks a tutorial which covers physical replication between instances.<br />
<br />
The tutorial should be in the same format as the PostgreSQL documentation and should be submitted as a patch to the PostgreSQL source code and successfully built using the PostgreSQL build tool-chain. The simplest approach to writing this documentation is to pull down the PostgreSQL source code, build all of PostgreSQL, including the documentation, then work with PostgreSQL to set up physical replication following the existing documentation and ultimately make a copy of an existing tutorial section and then rewrite it to be the physical replication tutorial.<br />
<br />
=== Add tutorial on partitioning to PostgreSQL Documentation === <br />
<br />
The PostgreSQL documentation has a section for tutorials, but lacks a tutorial which covers partitioning of a PostgreSQL table.<br />
<br />
The tutorial should be in the same format as the PostgreSQL documentation and should be submitted as a patch to the PostgreSQL source code and successfully built using the PostgreSQL build tool-chain. The simplest approach to writing this documentation is to pull down the PostgreSQL source code, build all of PostgreSQL, including the documentation, then work with PostgreSQL to set up partitioning following the existing documentation and ultimately make a copy of an existing tutorial section and then rewrite it to be the partitioning tutorial.<br />
<br />
=== Review and improve the "Advanced Features" tutorial in the PostgreSQL Documentation === <br />
<br />
The PostgreSQL documentation has a set of existing tutorials, one of which is "Advanced Features". Attempt to follow this tutorial, after installing PostgreSQL and creating a database and learning some SQL, to work with some of the Advanced Features of PostgreSQL including creating views, foreign keys, transactions, window functions, and inheritance. Make notes of issues trying to follow the tutorial, areas where the tutorial lacks specific information to be able to accomplish the task, places where the tutorial doesn't provide information about how to tell if a given step was successful or not, and other items which could be improved.<br />
<br />
Using these notes, make changes to the PostgreSQL tutorial in its source format to address the deficiencies and submit these changes as patches to the PostgreSQL source tree so that they can be included in PostgreSQL in the future. These changes must be able to be patched to the PostgreSQL source code and the resulting changes built using the PostgreSQL build tool-chain.<br />
<br />
=== Review and improve the "SQL Language" tutorial in the PostgreSQL Documentation === <br />
<br />
The PostgreSQL documentation has a set of existing tutorials, one of which is "SQL Language". Attempt to follow this tutorial, after installing PostgreSQL and creating a database, to use and learn SQL with PostgreSQL. Make notes of issues trying to follow the tutorial, areas where the tutorial lacks specific information to be able to accomplish the task, places where the tutorial doesn't provide information about how to tell if a given step was successful or not, and other items which could be improved.<br />
<br />
Using these notes, make changes to the PostgreSQL tutorial in its source format to address the deficiencies and submit these changes as patches to the PostgreSQL source tree so that they can be included in PostgreSQL in the future. These changes must be able to be patched to the PostgreSQL source code and the resulting changes built using the PostgreSQL build tool-chain.<br />
<br />
=== Improve VACUUM Progress Reporting documentation ===<br />
<br />
Suggested by Jeremy Schneider and Jim Nasby<br />
<br />
The docs on VACUUM Progress Reporting list the phases of vacuum, but they don't mention that some of these phases can be repeated - and why those phases would be repeated, and what factors (config settings, attributes of data, num_dead_tuples/max_dead_tuples) influence this behavior. These topics are important for people administrating PostgreSQL.<br />
<br />
Should discuss in what order PostgreSQL chooses tables to vacuum - and would be good to at least mention what an "aggressive" vacuum is and what it looks like (I need to check if this is covered elsewhere), and discuss block skipping/freeze map/visibility map as they relate to vacuum.<br />
<br />
Need to determine what information goes in the main vacuum documentation and what goes in the pg_stat_progress_vacuum docs, staying in line with existing conventions for how docs are organized.</div>Jerhttps://wiki.postgresql.org/index.php?title=GSoD_2019&diff=33180GSoD 20192019-03-26T18:50:45Z<p>Jer: /* Improve VACUUM Progress Reporting documentation */</p>
<hr />
<div>This page is for collecting ideas for future Season of Docs projects.<br />
<br />
== Regarding Project Ideas ==<br />
<br />
Project ideas are to be added here by community members.<br />
<br />
== Mentors (2019) ==<br />
<br />
The following individuals have been listed as possible mentors on the below projects, and/or offered to be mentors for technical writer-proposed projects:<br />
<br />
=== Organization Administrators ===<br />
<br />
* Stephen Frost<br />
* Sarah Conway<br />
<br />
=== PostgreSQL ===<br />
<br />
* James Chanco<br />
* Alan Youngblood<br />
* Pavan Agrawal<br />
* Evan Macbeth<br />
* Amit Kumar Jaiswal<br />
* Lætitia Avrot<br />
* Tahir Ramzan<br />
* Jeremy Schneider<br />
* Darren Douglas<br />
* Peter Eisentraut<br />
<br />
=== pg_partman ===<br />
<br />
* Keith Fiske<br />
<br />
=== pgJDBC ===<br />
<br />
* Dave Cramer<br />
<br />
=== PL/R ===<br />
<br />
* Dave Cramer<br />
<br />
=== PostGIS ===<br />
<br />
* Martin Davis<br />
<br />
=== pgBackRest ===<br />
<br />
* David Steele<br />
* Cynthia Shang<br />
<br />
== Documentation Project Ideas ==<br />
<br />
=== General Changes ===<br />
<br />
Suggested by Oleg Bartunov. <br />
<br />
Currently, our documentation follows the format of primarily being in a reference manual format. This should be updated, section-by-section, to be more of a user manual format. This should be supplemented by architectural diagrams and visual references, which would need to be created.<br />
<br />
=== Introductory Resources ===<br />
<br />
Suggested by Oleg Bartunov. <br />
<br />
We currently don't have a community resource available that's suitable for absolute beginners for PostgreSQL. We want to ensure that new users of PostgreSQL have a positive experience from the start, and this begins with proper documentation that is simple and approachable. Existing external resources could be adapted for this purpose, including the following that was suggested by Oleg:<br />
<br />
https://edu.postgrespro.ru/introbook_v4_en.pdf<br />
<br />
=== Introductory Tutorial ===<br />
<br />
Suggested by Pavan Agrawal, James Chanco.<br />
<br />
The introductory tutorial available (https://www.postgresql.org/docs/current/tutorial-install.html) in the documentation is not ideal for absolute beginners or those new to databases. It's extremely jargon-heavy and not beginner friendly. This should either be updated or expanded to better suit this purpose.<br />
<br />
Additionally, there are several sections in the tutorial where prior knowledge is assumed, such as how to connect to psql. These sections are currently unclear, and should be rewritten for clarity and accuracy.<br />
<br />
The original task from Google Code-in 2018:<br />
<br />
The PostgreSQL documentation has a set of existing tutorials, one of which is "Getting Started". Attempt to follow this tutorial to install PostgreSQL, create a PostgreSQL database and then access that database. Make notes of issues trying to follow the tutorial, areas where the tutorial lacks specific information to be able to accomplish the task, places where the tutorial doesn't provide information about how to tell if a given step was successful or not, and other items which could be improved.<br />
<br />
Using these notes, make changes to the PostgreSQL tutorial in its source format to address the deficiencies and submit these changes as patches to the PostgreSQL source tree so that they can be included in PostgreSQL in the future. These changes must be able to be patched to the PostgreSQL source code and the resulting changes built using the PostgreSQL build tool-chain.<br />
<br />
=== Document the patch submitting process ===<br />
<br />
PostgreSQL development revolves around CommitFest periods, but the process of submitting patches is tedious.<br />
<br />
The documented steps for submitting and reviewing a patch need to be reviewed, and the documentation expanded with useful details and links.<br />
<br />
More information about patch submission and review: https://wiki.postgresql.org/wiki/Reviewing_a_Patch<br />
<br />
=== Document the Extension submitting process ===<br />
<br />
PostgreSQL Database is known to be very extensible. Currently the Extension documentation is sparse:<br />
<br />
https://www.postgresql.org/docs/devel/static/external-extensions.html<br />
<br />
The task is to extend this page with more information about developing extensions, add useful resources, and pointers to the Extension Network (PGXN).<br />
<br />
=== Compose cheat sheets for common tools/operations in PostgreSQL === <br />
<br />
Compose various cheat sheets for common tools/operations in PostgreSQL like Julia Evans did for Linux commands (https://jvns.ca/blog/2017/12/27/a-perf-cheat-sheet/).<br />
<br />
Installing<br />
Managing clusters<br />
Basic settings<br />
Controlling a PostgreSQL server(start, stop, restart...)<br />
Security checks<br />
Performances checks<br />
Upgrading<br />
psql tool<br />
pg_dump tool<br />
pg_restore tool<br />
<br />
=== Upgrade pg_backrest tutorial with Postgres 11 === <br />
<br />
The pgbackrest tutorial was written for postgres 9.4 (https://pgbackrest.org/user-guide.html). Maybe it's time to upgrade it to postgres 11.<br />
<br />
Step needed:<br />
<br />
Understand the user guide<br />
Test it with a postgres 11 version<br />
Upgrade the tutorial for postgres 11<br />
<br />
=== Compose a cheat sheet for backing up and restoring a PostgreSQL instance using X backup tool (11 tasks) === <br />
<br />
The purpose of this task is to guide a user about how to backup and restore using different tools with their pros and cons. Each tool is equal to one task:<br />
<br />
Amanda:<br />
http://www.amanda.org/<br />
<br />
Bacula:<br />
https://www.bacula.org/<br />
<br />
Backup and Recovery Manager for PostgreSQL:<br />
https://www.2ndquadrant.com/<br />
<br />
Barman<br />
https://www.pgbarman.org/<br />
<br />
Handy Backup:<br />
https://www.handybackup.net/<br />
<br />
Iperius Backup:<br />
https://www.iperiusbackup.com/<br />
<br />
NetVault Backup:<br />
https://www.quest.com/products/netvault-backup/<br />
<br />
pgBackRest:<br />
https://pgbackrest.org/<br />
<br />
PostgreSQL backups:<br />
https://www.pgbarman.org/<br />
<br />
Simpana:<br />
https://www.commvault.com/<br />
<br />
Spectrum Protect:<br />
https://www.ibm.com/us-en/marketplace/data-protection-and-recovery<br />
<br />
Manuals:<br />
https://severalnines.com/blog/top-backup-tools-postgresql<br />
<br />
=== Write a PostgreSQL technical mumbo-jumbo dictionary === <br />
<br />
The main goal of this task is to create a HTML website/page to explain some technical words used either in the database world or specifically in Postgres world. Examples of words you could find in that dictionary :<br />
<br />
- cluster<br />
<br />
- vacuum<br />
<br />
- WAL<br />
<br />
- query<br />
<br />
=== Document how to migrate a trigger-based partitioned table to a natively partitioned table (pg_partman) === <br />
<br />
PostgreSQL 10 introduced native partitioning, but previous versions had a method of partitioning that used a combination of triggers, constraints & inheritance.<br />
<br />
https://www.postgresql.org/docs/9.6/static/ddl-partitioning.html<br />
<br />
Document a method that can be used to migrate to the new native partitioning scheme with minimal downtime to the end users of that table.<br />
<br />
https://www.postgresql.org/docs/10/static/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE<br />
<br />
The main objective of this task is to provide the working example. If there is any difficulty working out the commands or methods to do this, those answers can be provided. Documentation should be provide in markdown format.<br />
<br />
https://github.com/pgpartman/pg_partman<br />
<br />
=== Document a working example of how to migrate a table to native partitioning in PG11 with minimal downtime (pg_partman) === <br />
<br />
The new DEFAULT partition feature in PG11 should make migrating an existing table to a natively partitioned one a lot easier. However, a new child table with constraints that would match data that exists in that DEFAULT table are not allowed to be added. Using PostgreSQL's transaction system, document a method to move the data and attached a child table with minimal downtime to users of the table in question.<br />
<br />
The main objective of this task is to provide the working example. If there is any difficulty working out the commands or methods to do this, those answers can be provided. Documentation should be provide in markdown format.<br />
<br />
https://github.com/pgpartman/pg_partman<br />
<br />
=== Review and improve the documentation for PG Partition Manager (pg_partman) === <br />
<br />
The pg_partman documentation has grown quite extensive over the years since it was released. Looking for a basic review of the top level README and files contained in the docs folder for grammar and clarity. Any other recommendations to improve documentation are welcome as well.<br />
<br />
Using Github, fork the repository and submit documentation improvements back via a pull request.<br />
<br />
https://github.com/pgpartman/pg_partman<br />
<br />
=== Add documentation for JDBC driver CopyManager API (pgJDBC) === <br />
<br />
The PostgreSQL JDBC driver, pgjdbc, supports COPY commands to both STDIN and STDOUT via a non-JDBC extension, the CopyManager interface. While the driver has supported this functionality for many years, it's not documented in the official docs.<br />
<br />
The purpose of this task is to add some basic documentation of how to use the CopyManager interface to the official pgjdbc docs. The test suite for the pgjdbc driver includes examples of how to use the CopyManager interface and would a good resource for real world usage.<br />
<br />
The expected submission for this task will be a pull request on the pgjdbc GitHub page, https://github.com/pgjdbc/pgjdbc, for the documentation patch.<br />
<br />
=== Add tutorial on logical Replication to PostgreSQL Documentation === <br />
<br />
The PostgreSQL documentation has a section for tutorials, but lacks a tutorial which covers logical replication between PostgreSQL systems.<br />
<br />
The tutorial should be in the same format as the PostgreSQL documentation and should be submitted as a patch to the PostgreSQL source code and successfully built using the PostgreSQL build tool-chain. The simplest approach to writing this documentation is to pull down the PostgreSQL source code, build all of PostgreSQL, including the documentation, then work with PostgreSQL to set up logical replication following the existing documentation and ultimately make a copy of an existing tutorial section and then rewrite it to be the logical replication tutorial.<br />
<br />
=== Add tutorial on physical Replication to PostgreSQL Documentation === <br />
<br />
The PostgreSQL documentation has a section for tutorials, but lacks a tutorial which covers physical replication between instances.<br />
<br />
The tutorial should be in the same format as the PostgreSQL documentation and should be submitted as a patch to the PostgreSQL source code and successfully built using the PostgreSQL build tool-chain. The simplest approach to writing this documentation is to pull down the PostgreSQL source code, build all of PostgreSQL, including the documentation, then work with PostgreSQL to set up physical replication following the existing documentation and ultimately make a copy of an existing tutorial section and then rewrite it to be the physical replication tutorial.<br />
<br />
=== Add tutorial on partitioning to PostgreSQL Documentation === <br />
<br />
The PostgreSQL documentation has a section for tutorials, but lacks a tutorial which covers partitioning of a PostgreSQL table.<br />
<br />
The tutorial should be in the same format as the PostgreSQL documentation and should be submitted as a patch to the PostgreSQL source code and successfully built using the PostgreSQL build tool-chain. The simplest approach to writing this documentation is to pull down the PostgreSQL source code, build all of PostgreSQL, including the documentation, then work with PostgreSQL to set up partitioning following the existing documentation and ultimately make a copy of an existing tutorial section and then rewrite it to be the partitioning tutorial.<br />
<br />
=== Review and improve the "Advanced Features" tutorial in the PostgreSQL Documentation === <br />
<br />
The PostgreSQL documentation has a set of existing tutorials, one of which is "Advanced Features". Attempt to follow this tutorial, after installing PostgreSQL and creating a database and learning some SQL, to work with some of the Advanced Features of PostgreSQL including creating views, foreign keys, transactions, window functions, and inheritance. Make notes of issues trying to follow the tutorial, areas where the tutorial lacks specific information to be able to accomplish the task, places where the tutorial doesn't provide information about how to tell if a given step was successful or not, and other items which could be improved.<br />
<br />
Using these notes, make changes to the PostgreSQL tutorial in its source format to address the deficiencies and submit these changes as patches to the PostgreSQL source tree so that they can be included in PostgreSQL in the future. These changes must be able to be patched to the PostgreSQL source code and the resulting changes built using the PostgreSQL build tool-chain.<br />
<br />
=== Review and improve the "SQL Language" tutorial in the PostgreSQL Documentation === <br />
<br />
The PostgreSQL documentation has a set of existing tutorials, one of which is "SQL Language". Attempt to follow this tutorial, after installing PostgreSQL and creating a database, to use and learn SQL with PostgreSQL. Make notes of issues encountered while following the tutorial: areas where it lacks the specific information needed to accomplish a task, places where it doesn't explain how to tell whether a given step succeeded, and other items which could be improved.<br />
<br />
Using these notes, make changes to the PostgreSQL tutorial in its source format to address the deficiencies, and submit these changes as patches to the PostgreSQL source tree so that they can be included in PostgreSQL in the future. These changes must apply cleanly to the PostgreSQL source code, and the result must build using the PostgreSQL build tool-chain.<br />
<br />
=== Improve VACUUM Progress Reporting documentation ===<br />
<br />
Suggested by Jeremy Schneider and Jim Nasby.<br />
<br />
The docs on VACUUM Progress Reporting list the phases of vacuum, but they don't mention that some of these phases can be repeated, why they would be repeated, or what factors (config settings, attributes of the data, num_dead_tuples/max_dead_tuples) influence this behavior. These topics are important for people administering PostgreSQL, and this would be a great place to cover them.<br />
<br />
The documentation should also discuss the order in which PostgreSQL chooses tables to vacuum, mention what an "aggressive" vacuum is and what it looks like (this may already be covered elsewhere), and cover block skipping, the freeze map, and the visibility map as they relate to vacuum.<br />
<br />
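For reference, the progress view this task concerns can be queried like this (a minimal sketch; pg_stat_progress_vacuum only contains rows while a vacuum is actually running):<br />
<br />
 SELECT pid, phase, num_dead_tuples, max_dead_tuples, index_vacuum_count<br />
 FROM pg_stat_progress_vacuum;<br />
<br />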
https://www.postgresql.org/docs/current/progress-reporting.html#VACUUM-PROGRESS-REPORTING</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30975New in postgres 102017-09-29T20:25:19Z<p>Jer: /* Additional FDW Push-Down */ add link</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Bruce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In version 10, partitioning is now an attribute of the table:<br />
<br />
 CREATE TABLE table_name ( ... )<br />
 [ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) } [, ...] ) ]<br />
<br />
 CREATE TABLE table_name<br />
 PARTITION OF parent_table [ ( ... ) ]<br />
 FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
 CONSTRAINT ck_2017 CHECK (fch_creado >= DATE '2017-01-01' AND fch_creado < DATE '2018-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
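To see the routing in action, a minimal sketch (it assumes a valid BOOK_STATUS value, here 'checked_out', and the partitions created above; output is illustrative):<br />
<br />
 libdata=# INSERT INTO book_history<br />
 VALUES (42, 'checked_out', '[2016-09-05,2016-09-19)');<br />
 INSERT 0 1<br />
 libdata=# SELECT count(*) FROM ONLY book_history; -- the parent itself stores no rows<br />
 count<br />
 -------<br />
 0<br />
 libdata=# SELECT count(*) FROM book_history_2016_09; -- the row was routed here<br />
 count<br />
 -------<br />
 1<br />
<br />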
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
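<br />
While such a query is running, the parallel workers themselves are now visible from another session, including the query text they are executing (a sketch using the new backend_type column; what appears depends on timing):<br />
<br />
 accounts=# SELECT pid, backend_type, query FROM pg_stat_activity<br />
 WHERE backend_type = 'parallel worker';<br />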
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
postgres_fdw now pushes joins and aggregate functions down to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
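<br />
Whether the push-down happened for a given query can be checked with EXPLAIN VERBOSE (a sketch; it assumes a postgres_fdw foreign table named ft_sales mapped to a remote table public.sales, and the plan output is abridged and illustrative):<br />
<br />
 libdata=# EXPLAIN (VERBOSE, COSTS OFF)<br />
 SELECT region, sum(amount) FROM ft_sales GROUP BY region;<br />
 Foreign Scan<br />
 Output: region, (sum(amount))<br />
 Relations: Aggregate on (public.ft_sales)<br />
 Remote SQL: SELECT region, sum(amount) FROM public.sales GROUP BY region<br />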
<br />
Links:<br />
* [https://www.depesz.com/2016/10/25/waiting-for-postgresql-10-postgres_fdw-push-down-aggregates-to-remote-servers/ Waiting for PostgreSQL 10 – postgres_fdw: Push down aggregates to remote servers]<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is fast and secure, and it is a perfect mechanism for high availability and disaster recovery needs. However, because it works on the whole instance, it is not possible to replicate only part of the primary server, nor to write on the secondary. Logical replication allows us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that look identical to the tables I'm replicating and have the same names. They can have additional columns and a few other differences. In particular, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that the origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
Links:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced synchronous replication with<br />
<br />
 synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
 synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
 synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST matches the previous behaviour: standbys are prioritized in list order, and the required number of highest-priority standbys must confirm. ANY means that confirmation from any of the listed nodes can satisfy the quorum, which gives extra flexibility to complex replication setups.<br />
<br />
=== Temporary replication slots ===<br />
<br />
Temporary replication slots are automatically dropped at the end of the session that created them. They let a client keep the server from recycling WAL it still needs, without the risk of an abandoned slot retaining WAL indefinitely.<br />
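<br />
A minimal sketch (the slot name is illustrative): passing true for the new third argument, temporary, creates a slot that disappears when the session ends:<br />
<br />
 SELECT pg_create_physical_replication_slot('tmp_slot', true, true);<br />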
<br />
=== Connection Failover and Routing in libpq ===<br />
<br />
Postgres 10 allows applications to define multiple connection points and to specify properties expected from the backend server. This simplifies logic at the application level: it no longer needs to know exactly which node is the primary and which are the standbys. The new parameters can also be controlled by environment variables.<br />
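<br />
A minimal sketch (host names are illustrative): libpq tries node1 and then node2, and target_session_attrs=read-write makes it keep trying hosts until it finds one that accepts writes, i.e. the primary:<br />
<br />
 psql 'postgresql://node1,node2/libdata?target_session_attrs=read-write'<br />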
<br />
Links:<br />
* [http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Postgres 10 highlight - read-write and read-only mode of libpq]<br />
* [http://paquier.xyz/postgresql-2/postgres-10-multi-host-connstr/ Postgres 10 highlight - Multiple hosts in connection strings]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
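<br />
pg_receivewal can now compress the WAL segments it streams, using zlib. A minimal sketch (the target directory is illustrative):<br />
<br />
 pg_receivewal -D /archive/wal --compress=5<br />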
<br />
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes including:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
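<br />
The new backend_type column identifies each kind of process, so a single query now gives a complete picture of activity and waits:<br />
<br />
 SELECT pid, backend_type, wait_event_type, wait_event<br />
 FROM pg_stat_activity;<br />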
<br />
=== Traceable Commit / Status by Transaction-ID ===<br />
<br />
PostgreSQL 10 now supports finding out the status of a recent transaction for recovery after network connection loss or crash without having to use heavyweight two-phase commit. It’s also useful for querying standbys.<br />
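<br />
A minimal sketch (the transaction ID shown is illustrative): note the ID from txid_current() before committing, then query its fate later, even from another connection:<br />
<br />
 libdata=# SELECT txid_current();<br />
  txid_current<br />
 --------------<br />
           612<br />
 <br />
 libdata=# SELECT txid_status(612);<br />
  txid_status<br />
 -------------<br />
  committed<br />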
<br />
Links:<br />
* [https://blog.2ndquadrant.com/postgresql-10-transaction-traceability/ Transaction traceability in PostgreSQL 10 with txid_status(…)]<br />
* [https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
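<br />
A minimal sketch (table and column names are illustrative) of the new SQL-standard syntax, which can replace the serial pseudo-type:<br />
<br />
 CREATE TABLE patrons (<br />
     id   integer GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,<br />
     name text NOT NULL<br />
 );<br />
 INSERT INTO patrons (name) VALUES ('Ada');<br />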
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
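<br />
A minimal sketch (table and function names are illustrative; loans_audit is assumed to have a timestamp column followed by the columns of loans). The REFERENCING clause exposes all rows affected by the statement as a queryable relation:<br />
<br />
 CREATE FUNCTION audit_loans() RETURNS trigger<br />
 LANGUAGE plpgsql AS $$<br />
 BEGIN<br />
     INSERT INTO loans_audit SELECT now(), * FROM new_rows;<br />
     RETURN NULL;<br />
 END $$;<br />
 <br />
 CREATE TRIGGER loans_audit_trg<br />
     AFTER INSERT ON loans<br />
     REFERENCING NEW TABLE AS new_rows<br />
     FOR EACH STATEMENT EXECUTE PROCEDURE audit_loans();<br />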
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
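<br />
A minimal sketch (a table books_xml with an xml column doc is assumed):<br />
<br />
 SELECT x.*<br />
 FROM books_xml,<br />
      XMLTABLE('/catalog/book' PASSING doc<br />
               COLUMNS title  text PATH 'title',<br />
                       author text PATH 'author') AS x;<br />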
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
SCRAM (Salted Challenge Response Authentication Mechanism) is more secure than MD5 and has become the standard way to do password authentication.<br />
<br />
Client support is required in order to switch to SCRAM authentication in PostgreSQL.<br />
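<br />
A minimal sketch of switching a server over; note that existing passwords must be re-set after changing password_encryption so they are stored as SCRAM verifiers:<br />
<br />
 # postgresql.conf<br />
 password_encryption = scram-sha-256<br />
 <br />
 # pg_hba.conf<br />
 host    all    all    0.0.0.0/0    scram-sha-256<br />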
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
Four new roles make it possible to grant monitoring privileges without superuser (see the example after this list):<br />
<br />
* pg_read_all_settings<br />
* pg_read_all_stats<br />
* pg_stat_scan_tables<br />
* pg_monitor<br />
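<br />
For example, a monitoring agent's role can be granted pg_monitor instead of superuser (role name and password are illustrative):<br />
<br />
 CREATE ROLE metrics LOGIN PASSWORD 'secret';<br />
 GRANT pg_monitor TO metrics;<br />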
<br />
=== Restrictive Policies for Row Level Security ===<br />
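<br />
CREATE POLICY now accepts AS RESTRICTIVE: such policies are ANDed with the (ORed) permissive policies, so they can only narrow what is visible. A minimal sketch (table, column, and setting names are illustrative):<br />
<br />
 CREATE POLICY branch_only ON fines<br />
     AS RESTRICTIVE<br />
     USING (branch_id = current_setting('app.branch_id')::int);<br />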
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values across table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which guards it against such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
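<br />
A minimal sketch (table and column names are illustrative): after creating the statistics object, ANALYZE gathers the multivariate data the planner will use:<br />
<br />
 CREATE STATISTICS city_zip_stats (dependencies, ndistinct)<br />
     ON city, zip FROM addresses;<br />
 ANALYZE addresses;<br />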
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation to 184 wait events, including more than 67 new I/O-related events and more than 31 new latch-related events.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
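<br />
For example, a quick aggregation shows what active sessions are currently waiting on:<br />
<br />
 SELECT wait_event_type, wait_event, count(*)<br />
 FROM pg_stat_activity<br />
 WHERE wait_event IS NOT NULL<br />
 GROUP BY 1, 2<br />
 ORDER BY 3 DESC;<br />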
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== file_fdw can execute a program ===<br />
<br />
Example (from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation]):<br />
<br />
 CREATE FOREIGN TABLE<br />
 test(a int, b text)<br />
 SERVER csv<br />
 OPTIONS (program 'gunzip -c /tmp/data.csv.gz');<br />
<br />
<br />
=== ICU Collation Support ===<br />
<br />
A compile-time configuration option allows using the ICU library instead of relying on the OS-supplied internationalization library, which is prone to unexpected behavior across platforms and versions.<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
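<br />
A minimal sketch (collation name, locale, and table are illustrative), assuming the server was built with ICU support:<br />
<br />
 CREATE COLLATION german (provider = icu, locale = 'de-DE');<br />
 SELECT name FROM patrons ORDER BY name COLLATE german;<br />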
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
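<br />
A minimal sketch (the index name is illustrative); bt_index_check raises an error if it finds corruption and returns quietly otherwise:<br />
<br />
 CREATE EXTENSION amcheck;<br />
 SELECT bt_index_check('accounts_pkey'::regclass);<br />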
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
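<br />
For example, from SQL (output shown for a hypothetical 10.1 server):<br />
<br />
 SHOW server_version_num;<br />
  server_version_num<br />
 --------------------<br />
  100001<br />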
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as to update replication monitoring.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
New postgresql.conf defaults:<br />
* wal_level = replica<br />
* max_wal_senders = 10<br />
* max_replication_slots = 10<br />
<br />
New pg_hba.conf defaults:<br />
* Replication connections are allowed by default<br />
<br />
pg_basebackup:<br />
* WAL streaming (-X stream) now default<br />
* Uses temporary replication slots by default<br />
<br />
pg_basebackup enhancements:<br />
* WAL streaming supported in tar mode (-Ft)<br />
* Better excludes<br />
<br />
<br />
''Wording from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation].''<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point timestamps are a compile-time option that has long been problematic with replication. Only a small percentage of users are thought to use them, partly because few distributors enable the option. For that small number of users, a dump/restore will be required to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of the built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jer
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Bruce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Temporary replication slots ===<br />
<br />
Automatically dropped at the end of the session; prevents fall-behind with less risk.<br />
<br />
=== Connection Failover and Routing in libpq ===<br />
<br />
Postgres 10 is allowing applications to define multiple connection points and define some properties that are expected from the backend server. This simplifies the logic at application level: there is no need for it to know exactly which node is the primary and which ones are the standbys. The new parameter can also be controlled by environment variables.<br />
<br />
Links:<br />
* [http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Postgres 10 highlight - read-write and read-only mode of libpq]<br />
* [http://paquier.xyz/postgresql-2/postgres-10-multi-host-connstr/ Postgres 10 highlight - Multiple hosts in connection strings]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes including:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
<br />
=== Traceable Commit / Status by Transaction-ID ===<br />
<br />
PostgreSQL 10 now supports finding out the status of a recent transaction for recovery after network connection loss or crash without having to use heavyweight two-phase commit. It’s also useful for querying standbys.<br />
<br />
Links:<br />
* [https://blog.2ndquadrant.com/postgresql-10-transaction-traceability/ Transaction traceability in PostgreSQL 10 with txid_status(…)]<br />
* [https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
SCRAM is more secure than MD5 and has become the standard way to do authentication. It is a salted challenge response authentication method.<br />
<br />
Client support is required in order to switch to SCRAM authentication in PostgreSQL.<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
Now it is possible to avoid superuser in more instances.<br />
<br />
* pg_read_all_settings<br />
* pg_read_all_stats<br />
* pg_stat_scan_tables<br />
* pg_monitor<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== file_fdw can execute a program ===<br />
<br />
example: (from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation])<br />
CREATE FOREIGN TABLE<br />
test(a int, b text)<br />
SERVER csv<br />
OPTIONS (program 'gunzip -c /tmp/data.czv.gz');<br />
<br />
<br />
=== ICU Collation Support ===<br />
<br />
Compile-time configuration option to use an ICU library instead of relying on OS-supplied internationalization library (which was prone to unexpected behavior)<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
New postgresql.conf defaults:<br />
* wal_level = replica<br />
* max_wal_senders = 10<br />
* max_replication_slots = 10<br />
<br />
New pg_hba.conf defaults:<br />
* Replication connections by default<br />
<br />
pg_basebackup:<br />
* WAL streaming (-X stream) now default<br />
* Uses temporary replication slots by default<br />
<br />
pg_basebackup enhancements:<br />
* WAL streaming supported in tar mode (-Ft)<br />
* Better excludes<br />
<br />
<br />
''Wording from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation].''<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point Timestamps are a compile-time option that have been problematic with replication for some time. It is thought that a small percentage of users are using them, partly due to the fact that few distributors enable the option. For the small number of users who are using this option a dump/restore will be required to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30973New in postgres 102017-09-29T20:16:49Z<p>Jer: /* Traceable Commit / Status by Transaction-ID */ format links</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Bruce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Temporary replication slots ===<br />
<br />
Automatically dropped at the end of the session; prevents fall-behind with less risk.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes including:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
<br />
=== Traceable Commit / Status by Transaction-ID ===<br />
<br />
PostgreSQL 10 now supports finding out the status of a recent transaction for recovery after network connection loss or crash without having to use heavyweight two-phase commit. It’s also useful for querying standbys.<br />
<br />
Links:<br />
* [https://blog.2ndquadrant.com/postgresql-10-transaction-traceability/ Transaction traceability in PostgreSQL 10 with txid_status(…)]<br />
* [https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
SCRAM is a salted challenge-response authentication method. It is significantly more secure than the older MD5-based approach and has become the standard way to do password authentication.<br />
<br />
Client support is required in order to switch to SCRAM authentication in PostgreSQL.<br />
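<br />
A sketch of the settings involved (the pg_hba.conf line is illustrative; adjust database, user, and address range to your environment):<br />
<br />
 # postgresql.conf: hash new passwords with SCRAM<br />
 password_encryption = scram-sha-256<br />
 <br />
 # pg_hba.conf: require SCRAM for these connections<br />
 host    all    all    10.0.0.0/8    scram-sha-256<br />
<br />
Existing MD5-hashed passwords must be re-set after changing password_encryption before SCRAM authentication will succeed for those roles.<br />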
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
These new predefined roles make it possible to grant monitoring tools the access they need without making them superusers:<br />
<br />
* pg_read_all_settings<br />
* pg_read_all_stats<br />
* pg_stat_scan_tables<br />
* pg_monitor<br />
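<br />
pg_monitor is a convenience role that is a member of the other three. For example, a monitoring agent's role (the role name is illustrative) can simply be granted membership:<br />
<br />
 GRANT pg_monitor TO monitoring_agent;<br />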
<br />
=== Restrictive Policies for Row Level Security ===<br />
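<br />
By default, multiple row-level security policies on a table are permissive, i.e. combined with OR. PostgreSQL 10 adds CREATE POLICY ... AS RESTRICTIVE, whose conditions are ANDed with the rest, so such policies can only further limit what the permissive policies allow. A sketch (table and setting names are illustrative):<br />
<br />
 CREATE POLICY branch_only ON loans<br />
     AS RESTRICTIVE<br />
     USING (branch_id = current_setting('app.branch_id')::int);<br />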
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values across table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which guards it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature represents an advance in the state of the art for all SQL databases.<br />
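<br />
A sketch of creating an extended-statistics object on two correlated columns (table and column names are illustrative):<br />
<br />
 CREATE STATISTICS city_zip_stats (dependencies, ndistinct)<br />
     ON city, zip FROM addresses;<br />
 ANALYZE addresses;<br />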
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation to 184 wait events; in particular, more than 67 I/O-related events and more than 31 latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window into which parts of the system are causing query delays, and provide very accurate statistics on where performance is being lost.<br />
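<br />
A quick sketch of summarizing what the cluster is currently waiting on:<br />
<br />
 SELECT wait_event_type, wait_event, count(*)<br />
 FROM pg_stat_activity<br />
 WHERE wait_event IS NOT NULL<br />
 GROUP BY 1, 2<br />
 ORDER BY count(*) DESC;<br />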
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, PostgreSQL 10 detects cases where the inner side of a join can produce at most one row for each outer-side row. During execution this allows skipping to the next outer row as soon as a match is found, and it can remove the need for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== file_fdw can execute a program ===<br />
<br />
Example (from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation]):<br />
 CREATE FOREIGN TABLE<br />
   test(a int, b text)<br />
   SERVER csv<br />
   OPTIONS (program 'gunzip -c /tmp/data.csv.gz');<br />
<br />
This assumes a foreign server named "csv" has been created beforehand with CREATE SERVER csv FOREIGN DATA WRAPPER file_fdw.<br />
<br />
<br />
=== ICU Collation Support ===<br />
<br />
A compile-time configuration option now allows building PostgreSQL against the ICU library instead of relying solely on the OS-supplied internationalization library, which was prone to unexpected behavior changes.<br />
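<br />
When built with --with-icu, initdb creates ICU-based collations with names such as "de-x-icu". A sketch of using one (table and column names are illustrative):<br />
<br />
 SELECT title FROM books ORDER BY title COLLATE "de-x-icu";<br />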
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
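<br />
A sketch of checking a B-Tree index (the index name is illustrative; bt_index_check takes only an ACCESS SHARE lock):<br />
<br />
 CREATE EXTENSION amcheck;<br />
 <br />
 SELECT bt_index_check('books_pkey');<br />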
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend) or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] to get the server version. Both return a six-digit integer version number which is consistently sortable and comparable between versions 9.6 and 10.<br />
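<br />
For example (the output shown corresponds to version 10.1, as in the table below):<br />
<br />
 libdata=# SHOW server_version_num;<br />
  server_version_num<br />
 --------------------<br />
  100001<br />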
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to update custom backup and transaction-log-management scripts, as well as replication monitoring.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have been renamed, along with some of their parameters:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
New postgresql.conf defaults:<br />
* wal_level = replica<br />
* max_wal_senders = 10<br />
* max_replication_slots = 10<br />
<br />
New pg_hba.conf defaults:<br />
* Replication connections are allowed by default<br />
<br />
pg_basebackup:<br />
* WAL streaming (-X stream) now default<br />
* Uses temporary replication slots by default<br />
<br />
pg_basebackup enhancements:<br />
* WAL streaming supported in tar mode (-Ft)<br />
* Better excludes<br />
<br />
<br />
''Wording from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation].''<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point timestamps are a compile-time option that has long been problematic for replication. It is thought that only a small percentage of users use them, partly because few distributors enable the option. For that small number of users, a dump/restore will be required to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module predecessor of the built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need either to modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or to compile and install tsearch2 themselves from source.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jer
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Bruce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Temporary replication slots ===<br />
<br />
Automatically dropped at the end of the session; prevents fall-behind with less risk.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes including:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
<br />
=== Traceable Commit / Status by Transaction-ID ===<br />
<br />
PostgreSQL 10 now supports finding out the status of a recent transaction for recovery after network connection loss or crash without having to use heavyweight two-phase commit. It’s also useful for querying standbys.<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-transaction-traceability/ Transaction traceability in PostgreSQL 10 with txid_status(…)]<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
SCRAM is more secure than MD5 and has become the standard way to do authentication. It is a salted challenge response authentication method.<br />
<br />
Client support is required in order to switch to SCRAM authentication in PostgreSQL.<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
Now it is possible to avoid superuser in more instances.<br />
<br />
* pg_read_all_settings<br />
* pg_read_all_stats<br />
* pg_stat_scan_tables<br />
* pg_monitor<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== file_fdw can execute a program ===<br />
<br />
example: (from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation])<br />
CREATE FOREIGN TABLE<br />
test(a int, b text)<br />
SERVER csv<br />
OPTIONS (program 'gunzip -c /tmp/data.czv.gz');<br />
<br />
<br />
=== ICU Collation Support ===<br />
<br />
Compile-time configuration option to use an ICU library instead of relying on OS-supplied internationalization library (which was prone to unexpected behavior)<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
New postgresql.conf defaults:<br />
* wal_level = replica<br />
* max_wal_senders = 10<br />
* max_replication_slots = 10<br />
<br />
New pg_hba.conf defaults:<br />
* Replication connections by default<br />
<br />
pg_basebackup:<br />
* WAL streaming (-X stream) now default<br />
* Uses temporary replication slots by default<br />
<br />
pg_basebackup enhancements:<br />
* WAL streaming supported in tar mode (-Ft)<br />
* Better excludes<br />
<br />
<br />
''Wording from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation].''<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point Timestamps are a compile-time option that have been problematic with replication for some time. It is thought that a small percentage of users are using them, partly due to the fact that few distributors enable the option. For the small number of users who are using this option a dump/restore will be required to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30946New in postgres 102017-09-25T23:35:08Z<p>Jer: /* Change Defaults around Replication and pg_basebackup */ pg_basebackup updates</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Temporary replication slots ===<br />
<br />
Automatically dropped at the end of the session; prevents fall-behind with less risk.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes including:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
SCRAM is more secure than MD5 and has become the standard way to do authentication. It is a salted challenge response authentication method.<br />
<br />
Client support is required in order to switch to SCRAM authentication in PostgreSQL.<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
Now it is possible to avoid superuser in more instances.<br />
<br />
* pg_read_all_settings<br />
* pg_read_all_stats<br />
* pg_stat_scan_tables<br />
* pg_monitor<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== file_fdw can execute a program ===<br />
<br />
example: (from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation])<br />
CREATE FOREIGN TABLE<br />
test(a int, b text)<br />
SERVER csv<br />
OPTIONS (program 'gunzip -c /tmp/data.czv.gz');<br />
<br />
<br />
=== ICU Collation Support ===<br />
<br />
Compile-time configuration option to use an ICU library instead of relying on OS-supplied internationalization library (which was prone to unexpected behavior)<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
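<br />
For instance, a replication-lag check written against the 9.6 names translates directly; a sketch using the new names:<br />
<br />
 libdata=# SELECT pg_current_wal_lsn();  -- was pg_current_xlog_location()<br />
 libdata=# SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)<br />
           FROM pg_stat_replication;     -- was pg_xlog_location_diff(..., replay_location)<br />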
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables and their parameters have been renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, is no longer supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
New postgresql.conf defaults:<br />
* wal_level = replica<br />
* max_wal_senders = 10<br />
* max_replication_slots = 10<br />
<br />
New pg_hba.conf defaults:<br />
* Replication connections are now allowed by default in the sample pg_hba.conf<br />
<br />
pg_basebackup:<br />
* WAL streaming (-X stream) now default<br />
* Uses temporary replication slots by default<br />
<br />
pg_basebackup enhancements:<br />
* WAL streaming supported in tar mode (-Ft); see the sketch after this list<br />
* Better excludes<br />
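<br />
A sketch of a base backup exercising the new defaults (compressed tar output with streamed WAL; the target directory is illustrative):<br />
<br />
 pg_basebackup -D /backups/base -Ft -z -X stream<br />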
<br />
<br />
''Wording from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation].''<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point timestamps are a compile-time option that has been problematic for replication for some time. Only a small percentage of users are thought to use them, partly because few distributors enable the option. Users who do rely on this option will need a dump/restore to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully. A way to check which mode a server uses is shown below.<br />
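<br />
A quick check of how an existing server was built (''on'' means the default integer timestamps):<br />
<br />
 libdata=# SHOW integer_datetimes;<br />
  integer_datetimes<br />
 -------------------<br />
  on<br />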
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of the built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need either to manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or to compile and install tsearch2 themselves.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30945New in postgres 102017-09-25T23:27:34Z<p>Jer: /* Replication and Scaling */ temp replication slots</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Bruce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In version 10, partitioning is a declared attribute of the table:<br />
<br />
 CREATE TABLE table_name ( ... )<br />
 PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) } [, ...] )<br />
<br />
 CREATE TABLE table_name<br />
 PARTITION OF parent_table [ ( ... ) ]<br />
 FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
 CONSTRAINT ck_2017 CHECK (fch_creado >= DATE '2017-01-01' AND fch_creado < DATE '2018-01-01')<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we want to search financial transaction history by an indexed column, we can now execute the query in roughly one-quarter of the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is fast and secure, and it is a perfect mechanism for high-availability/disaster-recovery needs. However, because it works on the whole instance, it is not possible to replicate only part of the primary server, nor to write on the secondary. Logical replication allows us to tackle those use cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that look identical to the tables I'm replicating and have the same names. They can have additional columns and a few other differences. In particular, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that the origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
Blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
Version 9.6 introduced synchronous replication with multiple synchronous standbys (along with the ''remote_apply'' commit level):<br />
<br />
 synchronous_commit = 'remote_apply'<br />
<br />
Version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
 synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
 synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST matches the previous behaviour: standby priority follows the list order, and the first N listed nodes must confirm. ANY means that confirmation from any N nodes in the list satisfies the quorum. This gives extra flexibility to complex replication setups.<br />
<br />
=== Temporary replication slots ===<br />
<br />
Temporary replication slots are automatically dropped at the end of the session that created them. This gives the WAL-retention benefit of a slot (the consumer cannot fall behind and lose needed WAL) with less risk, since an abandoned slot can no longer retain WAL indefinitely.<br />
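<br />
A sketch of creating one directly (the third argument marks the slot temporary; the slot name is illustrative):<br />
<br />
 libdata=# SELECT slot_name, lsn<br />
           FROM pg_create_physical_replication_slot('tmp_slot', true, true);<br />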
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
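<br />
libpq can now try several hosts in turn and, with target_session_attrs, insist on a writable session. A sketch of such a connection string (host names are illustrative):<br />
<br />
 psql 'postgresql://node1,node2/libdata?target_session_attrs=read-write'<br />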
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
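<br />
The feature is exposed through the txid_status() function; a sketch of checking whether a transaction made it to disk after a lost connection (the transaction id shown is illustrative):<br />
<br />
 libdata=# SELECT txid_current();    -- captured while the transaction was open<br />
 libdata=# SELECT txid_status(612);  -- later: 'committed', 'aborted' or 'in progress'<br />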
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of two-phase commits.<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
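pg_receivewal (the renamed pg_receivexlog) can now compress the WAL segments it writes. A brief sketch (the archive directory is illustrative):<br />
<br />
 pg_receivewal -D /archive/wal --compress=5<br />
<br />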
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes including:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
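<br />
A sketch of listing them (the ''backend_type'' column is also new in 10):<br />
<br />
 libdata=# SELECT pid, backend_type, wait_event_type, wait_event<br />
           FROM pg_stat_activity;<br />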
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
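<br />
A minimal sketch of the new standard syntax (the table is illustrative):<br />
<br />
 libdata=# CREATE TABLE patrons (<br />
     id   int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,<br />
     name text NOT NULL<br />
 );<br />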
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
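<br />
With crash safety and replication in place, hash indexes are finally usable in production. A minimal sketch (table and column are illustrative):<br />
<br />
 libdata=# CREATE INDEX books_isbn_hash ON books USING hash (isbn);<br />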
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
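<br />
A sketch of the new REFERENCING clause (table and function names are illustrative):<br />
<br />
 libdata=# CREATE TRIGGER loans_audit<br />
     AFTER UPDATE ON loans<br />
     REFERENCING OLD TABLE AS old_rows NEW TABLE AS new_rows<br />
     FOR EACH STATEMENT EXECUTE PROCEDURE audit_loans();<br />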
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
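<br />
A small sketch (the ''bookxml'' table and its document shape are illustrative):<br />
<br />
 libdata=# SELECT x.title, x.pages<br />
     FROM bookxml,<br />
          XMLTABLE('/books/book' PASSING doc<br />
                   COLUMNS title text PATH 'title',<br />
                           pages int PATH 'pages') AS x;<br />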
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
SCRAM (SCRAM-SHA-256 in PostgreSQL 10) is a salted challenge-response authentication method that is significantly more secure than the older MD5-based approach, and it has become the standard way to do password authentication.<br />
<br />
Client (driver) support is required in order to switch to SCRAM authentication in PostgreSQL; a configuration sketch follows.<br />
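<br />
A sketch of enabling it server-side (newly set passwords are then stored as SCRAM verifiers, so passwords must be re-set after changing the setting):<br />
<br />
 # postgresql.conf<br />
 password_encryption = scram-sha-256<br />
<br />
 # pg_hba.conf<br />
 host    all    all    0.0.0.0/0    scram-sha-256<br />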
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
These roles make it possible to grant monitoring and diagnostic access without handing out superuser:<br />
<br />
* pg_read_all_settings<br />
* pg_read_all_stats<br />
* pg_stat_scan_tables<br />
* pg_monitor<br />
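<br />
pg_monitor bundles the other three; a sketch of granting it to a monitoring agent's role (the role name is illustrative):<br />
<br />
 libdata=# GRANT pg_monitor TO monitoring_agent;<br />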
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
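CREATE POLICY now accepts AS RESTRICTIVE, letting a policy further restrict (AND with) what permissive policies allow. A sketch (the table and setting are illustrative):<br />
<br />
 libdata=# CREATE POLICY branch_only ON fines AS RESTRICTIVE<br />
     USING (branch_id = current_setting('app.branch_id')::int);<br />
<br />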
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values across table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to teach the planner about such correlations, protecting it against these mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature represents an advance in the state of the art among SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
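<br />
A sketch of declaring a functional dependency between two correlated columns (names are illustrative):<br />
<br />
 libdata=# CREATE STATISTICS patron_geo_stats (dependencies)<br />
     ON city, postal_code FROM patrons;<br />
 libdata=# ANALYZE patrons;<br />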
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation to 184 wait events; in particular, more than 67 I/O-related events and more than 31 latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window into which parts of the system are causing query delays, with very accurate statistics on where performance is being lost.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== file_fdw can execute a program ===<br />
<br />
Example (from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation]):<br />
 CREATE FOREIGN TABLE<br />
 test(a int, b text)<br />
 SERVER csv<br />
 OPTIONS (program 'gunzip -c /tmp/data.csv.gz', format 'csv');<br />
<br />
<br />
=== ICU Collation Support ===<br />
<br />
A compile-time configuration option allows use of the ICU library instead of relying on the OS-supplied internationalization library, which was prone to unexpected behavior changes across OS upgrades.<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
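A sketch of creating an ICU collation (requires a server built with --with-icu; the locale string is illustrative):<br />
<br />
 libdata=# CREATE COLLATION german (provider = icu, locale = 'de-DE');<br />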
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
New postgresql.conf defaults:<br />
* wal_level = replica<br />
* max_wal_senders = 10<br />
* max_replication_slots = 10<br />
<br />
New pg_hba.conf defaults:<br />
* Replication connections by default<br />
<br />
<br />
<br />
''Wording from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation].''<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point Timestamps are a compile-time option that have been problematic with replication for some time. It is thought that a small percentage of users are using them, partly due to the fact that few distributors enable the option. For the small number of users who are using this option a dump/restore will be required to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30944New in postgres 102017-09-25T23:25:58Z<p>Jer: /* Change Defaults around Replication and pg_basebackup */ description</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes including:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
SCRAM is more secure than MD5 and has become the standard way to do authentication. It is a salted challenge response authentication method.<br />
<br />
Client support is required in order to switch to SCRAM authentication in PostgreSQL.<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
Now it is possible to avoid superuser in more instances.<br />
<br />
* pg_read_all_settings<br />
* pg_read_all_stats<br />
* pg_stat_scan_tables<br />
* pg_monitor<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== file_fdw can execute a program ===<br />
<br />
example: (from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation])<br />
CREATE FOREIGN TABLE<br />
test(a int, b text)<br />
SERVER csv<br />
OPTIONS (program 'gunzip -c /tmp/data.czv.gz');<br />
<br />
<br />
=== ICU Collation Support ===<br />
<br />
Compile-time configuration option to use an ICU library instead of relying on OS-supplied internationalization library (which was prone to unexpected behavior)<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
New postgresql.conf defaults:<br />
* wal_level = replica<br />
* max_wal_senders = 10<br />
* max_replication_slots = 10<br />
<br />
New pg_hba.conf defaults:<br />
* Replication connections by default<br />
<br />
<br />
<br />
''Wording from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation].''<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point Timestamps are a compile-time option that have been problematic with replication for some time. It is thought that a small percentage of users are using them, partly due to the fact that few distributors enable the option. For the small number of users who are using this option a dump/restore will be required to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30943New in postgres 102017-09-25T23:21:14Z<p>Jer: /* Other Features */ add file_fdw</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages: queries run from procedural languages such as PL/pgSQL can now use parallel plans.<br />
<br />
'''Example:'''<br />
<br />
For example, if we want to search financial transaction history by an indexed column, we can now execute the query in roughly one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
postgres_fdw now pushes joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be transferred from the remote server and offloads aggregate computation from the requesting server.<br />
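<br />
For instance (a sketch; the foreign table remote_orders is hypothetical and assumes a working postgres_fdw setup), EXPLAIN VERBOSE shows whether push-down happened: when it did, the plan is a single Foreign Scan whose Remote SQL line contains the GROUP BY instead of a local aggregate above the scan:<br />
<br />
 libdata=# EXPLAIN (VERBOSE, COSTS OFF)<br />
           SELECT customer_id, count(*)<br />
           FROM remote_orders<br />
           GROUP BY customer_id;<br />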
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure mechanism that is well suited to high availability and disaster recovery needs. Because it works on the whole instance, however, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication addresses those use cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that look identical to the tables I'm replicating and have the same names. They can have additional columns and a few other differences. In particular, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that the origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can create a subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
Blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
Version 9.6 introduced synchronous replication with multiple synchronous standbys, along with the stricter setting<br />
<br />
 synchronous_commit = 'remote_apply'<br />
<br />
Version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
 synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
 synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST matches the previous behaviour: standby priority follows the list order, and the first two listed standbys must confirm. ANY means that confirmation from any two standbys in the list satisfies the quorum. This gives extra flexibility in complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
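<br />
A minimal sketch (host names hypothetical): libpq now accepts multiple hosts in one connection string, and target_session_attrs=read-write makes it skip servers that only accept read-only sessions, such as hot standbys:<br />
<br />
 psql 'host=pg1.example.com,pg2.example.com dbname=libdata target_session_attrs=read-write'<br />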
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
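<br />
The feature is exposed as the txid_status() function: a client that loses its connection mid-commit can record the transaction id beforehand and later ask whether that transaction actually committed. A sketch (the id 616 is a placeholder):<br />
<br />
 libdata=# BEGIN;<br />
 libdata=# SELECT txid_current();   -- save this value on the client<br />
 libdata=# COMMIT;                  -- connection may be lost here<br />
 <br />
 libdata=# SELECT txid_status(616); -- 'committed', 'aborted' or 'in progress'<br />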
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of two-phase commits.<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server, which can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
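pg_receivewal (the renamed pg_receivexlog, see below) can now compress the WAL segments it streams via the --compress option, which takes a zlib compression level. A sketch with hypothetical connection settings:<br />
<br />
 $ pg_receivewal -h db1.example.com -U replicator -D /archive/wal --compress=9<br />
<br />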
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes, such as:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
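<br />
For example, the new backend_type column distinguishes these processes from client backends:<br />
<br />
 libdata=# SELECT pid, backend_type, wait_event_type, wait_event<br />
           FROM pg_stat_activity;<br />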
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
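<br />
A minimal sketch of the SQL-standard syntax (table and column names hypothetical); identity columns behave much like SERIAL columns but avoid their permission and dependency quirks:<br />
<br />
 libdata=# CREATE TABLE patrons (<br />
               id   bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,<br />
               name text NOT NULL<br />
           );<br />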
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
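<br />
Since hash indexes are now crash safe and replicated, they are finally reasonable to use in production. Creating one is unchanged (names hypothetical):<br />
<br />
 libdata=# CREATE INDEX patrons_email_hash ON patrons USING hash (email);<br />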
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
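<br />
A minimal sketch (table names hypothetical): the REFERENCING clause exposes the affected rows as set-valued transition tables that the trigger function can query once per statement:<br />
<br />
 CREATE FUNCTION audit_loan_changes() RETURNS trigger<br />
 LANGUAGE plpgsql AS $$<br />
 BEGIN<br />
   INSERT INTO loan_audit (changed_at, rows_changed)<br />
   SELECT now(), count(*) FROM new_rows;<br />
   RETURN NULL;<br />
 END;<br />
 $$;<br />
 <br />
 CREATE TRIGGER loans_audit<br />
   AFTER UPDATE ON loans<br />
   REFERENCING OLD TABLE AS old_rows NEW TABLE AS new_rows<br />
   FOR EACH STATEMENT<br />
   EXECUTE PROCEDURE audit_loan_changes();<br />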
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
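<br />
A minimal sketch, assuming a hypothetical table catalog_xml with an xml column doc; each /catalog/book element becomes one output row:<br />
<br />
 libdata=# SELECT x.*<br />
           FROM catalog_xml,<br />
                XMLTABLE('/catalog/book' PASSING doc<br />
                         COLUMNS title  text PATH 'title',<br />
                                 author text PATH 'author') AS x;<br />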
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
  ?column?<br />
 ------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
SCRAM is a salted challenge-response authentication method that is more secure than PostgreSQL's older MD5-based approach and has become the standard way to do password authentication. PostgreSQL 10 implements SCRAM-SHA-256.<br />
<br />
Client support is required in order to switch to SCRAM authentication in PostgreSQL.<br />
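<br />
A sketch of the switch (addresses here are placeholders): have new password hashes stored with SCRAM, then require the method in pg_hba.conf. Existing MD5-hashed passwords must be reset before the stricter rule will work:<br />
<br />
 # postgresql.conf<br />
 password_encryption = scram-sha-256<br />
 <br />
 # pg_hba.conf<br />
 host  all  all  10.0.0.0/8  scram-sha-256<br />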
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
These roles make it possible to give monitoring tools the access they need without granting superuser:<br />
<br />
* pg_read_all_settings<br />
* pg_read_all_stats<br />
* pg_stat_scan_tables<br />
* pg_monitor<br />
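<br />
For example, a monitoring agent's role (name hypothetical) no longer needs superuser:<br />
<br />
 libdata=# GRANT pg_monitor TO metrics_agent;<br />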
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
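CREATE POLICY now accepts AS RESTRICTIVE. Where permissive policies are combined with OR, restrictive policies are combined with AND, so they can only further narrow the set of visible rows. A sketch (names hypothetical):<br />
<br />
 CREATE POLICY local_region_only ON accounts<br />
   AS RESTRICTIVE<br />
   USING (region = current_setting('app.region'));<br />
<br />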
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values across table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to teach the planner about this, which guards it against such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature represents an advance in the state of the art for all SQL databases.<br />
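<br />
A minimal sketch (names hypothetical): telling the planner that city and zip code are functionally dependent, so it stops multiplying their selectivities together:<br />
<br />
 libdata=# CREATE STATISTICS addr_stats (dependencies, ndistinct)<br />
           ON city, zip FROM addresses;<br />
 libdata=# ANALYZE addresses;<br />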
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation to 184 wait events; in particular, more than 67 I/O-related and more than 31 latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window into which parts of the system are causing query delays, and very accurate statistics on where we are losing performance.<br />
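<br />
For example, a quick snapshot of what active sessions are currently waiting on:<br />
<br />
 libdata=# SELECT wait_event_type, wait_event, count(*)<br />
           FROM pg_stat_activity<br />
           WHERE wait_event IS NOT NULL<br />
           GROUP BY 1, 2<br />
           ORDER BY count(*) DESC;<br />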
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== file_fdw can execute a program ===<br />
<br />
Example (from Magnus Hagander's [https://www.hagander.net/talks/PostgreSQL_10.pdf new features presentation]); this assumes a file_fdw foreign server named csv already exists:<br />
 CREATE FOREIGN TABLE test (a int, b text)<br />
   SERVER csv<br />
   OPTIONS (program 'gunzip -c /tmp/data.csv.gz');<br />
<br />
=== ICU Collation Support ===<br />
<br />
A compile-time configuration option now lets PostgreSQL use the ICU library instead of relying solely on the OS-supplied internationalization library, which was prone to unexpected behavior changes.<br />
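<br />
When built with ICU, PostgreSQL pre-creates keyword collations such as "de-x-icu", usable per column or per expression (table and column names hypothetical):<br />
<br />
 libdata=# SELECT name FROM patrons ORDER BY name COLLATE "de-x-icu";<br />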
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
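<br />
A minimal sketch: install the extension and verify a B-Tree index (here a system catalog index); the function raises an error if an invariant is violated and returns quietly otherwise:<br />
<br />
 libdata=# CREATE EXTENSION amcheck;<br />
 libdata=# SELECT bt_index_check('pg_class_oid_index'::regclass);<br />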
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend) or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] to get the server version. Both return a six-digit integer version number which is consistently sortable and comparable between versions 9.6 and 10.<br />
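<br />
For example, on a 10.1 server:<br />
<br />
 libdata=# SHOW server_version_num;<br />
  server_version_num<br />
 --------------------<br />
  100001<br />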
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to update custom backup and transaction log management scripts, as well as replication monitoring tools.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
Several replication-related defaults have changed so that replication works out of the box: wal_level now defaults to replica, and max_wal_senders and max_replication_slots both default to 10. pg_basebackup now streams the needed WAL by default.<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point timestamps are a compile-time option that has been problematic with replication for some time. Only a small percentage of users are thought to be using them, partly because few distributors enable the option. For those users a dump/restore will be required to upgrade to PostgreSQL 10; with large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or compile and install tsearch2 from source.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by version 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jer
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes including:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
SCRAM is more secure than MD5 and has become the standard way to do authentication. It is a salted challenge response authentication method.<br />
<br />
Client support is required in order to switch to SCRAM authentication in PostgreSQL.<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
Now it is possible to avoid superuser in more instances.<br />
<br />
* pg_read_all_settings<br />
* pg_read_all_stats<br />
* pg_stat_scan_tables<br />
* pg_monitor<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
Compile-time configuration option to use an ICU library instead of relying on OS-supplied internationalization library (which was prone to unexpected behavior)<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point Timestamps are a compile-time option that have been problematic with replication for some time. It is thought that a small percentage of users are using them, partly due to the fact that few distributors enable the option. For the small number of users who are using this option a dump/restore will be required to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30941New in postgres 102017-09-25T23:10:32Z<p>Jer: /* SCRAM Authentication */ description</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes including:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
SCRAM is more secure than MD5 and has become the standard way to do authentication. It is a salted challenge response authentication method.<br />
<br />
Client support is required in order to switch to SCRAM authentication in PostgreSQL.<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 instrumented the code with a total of 69 wait events. PostgreSQL 10 expands that instrumentation to 184 wait events; in particular, more than 67 I/O-related and more than 31 latch-related events were added.<br />
<br />
Combined with the wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6, this gives us a significant new window into which parts of the system are causing query delays, and much more accurate statistics on where performance is being lost.<br />
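<br />
A quick way to see what active sessions are currently waiting on (a sketch; seeing all sessions requires a superuser or, in 10, a member of pg_monitor):<br />
<br />
 SELECT wait_event_type, wait_event, count(*)<br />
   FROM pg_stat_activity<br />
  WHERE wait_event IS NOT NULL<br />
  GROUP BY 1, 2<br />
  ORDER BY count(*) DESC;<br />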
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
PostgreSQL can now be built (via a compile-time configuration option) against the ICU library instead of relying solely on the OS-supplied internationalization library, whose collation behavior varies between platforms and has been prone to unexpected changes.<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
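<br />
For example, a collation can now be created from an ICU locale and used in queries (the patrons table is hypothetical; the 'de-u-co-phonebk' locale ships with ICU):<br />
<br />
 CREATE COLLATION german_phonebook (provider = icu, locale = 'de-u-co-phonebk');<br />
 <br />
 SELECT surname FROM patrons ORDER BY surname COLLATE german_phonebook;<br />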
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
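<br />
A minimal sketch of its use (the index name is hypothetical):<br />
<br />
 CREATE EXTENSION amcheck;<br />
 <br />
 -- raises an error if the B-Tree's invariants are violated<br />
 SELECT bt_index_check('accounts_pkey'::regclass);<br />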
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of version 10, PostgreSQL no longer uses three-part version numbers; it is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools that detect the PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend) or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] to get the server version. Both return a six-digit integer version number which sorts and compares consistently between versions 9.6 and 10.<br />
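<br />
For example, on a 10.1 server both of the following return 100001:<br />
<br />
 SHOW server_version_num;<br />
 SELECT current_setting('server_version_num')::int;<br />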
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to update custom backup and transaction log management scripts, as well as replication monitoring tools.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
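<br />
As a sketch of the scripting impact, here is a typical replication lag query in 9.6 and its 10 equivalent:<br />
<br />
 -- 9.6<br />
 SELECT pg_xlog_location_diff(pg_current_xlog_location(), replay_location)<br />
   FROM pg_stat_replication;<br />
 <br />
 -- 10<br />
 SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)<br />
   FROM pg_stat_replication;<br />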
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, is no longer supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
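<br />
Several replication-related defaults changed so that a stock installation can serve replicas and pg_basebackup out of the box. A sketch of the most visible changes (9.6 defaults shown in comments):<br />
<br />
 # postgresql.conf defaults in 10<br />
 wal_level = replica            # 9.6 default: minimal<br />
 max_wal_senders = 10           # 9.6 default: 0<br />
 max_replication_slots = 10     # 9.6 default: 0<br />
<br />
pg_basebackup also now streams the WAL it needs by default (the equivalent of -X stream in 9.6).<br />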
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point timestamps are a compile-time option that has been problematic for replication for some time. Only a small percentage of users are thought to use them, partly because few distributors enable the option. Those users will require a dump/restore to upgrade to PostgreSQL 10; with large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of the built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need either to manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or to compile tsearch2 from source and install it themselves.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jer
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
=== Background processes in pg_stat_activity ===<br />
<br />
pg_stat_activity now includes information (including wait events) about background processes including:<br />
* auxiliary processes<br />
* worker processes<br />
* WAL senders<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
Compile-time configuration option to use an ICU library instead of relying on OS-supplied internationalization library (which was prone to unexpected behavior)<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point Timestamps are a compile-time option that have been problematic with replication for some time. It is thought that a small percentage of users are using them, partly due to the fact that few distributors enable the option. For the small number of users who are using this option a dump/restore will be required to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30939New in postgres 102017-09-25T22:57:10Z<p>Jer: /* Additional Parallelism in Query Execution */ mention procedural languages</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
* Procedural Languages<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
Compile-time configuration option to use an ICU library instead of relying on OS-supplied internationalization library (which was prone to unexpected behavior)<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point Timestamps are a compile-time option that have been problematic with replication for some time. It is thought that a small percentage of users are using them, partly due to the fact that few distributors enable the option. For the small number of users who are using this option a dump/restore will be required to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30938New in postgres 102017-09-25T22:56:13Z<p>Jer: /* ICU Collation Support */ description</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if we want to search financial transaction history by an indexed column, we can now execute the query in roughly one-quarter of the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, joins and aggregate functions are now pushed down to the remote server in more cases. This reduces the amount of data that must be transferred from the remote server, and offloads aggregate computation from the requesting server.<br />
<br />
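For illustration, a minimal sketch of a postgres_fdw setup (the server, user mapping, and table details here are hypothetical); in PostgreSQL 10, an aggregate over the foreign table can be computed remotely, which EXPLAIN (VERBOSE) shows in the Remote SQL line:<br />
<br />
 CREATE EXTENSION postgres_fdw;<br />
 <br />
 CREATE SERVER sales_srv FOREIGN DATA WRAPPER postgres_fdw<br />
     OPTIONS (host 'sales-db', dbname 'sales');<br />
 CREATE USER MAPPING FOR CURRENT_USER SERVER sales_srv<br />
     OPTIONS (user 'report', password 'secret');<br />
 CREATE FOREIGN TABLE orders_remote (id int, amount numeric)<br />
     SERVER sales_srv OPTIONS (table_name 'orders');<br />
 <br />
 -- With PostgreSQL 10 the sum() can be computed on the remote side:<br />
 EXPLAIN (VERBOSE) SELECT sum(amount) FROM orders_remote;<br />
<br />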
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure, and reliable mechanism for high availability and disaster recovery. Because it operates on the whole instance, however, it is not possible to replicate only part of the primary server, nor to write on the secondary. Logical replication allows us to tackle those use cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that look identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. In particular, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that the origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
Blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced multiple synchronous standbys and the remote_apply commit mode,<br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
 synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
 synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST matches the previous behaviour: standby priority follows the list order, and the first N standbys in the list must confirm. ANY means that any N nodes in the list can provide the required quorum. This gives extra flexibility for complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
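For example (host names hypothetical), a PostgreSQL 10 libpq connection string may list several hosts and ask for one that accepts writes:<br />
<br />
 psql 'postgresql://node1:5432,node2:5432/libdata?target_session_attrs=read-write'<br />
<br />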
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of two-phase commits.<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
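pg_receivewal (the renamed pg_receivexlog; see the renames below) can now compress the WAL segments it writes. A sketch, assuming a hypothetical archive directory:<br />
<br />
 pg_receivewal -D /archive/wal --compress=5<br />
<br />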
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
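A minimal sketch of the SQL-standard syntax (table name hypothetical):<br />
<br />
 CREATE TABLE invoices (<br />
     id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,<br />
     amount numeric NOT NULL<br />
 );<br />
 <br />
 INSERT INTO invoices (amount) VALUES (42.00);  -- id is generated automatically<br />
<br />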
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
(wording from Bruce Momjian's [http://momjian.us/main/writings/pgsql/features.pdf general pg10 presentation])<br />
<br />
* Crash safe<br />
* Replicated<br />
* Reduced locking during bucket splits<br />
* Faster lookups<br />
* More even index growth<br />
* Single-page pruning<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
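A minimal sketch (the audit table and trigger names are hypothetical): the statement-level trigger sees all inserted rows at once through the transition table:<br />
<br />
 CREATE FUNCTION log_inserts() RETURNS trigger LANGUAGE plpgsql AS $$<br />
 BEGIN<br />
     INSERT INTO audit_log (note)<br />
     SELECT 'new book ' || b.book_id FROM new_books AS b;<br />
     RETURN NULL;<br />
 END $$;<br />
 <br />
 CREATE TRIGGER books_audit<br />
     AFTER INSERT ON books<br />
     REFERENCING NEW TABLE AS new_books<br />
     FOR EACH STATEMENT EXECUTE PROCEDURE log_inserts();<br />
<br />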
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
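A minimal sketch (the table and document structure are hypothetical):<br />
<br />
 SELECT x.id, x.title<br />
 FROM catalog,<br />
      XMLTABLE('/books/book' PASSING catalog.doc<br />
               COLUMNS id int PATH '@id',<br />
                       title text PATH 'title') AS x;<br />
<br />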
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
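Passwords can now be stored and verified using SCRAM-SHA-256 instead of MD5. A sketch of enabling it (existing passwords must be re-set after changing the setting):<br />
<br />
 # postgresql.conf<br />
 password_encryption = scram-sha-256<br />
 <br />
 # pg_hba.conf<br />
 host    all    all    0.0.0.0/0    scram-sha-256<br />
<br />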
=== New "monitoring" roles for permission grants ===<br />
<br />
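New default roles such as pg_monitor, pg_read_all_settings, pg_read_all_stats, and pg_stat_scan_tables allow granting monitoring access without superuser. For example (role name hypothetical):<br />
<br />
 GRANT pg_monitor TO metrics_agent;<br />
<br />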
=== Restrictive Policies for Row Level Security ===<br />
<br />
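Row-level security policies can now be declared AS RESTRICTIVE, so they are ANDed with existing permissive policies instead of ORed. A sketch with a hypothetical table:<br />
<br />
 CREATE POLICY own_rows_only ON accounts<br />
     AS RESTRICTIVE<br />
     USING (owner = current_user);<br />
<br />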
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values across table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which protects it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature represents an advance in the state of the art among SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
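A minimal sketch (table and column names hypothetical):<br />
<br />
 CREATE STATISTICS addr_stats (dependencies, ndistinct)<br />
     ON city, zipcode FROM addresses;<br />
 ANALYZE addresses;<br />
<br />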
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular, more than 67 I/O-related events and more than 31 latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window into which parts of the system are causing query delays, and provide very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
A compile-time configuration option allows PostgreSQL to use the ICU library for collations instead of relying on the OS-supplied internationalization library (which was prone to unexpected behavior).<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
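When the server is built with --with-icu, ICU-provided collations (such as the predefined "de-x-icu") become available, and custom ones can be created. A sketch (table name hypothetical):<br />
<br />
 CREATE COLLATION german_phonebook (provider = icu, locale = 'de-u-co-phonebk');<br />
 <br />
 SELECT name FROM contacts ORDER BY name COLLATE german_phonebook;<br />
<br />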
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
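A minimal sketch (index name hypothetical):<br />
<br />
 CREATE EXTENSION amcheck;<br />
 <br />
 -- Raises an error if the B-Tree index is corrupt, returns nothing otherwise:<br />
 SELECT bt_index_check('accounts_pkey'::regclass);<br />
<br />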
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend) or the status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. Both return a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
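For example, a script can check the running version from SQL:<br />
<br />
 SHOW server_version_num;<br />
 <br />
 SELECT current_setting('server_version_num')::int >= 100000 AS is_v10_or_later;<br />
<br />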
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as queries that monitor replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
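Putting the renames together, a typical replication-lag query changes like this:<br />
<br />
 -- 9.6<br />
 SELECT pg_xlog_location_diff(pg_current_xlog_location(), flush_location) FROM pg_stat_replication;<br />
 <br />
 -- 10<br />
 SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn) FROM pg_stat_replication;<br />
<br />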
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
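The headline changes (see the release notes for the full list): wal_level now defaults to replica, max_wal_senders and max_replication_slots default to 10, and pg_basebackup streams the required WAL by default, so a fresh installation can support replication and backups without editing the configuration:<br />
<br />
 # PostgreSQL 10 defaults (the 9.6 defaults were minimal / 0 / 0)<br />
 wal_level = replica<br />
 max_wal_senders = 10<br />
 max_replication_slots = 10<br />
<br />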
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point timestamps are a compile-time option that has been problematic with replication for some time. Only a small percentage of users are thought to use them, partly because few distributors enable the option. Users who do use this option will require a dump/restore to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need either to manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or to compile tsearch2 themselves from source and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier are not supported by version 10's pg_dump or pg_dumpall. If you need to convert a database that old, use pg_dump from version 9.6 or earlier and upgrade in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=Table_partitioning&diff=30936Table partitioning2017-09-25T22:52:44Z<p>Jer: /* Limitations (of declarative partitioning in PostgreSQL 10) */ better wording</p>
<hr />
<div>= Background =<br />
<br />
== Status Quo ==<br />
Starting in PostgreSQL 10, we have declarative partitioning. With it, there is dedicated syntax to create range- and list-partitioned tables and their partitions. Although significant limitations remain (such as the inability to create indexes or row-level triggers on the partitioned parent table), a lot of manual steps are now rendered unnecessary.<br />
<br />
It is still possible to use the older methods of partitioning if you need to implement custom partitioning criteria (other than the range and list methods that declarative partitioning natively supports), or if the limitations of declarative partitioned tables are a hindrance. See [http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html PostgreSQL Partitioning] for details. There are some 3rd-party plugins that simplify the manual tasks, triggers, etc.; see the bottom of this page. Although declarative partitioning in PostgreSQL 10 reduces a lot of manual steps, such 3rd-party plugins still offer features that the core system does not provide.<br />
<br />
See the various blogs out there describing both the new declarative partitioning and the older inheritance-based implementation.<br />
<br />
=== Resolved Issues ===<br />
* SELECT, UPDATE, DELETE (in 8.2) : They can be handled with constraint_exclusion.<br />
* TRUNCATE (in 8.4) : TRUNCATE for a parent table is expanded into child tables.<br />
* ANALYZE (in 9.0) : {{MessageLink|20091229201145.CF641753FB7@cvs.postgresql.org|ANALYZE to compute such stats for tables that have subclasses}}<br />
* MAX()/MIN() (in 9.1) : Smarter partition detection.<br />
* NO INHERIT constraints (in 9.2) make it possible to define a constraint only on the parent such that it will always be excluded; declarative partitioning (in upcoming 10) always excludes the parent without any additional configuration<br />
* With declarative partitioning (in upcoming 10), tuples inserted into the parent partitioned table are automatically routed to the leaf partitions<br />
<br />
=== Limitations (of declarative partitioning in PostgreSQL 10) ===<br />
''Note: Some work on the following features has already been completed and committed for PostgreSQL 11! And work will continue in this area.''<br />
<br />
* No support for hash partitioning<br />
* No support for UPDATEs that cause rows to move from one partition to another<br />
* No support for routing tuples to partitions that are foreign tables<br />
* No support for index constraints, such as UNIQUE, across the entire partition tree; indexes need to be defined on the individual leaf partitions (unique indexes span only the individual partitions)<br />
* No support for referencing partitioned parent tables in foreign key relationships, nor is there support for referencing regular tables from partitioned parent tables<br />
* No support for defining row triggers on the partitioned parent tables<br />
* No support for "catch-all" / "fallback" / "default" partition<br />
* No support for "splitting" or "merging" partitions using dedicated commands<br />
* No support for automatic creation of partitions (e.g. for values not covered)<br />
* No support for executor-stage partition pruning or faster child table pruning or parallel partition processing<br />
<br />
== Overviews of Project Goals ==<br />
* [[:Image:Partitioning Requirements.pdf | Partitioning Requirements document from Simon Riggs (2008)]]<br />
* [[PgCon 2008 Developer Meeting#Partitioning_Roadmap|PGCon 2008 Developer meeting roadmap]]<br />
<br />
=== List discussions ===<br />
<br />
* [http://www.postgresql.org/message-id/1115677858.3830.131.camel@localhost.localdomain <nowiki>(2005-05) Table Partitioning, Part 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00375.php <nowiki>(2007-03) Auto creation of Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00151.php <nowiki>(2007-04) Re: Auto Partitioning Patch - WIP version 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00028.php <nowiki>(2008-01) Dynamic Partitioning using Segment Visibility Maps</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00248.php <nowiki>(2008-01) Named vs Unnamed Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00387.php <nowiki>(2008-01) Storage Model for Partitioning</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00413.php <nowiki>(2008-01) Declarative partitioning grammar</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01097.php <nowiki>(2008-10) Auto-Partitioning patch discussion</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg00897.php <nowiki>(2009-03) Partitioning feature</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-05/msg00005.php <nowiki>(2009-05) Transparent table partitioning in future version of PG?</nowiki>]<br />
* [http://archives.postgresql.org/message-id/1247564358.11347.1308.camel@ebony.2ndQuadrant <nowiki>(2009-07) Comments on automatic DML routing and explicit partitioning subcommands</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php <nowiki>(2009-10) Patch for automated partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/20091112195450.A967.52131E4D@oss.ntt.co.jp <nowiki>(2009-11) Syntax for partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/4AFADD6A.9070002@asterdata.com <nowiki>(2009-11) Partitioning support for COPY</nowiki>]<br />
* [http://www.postgresql.org/message-id/20100114181323.9A33.52131E4D@oss.ntt.co.jp <nowiki>(2010-01) Partitioning syntax</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-07/msg01519.php <nowiki>(2010-07) Scalability of the planner with non trivial number of partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-07/msg01449.php <nowiki>(2011-07) New partitioning WAS: Check constraints on partition parents only?</nowiki>]<br />
* [http://www.postgresql.org/message-id/20140829155607.GF7705@eldon.alvh.no-ip.org <nowiki>(2014-08) On partitioning</nowiki>]<br />
* [http://www.postgresql.org/message-id/54EC32B6.9070605@lab.ntt.co.jp <nowiki>(2015-02) Partitioning WIP patch</nowiki>]<br />
* [http://www.postgresql.org/message-id/55D3093C.5010800@lab.ntt.co.jp <nowiki>(2015-08) Declarative partitioning</nowiki>]<br />
* [https://www.postgresql.org/message-id/ad16e2f5-fc7c-cc2d-333a-88d4aa446f96@lab.ntt.co.jp <nowiki>(2016-08) Declarative partitioning - another take</nowiki>]<br />
<br />
== Possible Directions ==<br />
<br />
=== Oracle-Style ===<br />
Allow users to declare their intention with partitioned tables, i.e., declare what the partition key is and what range or values are covered by each partition.<br />
<br />
I think this would mean two new types of relation: a "meta-table" that acts like a view in that it doesn't have an attached filenode (it would also have some metadata about the partition key, but no view definition, and would act like parent tables in nested table structures do now), and a "partition", which would be a separate namespace from tables and would carry information about which values of the partition key it covers.<br />
<br />
Pros:<br />
<br />
* Makes it more reasonable to handle inserts automatically since the structure is explicit and doesn't require making logical deductions. <br />
* More idiot-proof, i.e., you can't set up nonsensical combinations of constraints.<br />
* Consistent with other databases and DBA expectations.<br />
<br />
Cons:<br />
<br />
* Less flexible, you can't set up arbitrary non-traditional structures such as having some data in the parent table or having extra columns in some children.<br />
<br />
Background:<br />
* [http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_7002.htm Oracle CREATE TABLE syntax]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10736/parpart.htm Partitioning in Oracle 10g]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10739/partiti.htm#i1006820 Partition management in Oracle 10g]<br />
* [http://www.oracle.com/technetwork/articles/sql/11g-partitioning-084209.html Partition management in Oracle 11g including interval partitions]<br />
* [http://dev.mysql.com/doc/refman/5.1/en/partitioning.html MySQL partitioning]<br />
<br />
<br />
=== DB2-Style ===<br />
<br />
DB2 uses modifier clauses in the CREATE TABLE statement for partitioning. It includes a native form of sharding in the same implementation.<br />
{|<br />
! Clause in the CREATE TABLE statement !! DB2 feature name<br />
|-<br />
| DISTRIBUTE BY HASH || DPF - Database Partitioning Feature<br />
|-<br />
| ORGANIZE BY DIMENSION || MDC - Multidimensional Clustering<br />
|-<br />
| PARTITION BY RANGE || TP - Table partitioning<br />
|}<br />
<br />
The clauses can be combined in any way to achieve the desired effect.<br />
(cf. https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/)<br />
<br />
- DPF splits the database into "database partitions" (we would call them shards). "Each database partition has its own set of computing resources, including CPU and storage. In a DPF environment, each table row is distributed to a database partition according to the distribution key specified in the CREATE TABLE statement. When a query is processed, the request is divided so each database partition processes the rows that it is responsible for."<br />
<br />
- MDC enables rows with similar values across multiple dimensions to be physically clustered together on disk. <br />
This clustering allows for efficient I/O for typical analytical queries. For example, all rows where Product='car', Region='East', and SaleMonthYear='Jan09' can be stored in the same storage location, known as a block.<br />
<br />
- TP is what we know as "range partitioning" or "list partitioning", and is implemented in a very similar way to what Postgres currently has: "the user can manually define each data partition, including the range of values to include in that data partition" (and MDC automatically allocates storage for it). "Each TP partition is a separate database object (unlike other tables, which are a single database object). Consequently, TP supports attaching and detaching a data partition from the TP table. A detached partition becomes a regular table. As well, each data partition can be placed in its own table space, if desired."<br />
<br />
The key point seems to be that all three features are orthogonal to one another, and can be added at table creation time as well as later on. Moreover, sharding is made a first-class citizen and directly supported by the DB. ISTM that we could leverage an evolved version of postgres_fdw (plus some code borrowed from pg_shard and/or PL/Proxy) to this effect.<br />
<br />
<br />
MQTs (materialized query tables), which are what we call materialized views, are also directly subject to partitioning (and apparently also to sharding).<br />
<br />
Syntax Examples:<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
STARTING '1/1/2006' ENDING '12/31/2006' <br />
EVERY 3 MONTHS<br />
)<br />
<br />
Auto-partitioning by interval is nice to have ...<br />
<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
PARTITION q4_05 STARTING MINVALUE,<br />
PARTITION q1_06 STARTING '1/1/2006',<br />
PARTITION q2_06 STARTING '4/1/2006',<br />
PARTITION q3_06 STARTING '7/1/2006',<br />
PARTITION q4_06 STARTING '10/1/2006' <br />
ENDING ‘12/31/2006'<br />
)<br />
<br />
This is equivalent to "VALUES LESS THAN" (technically, VALUES GREATER THAN) and includes an upper limit.<br />
<br />
The partition manipulation syntax (here, addition) is nice, too:<br />
ALTER TABLE orders<br />
ATTACH PARTITION q1_07<br />
STARTING '01/01/2007'<br />
ENDING '03/31/2007'<br />
FROM TABLE neworders<br />
<br />
<br />
References:<br />
* https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/<br />
* http://www.ibm.com/developerworks/data/library/techarticle/dm-0605ahuja2/<br />
<br />
=== MySQL-style ===<br />
<br />
Fairly basic, supports RANGE, LIST and HASH<br />
<br />
CREATE TABLE ti (id INT, amount DECIMAL(7,2), tr_date DATE)<br />
ENGINE=INNODB<br />
PARTITION BY HASH( MONTH(tr_date) )<br />
PARTITIONS 6;<br />
<br />
References:<br />
* http://dev.mysql.com/doc/refman/5.6/en/partitioning-overview.html<br />
<br />
=== Trigger-based ===<br />
First attempts to support auto-partitioning have been made using triggers.<br />
* avoid specific languages such as PL/pgSQL that require 'CREATE LANGUAGE'<br />
* the performance of a C trigger is 4 to 5 times that of PL/pgSQL<br />
* insert/copy returns 0 rows when all rows have been routed by the trigger from the master to child tables (see the sketch below)<br />
* chaining triggers allows tunable behavior for rows that match no partition: add an error trigger, move them to an overflow table, or create new partitions dynamically<br />
* constraint_exclusion does not work well with prepared statements. It might be possible to convert CHECKs to One-Time Filter plan nodes if the condition is a variable.<br />
<br />
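For reference, a minimal PL/pgSQL routing trigger of the kind discussed above (table and column names hypothetical); returning NULL suppresses the insert into the parent, which is why the command reports 0 rows:<br />
<br />
 CREATE FUNCTION route_insert() RETURNS trigger LANGUAGE plpgsql AS $$<br />
 BEGIN<br />
     IF NEW.logdate >= DATE '2017-01-01' AND NEW.logdate < DATE '2018-01-01' THEN<br />
         INSERT INTO measurement_2017 VALUES (NEW.*);<br />
     ELSE<br />
         RAISE EXCEPTION 'no partition for date %', NEW.logdate;<br />
     END IF;<br />
     RETURN NULL;  -- suppress the insert into the parent table<br />
 END $$;<br />
 <br />
 CREATE TRIGGER measurement_route BEFORE INSERT ON measurement<br />
     FOR EACH ROW EXECUTE PROCEDURE route_insert();<br />
<br />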
= Active Work In Progress =<br />
<br />
<br />
== Syntax ==<br />
Syntax is proposed at "[https://commitfest-old.postgresql.org/action/patch_view?id=207 Syntax for partitioning]", [https://commitfest-old.postgresql.org/action/patch_view?id=266 second version]. The syntax resembles [[Oracle]] and [[MySQL]]. See also [[Todo#Administration]] (Simplify ability to create partitioned tables).<br />
<br />
-- create partitioned table and child partitions at once.<br />
CREATE TABLE parent (...)<br />
PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ]<br />
[ (<br />
PARTITION child<br />
{<br />
VALUES LESS THAN { ... | MAXVALUE } -- for RANGE<br />
| VALUES [ IN ] ( { ... | DEFAULT } ) -- for LIST<br />
}<br />
[ WITH ( ... ) ] [ TABLESPACE tbs ]<br />
[, ...]<br />
) ] ;<br />
<br />
-- add a partition key to a table.<br />
ALTER TABLE parent PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ] [ (...) ] ;<br />
<br />
-- create a new partition on a partitioned table.<br />
CREATE PARTITION child ON parent VALUES ... ;<br />
<br />
-- add a table as a partition.<br />
ALTER TABLE parent ATTACH PARTITION child VALUES ... ;<br />
<br />
-- Remove a partition as a normal table.<br />
ALTER TABLE parent DETACH PARTITION child ;<br />
<br />
== Internal representation ==<br />
The on-disk structure is included in the "Syntax for partitioning" patch.<br />
The in-memory structure will be proposed in a future patch.<br />
<br />
=== On-disk structure ===<br />
A new system table "pg_partition" is added.<br />
Partition keys are stored in it.<br />
<br />
CREATE TABLE pg_catalog.pg_partition<br />
(<br />
partrelid oid NOT NULL, -- partitioned table oid<br />
partopclass oid NOT NULL, -- operator class to compare keys<br />
partkind "char" NOT NULL, -- kind of partition: RANGE or LIST<br />
partkey text, -- partition key expression<br />
<br />
PRIMARY KEY (partrelid),<br />
FOREIGN KEY (partrelid) REFERENCES pg_class (oid),<br />
FOREIGN KEY (partopclass) REFERENCES pg_opclass (oid)<br />
)<br />
WITHOUT OIDS ;<br />
<br />
A new column "inhvalues" is added to pg_inherits.<br />
Partition values for each partition are stored in it.<br />
<br />
 ALTER TABLE pg_catalog.pg_inherits ADD COLUMN inhvalues anyarray ;<br />
<br />
* A RANGE partition has the upper bound of its range in inhvalues.<br />
* A LIST partition has an array with multiple elements in inhvalues.<br />
* An overflow partition has an empty array in inhvalues.<br />
* A normal inherited table has NULL in inhvalues. (These cases are illustrated below.)<br />
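<br />
To illustrate, a hypothetical catalog query assuming the proposed column exists (the table name "parent" is a placeholder):<br />
<br />
 -- hypothetical query against the proposed catalog structure<br />
 SELECT inhrelid::regclass AS child, inhvalues<br />
 FROM pg_inherits<br />
 WHERE inhparent = 'parent'::regclass;<br />
 -- a RANGE partition bounded above by 100 would show inhvalues = {100}<br />
 -- a LIST partition for keys 1, 3, 5 would show inhvalues = {1,3,5}<br />
 -- an overflow partition would show inhvalues = {}<br />
 -- a plain inherited table would show inhvalues = NULL<br />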
<br />
=== In-memory structure ===<br />
A cached list of partitions, sorted by partition values, is stored in the relcache for the parent table. Changes to the partitions need to invalidate the parent's cache entries to keep the cache accurate.<br />
<br />
== Operations ==<br />
=== INSERT ===<br />
INSERT triggers will be replaced with a specialized tuple-routing feature based on the in-memory structure. Tuples will be routed in O(log N) time. This also solves the "0 rows affected" problem seen with INSERT triggers.<br />
<br />
=== SELECT, UPDATE, DELETE ===<br />
CHECK constraints continue to be used for a while.<br />
<br />
This could be improved using the in-memory structure: instead of evaluating CHECK constraints for each child table, we can search the sorted list in the parent table, so constraint exclusion runs in O(log N) instead of today's O(N).<br />
<br />
=== VACUUM, CLUSTER, REINDEX ===<br />
We do not expand those commands for now, but they might have to be expanded in the same way as TRUNCATE.<br />
<br />
= Future improvements =<br />
These items are hard to address in 9.0, but should continue to be improved in future releases.<br />
<br />
=== Syntax ===<br />
* Support SPLIT and MERGE for existing partitions. See also [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php Kedar's patch]<br />
* Support UPDATE of partition keys and values.<br />
* Support adding a partition between existing partitions. It requires SPLIT feature.<br />
* Support sub-partitions.<br />
* Support some partition kinds for GIS types. For example, "PARTITION BY GIST" would hold partition keys as a GiST tree in the in-memory structure.<br />
* Support HASH partitions. Each partition could be a FOREIGN TABLE in [[SQL/MED]]. In other words, it is [[PL/Proxy]] integration.<br />
* Support CREATE TABLE AS -- CREATE TABLE tbl PARTITION BY ... AS SELECT ...;<br />
<br />
=== Executor ===<br />
* SELECT FOR SHARE/UPDATE on parent tables.<br />
* Prepared statements that use partition keys as placeholders.<br />
** An idea is to convert check constraints into One-Time Filter nodes [http://archives.postgresql.org/message-id/20081013172100.87A1.52131E4D@oss.ntt.co.jp]<br />
* Unique constraints over multiple partitions, when each partition has a unique index on a set or superset of the partition keys.<br />
* Unique constraints over multiple partitions in the general case (typically called a "global index").<br />
<br />
=== Planner ===<br />
* Optimization for min/max, LIMIT + ORDER BY, GROUP BY on partition keys.<br />
* Optimization when constraint exclusion is used with stable or volatile functions. A very common case is a timestamp partition key compared with now().<br />
* Join optimization for two partitioned tables.<br />
<br />
= Third-Party Tools =<br />
<br />
=== PG Partition Manager ===<br />
* [https://github.com/keithf4/pg_partman Project Home Page]<br />
* This is an extension that automates time- and serial-based partitioning (it essentially does interval partitioning, setting up the right triggers for you). <br />
* Handles initial setup, partitioning existing data, dropping unneeded child tables, and undoing partitioning.<br />
<br />
[[Category:Table partitioning]]</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30935New in postgres 102017-09-25T22:49:14Z<p>Jer: /* Additional FDW Push-Down */ add a description</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Bruce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning is now an attribute of the table:<br />
<br />
 CREATE TABLE table_name ( ... )<br />
 [ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) } [, ...] ) ]<br />
 <br />
 CREATE TABLE table_name<br />
 PARTITION OF parent_table [ ( ... ) ]<br />
 FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
 CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2018-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if we want to search financial transaction history by an indexed column, we can now execute the query in roughly one-quarter of the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(This assumes an index on (bid, delta).)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
In postgres_fdw, push joins and aggregate functions to the remote server in more cases. This reduces the amount of data that must be passed from the remote server, and offloads aggregate computation from the requesting server.<br />
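<br />
A rough sketch of the kind of query that can now be pushed down (server, table, and column names here are placeholders, not taken from the release notes):<br />
<br />
 CREATE EXTENSION postgres_fdw;<br />
 CREATE SERVER remote FOREIGN DATA WRAPPER postgres_fdw<br />
     OPTIONS (host 'remote-host', dbname 'sales');<br />
 CREATE USER MAPPING FOR CURRENT_USER SERVER remote<br />
     OPTIONS (user 'app', password 'secret');<br />
 CREATE FOREIGN TABLE orders (region text, amount numeric)<br />
     SERVER remote OPTIONS (table_name 'orders');<br />
 <br />
 -- In 10, the grouping and the SUM can be evaluated on the remote server;<br />
 -- EXPLAIN VERBOSE should show a single Foreign Scan performing the aggregate.<br />
 EXPLAIN VERBOSE<br />
 SELECT region, sum(amount) FROM orders GROUP BY region;<br />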
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure mechanism that is perfect for high-availability and disaster-recovery needs. Because it operates on the whole instance, however, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication allows us to tackle those use cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that look identical to the tables I'm replicating and have the same names. They can have additional columns and a few other differences. In particular, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that the origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
Version 9.6 introduced quorum-based synchronous replication, along with<br />
<br />
 synchronous_commit = 'remote_apply'<br />
<br />
Version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
 synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
 synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST matches the previous behaviour: node priority follows the list order when forming the quorum. ANY means that any nodes in the list may provide the required quorum. This gives extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
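<br />
For example (hostnames are placeholders), a libpq connection string in 10 can list several hosts and insist on a server that accepts writes:<br />
<br />
 psql 'postgresql://node1:5432,node2:5432/libdata?target_session_attrs=read-write'<br />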
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of two-phase commits.<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
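<br />
pg_receivewal can now gzip-compress WAL segments as it streams them. A quick illustration (host, user, and directory are placeholders):<br />
<br />
 pg_receivewal -h primary.example.com -U replicator -D /var/lib/walarchive --compress=9<br />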
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
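<br />
A minimal example of the new SQL-standard syntax (table and column names are illustrative):<br />
<br />
 CREATE TABLE vendors (<br />
     vendor_id integer GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,<br />
     vendor_name text NOT NULL<br />
 );<br />
 -- GENERATED ALWAYS AS IDENTITY additionally rejects user-supplied values<br />
 -- unless OVERRIDING SYSTEM VALUE is specified.<br />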
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
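<br />
A minimal sketch (the loans table from the earlier example; function and transition-table names are illustrative):<br />
<br />
 CREATE TABLE audit_log (logged_at timestamptz, loan_id int);<br />
 <br />
 CREATE FUNCTION log_new_loans() RETURNS trigger AS $$<br />
 BEGIN<br />
     -- "newrows" exposes every row inserted by the statement<br />
     INSERT INTO audit_log SELECT now(), n.id FROM newrows n;<br />
     RETURN NULL;<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />
 <br />
 CREATE TRIGGER loans_audit<br />
     AFTER INSERT ON loans<br />
     REFERENCING NEW TABLE AS newrows<br />
     FOR EACH STATEMENT EXECUTE PROCEDURE log_new_loans();<br />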
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
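<br />
A small example of the syntax (document structure and names are illustrative; books.doc is assumed to be an xml column):<br />
<br />
 SELECT x.id, x.title<br />
 FROM books,<br />
      XMLTABLE('/catalog/book' PASSING books.doc<br />
               COLUMNS id int PATH '@id',<br />
                       title text PATH 'title') AS x;<br />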
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
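<br />
To use it, set the password-hashing method and require SCRAM in pg_hba.conf (the address range is a placeholder); note that existing MD5-hashed passwords must be re-set before they can be verified with SCRAM:<br />
<br />
 # postgresql.conf<br />
 password_encryption = scram-sha-256<br />
 <br />
 # pg_hba.conf<br />
 host    all    all    10.0.0.0/8    scram-sha-256<br />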
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
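<br />
A brief sketch (table and policy names are illustrative): restrictive policies are ANDed with existing permissive ones rather than ORed:<br />
<br />
 ALTER TABLE loans ENABLE ROW LEVEL SECURITY;<br />
 <br />
 -- permissive policies are ORed together ...<br />
 CREATE POLICY staff_all ON loans FOR SELECT<br />
     USING (branch_id = current_setting('app.branch')::int);<br />
 <br />
 -- ... while a RESTRICTIVE policy is ANDed on top<br />
 CREATE POLICY not_archived ON loans AS RESTRICTIVE<br />
     USING (NOT archived);<br />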
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values across table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to teach the planner about these correlations, proofing it against such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature represents an advance in the state of the art among SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
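<br />
For example (table and column names are illustrative), you can ask the planner to gather functional-dependency statistics on a correlated column pair:<br />
<br />
 CREATE STATISTICS zip_city_stats (dependencies) ON city, zip FROM zipcodes;<br />
 ANALYZE zipcodes;  -- populates the new statistics object<br />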
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation to 184 wait events; in particular, more than 67 I/O-related and more than 31 latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window into which parts of the system are causing query delays, along with very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend) or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] to get the server version. These return a six-digit integer version number that sorts and compares consistently across versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
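<br />
For example, from SQL:<br />
<br />
 -- returns 100001 on PostgreSQL 10.1, and 90603 on 9.6.3<br />
 SHOW server_version_num;<br />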
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as replication monitoring.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point timestamps are a compile-time option that has long been problematic with replication. Only a small percentage of users are thought to use them, partly because few distributors enable the option. The users who do rely on this option will need a dump/restore to upgrade to PostgreSQL 10; with large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of our built-in full-text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jer
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around monitoring and backup automation. As usual, PostgreSQL users should carefully test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point Timestamps are a compile-time option that have been problematic with replication for some time. It is thought that a small percentage of users are using them, partly due to the fact that few distributors enable the option. For the small number of users who are using this option a dump/restore will be required to upgrade to PostgreSQL 10. With large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30933New in postgres 102017-09-25T22:39:32Z<p>Jer: /* Drop Support for Floating Point Timestamps */ better wording</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that look identical to the tables I'm replicating and have the same names. They can have additional columns and a few other differences. In particular, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that the origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
Blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
Version 9.6 introduced quorum-based synchronous replication, along with the setting<br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
Version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
 synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
 synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST matches the previous behaviour: standby priority follows the list order, and the first N standbys listed must confirm. ANY means that confirmation from any N standbys in the list satisfies the quorum. This gives extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
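<br />
For illustration, libpq clients (including psql) can now list several hosts and ask for a server that accepts writes; a minimal sketch with hypothetical host names:<br />
<br />
 psql 'postgresql://node1.example.com,node2.example.com/libdata?target_session_attrs=read-write'<br />
<br />
If node1 is unreachable or read-only, libpq tries node2 before giving up.<br />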
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
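<br />
The feature centers on the new txid_status() function, which lets an application determine after a crash or lost connection whether a transaction actually committed. A minimal sketch (the transaction ID shown is hypothetical):<br />
<br />
 libdata=# SELECT txid_current();      -- record this before attempting the commit<br />
 libdata=# SELECT txid_status(12345);  -- later: 'committed', 'aborted' or 'in progress'<br />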
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of two-phase commits.<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
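<br />
For example, streamed WAL segments can now be zlib-compressed as they are written (the directory path is illustrative):<br />
<br />
 pg_receivewal -D /archive/wal --compress=5<br />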
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
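<br />
A minimal sketch of the new SQL-standard syntax (table and column names are illustrative):<br />
<br />
 libdata=# CREATE TABLE patrons (<br />
 id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,<br />
 name text NOT NULL );<br />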
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
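<br />
In version 10, hash indexes are WAL-logged, making them crash safe and usable on streaming replicas. A sketch, assuming a hypothetical sessions table queried by equality on token:<br />
<br />
 libdata=# CREATE INDEX sessions_token_idx ON sessions USING hash (token);<br />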
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
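<br />
A minimal sketch (the loans table and an audit_log(logged_at, rows_changed) table are assumed for illustration): an AFTER STATEMENT trigger that records how many rows each UPDATE touched, reading them from a transition table:<br />
<br />
 CREATE FUNCTION count_updates() RETURNS trigger<br />
 LANGUAGE plpgsql AS $$<br />
 BEGIN<br />
   -- new_rows is the transition table declared in the trigger below<br />
   INSERT INTO audit_log SELECT now(), count(*) FROM new_rows;<br />
   RETURN NULL;<br />
 END $$;<br />
 <br />
 CREATE TRIGGER loans_counted<br />
   AFTER UPDATE ON loans<br />
   REFERENCING NEW TABLE AS new_rows<br />
   FOR EACH STATEMENT EXECUTE PROCEDURE count_updates();<br />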
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
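<br />
A minimal sketch of the syntax, using an inline document:<br />
<br />
 libdata=# SELECT x.title, x.pages<br />
 FROM XMLTABLE('/shelf/book'<br />
 PASSING '<shelf><book><title>Moby Dick</title><pages>635</pages></book></shelf>'::xml<br />
 COLUMNS title text PATH 'title',<br />
 pages int PATH 'pages') AS x;<br />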
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
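<br />
Version 10 adds SCRAM-SHA-256 password authentication, which is far more resistant to password sniffing and hash theft than the old md5 method. A sketch of the relevant settings:<br />
<br />
 # postgresql.conf<br />
 password_encryption = scram-sha-256<br />
 <br />
 # pg_hba.conf<br />
 host    all    all    0.0.0.0/0    scram-sha-256<br />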
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
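<br />
Row-level security policies can now be declared AS RESTRICTIVE, meaning they are ANDed with the existing permissive policies rather than ORed. A sketch with illustrative names:<br />
<br />
 libdata=# CREATE POLICY branch_only ON fines<br />
 AS RESTRICTIVE<br />
 USING (branch_id = current_setting('app.branch_id')::int);<br />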
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values across table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which protects it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature represents an advance in the state of the art among SQL databases.<br />
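<br />
For example, to teach the planner that city and postal code are functionally related (table and column names are illustrative):<br />
<br />
 libdata=# CREATE STATISTICS addr_city_zip (dependencies) ON city, zip FROM addresses;<br />
 libdata=# ANALYZE addresses;<br />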
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation to 184 wait events; in particular, more than 67 I/O-related events and more than 31 latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window into which parts of the system are causing query delays, and provide very accurate statistics on where we are losing performance.<br />
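<br />
For example, a quick snapshot of what active backends are currently waiting on:<br />
<br />
 libdata=# SELECT wait_event_type, wait_event, count(*)<br />
 FROM pg_stat_activity<br />
 WHERE wait_event IS NOT NULL<br />
 GROUP BY 1, 2 ORDER BY 3 DESC;<br />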
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, PostgreSQL 10 detects cases where the inner side of a join can only produce a single row for each outer-side row. During execution, this allows skipping ahead to the next outer row as soon as a match is found. It can also remove the requirement for mark and restore during merge joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
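<br />
When built with ICU, version 10 ships predefined ICU collations and lets you define new ones. A sketch (the table queried is illustrative):<br />
<br />
 libdata=# CREATE COLLATION german (provider = icu, locale = 'de-DE');<br />
 libdata=# SELECT name FROM patrons ORDER BY name COLLATE german;<br />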
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
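<br />
For example, verifying a B-Tree index (the index name is illustrative):<br />
<br />
 libdata=# CREATE EXTENSION amcheck;<br />
 libdata=# SELECT bt_index_check('fines_pkey'::regclass);<br />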
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend) or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] to get the server version. Either returns a six-digit integer version number which is consistently sortable and comparable between versions 9.6 and 10.<br />
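<br />
For example, from SQL:<br />
<br />
 libdata=# SHOW server_version_num;  -- e.g. 100001 for version 10.1<br />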
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as replication monitoring tools.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attributes renamed:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables, and some of their parameters, have been renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
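<br />
Per the version 10 release notes, several replication-related defaults changed so that replication and pg_basebackup work out of the box; a summary sketch:<br />
<br />
 wal_level = replica            # was 'minimal'<br />
 max_wal_senders = 10           # was 0<br />
 max_replication_slots = 10     # was 0<br />
<br />
pg_basebackup also now streams the needed WAL by default (equivalent to -X stream).<br />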
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point timestamps are a compile-time option that has been problematic with replication for some time. It is thought that only a small percentage of users use them, partly because few distributors enable the option. Those users will require a dump/restore to upgrade to PostgreSQL 10; with large datasets this may be time-consuming and will need to be planned carefully.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of the built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need either to modify their databases manually to use the built-in tsearch objects before upgrading to PostgreSQL 10, or to compile and install tsearch2 from source themselves.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by version 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30930New in postgres 102017-09-25T22:20:07Z<p>Jer: /* Additional Parallelism */ clarify title</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
Floating-point Timestamps are a compile-time option that have been problematic with replication for some time. It is thought that a small percentage of users are using them, partly due to the fact that few distributors enable the option. However for the small number of users who have large datasets, a dump/restore will be required.<br />
<br />
* [https://www.postgresql.org/message-id/flat/26788.1487455319%40sss.pgh.pa.us#26788.1487455319@sss.pgh.pa.us email discussion]<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 commit]<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30931New in postgres 102017-09-25T22:23:52Z<p>Jer: /* Drop Support for FE/BE 1.0 Protocol */ mention client version</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
Clients older than version 6.3 may be affected.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or compile tsearch2 from source themselves and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30930New in postgres 102017-09-25T22:20:07Z<p>Jer: /* Additional Parallelism */ clarify title</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Bruce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning is now an attribute of the table itself:<br />
<br />
 CREATE TABLE table_name ( ... )<br />
 [ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) } [, ...] ) ]<br />
<br />
 CREATE TABLE table_name<br />
 PARTITION OF parent_table [ ( ... ) ]<br />
 FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
 CONSTRAINT ck_2017 CHECK (fch_creado >= DATE '2017-01-01' AND fch_creado < DATE '2018-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
 FOR VALUES FROM (MINVALUE) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
 FOR VALUES FROM (10) TO (MAXVALUE);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
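As a quick check, the hidden tableoid column shows which partition each row landed in (a minimal sketch; the status value assumes a matching BOOK_STATUS label):<br />
<br />
 libdata=# INSERT INTO book_history<br />
             VALUES (101, 'checked_out', tstzrange('2016-09-03 10:00:00+00', NULL));<br />
 INSERT 0 1<br />
 libdata=# SELECT tableoid::regclass, book_id FROM book_history;<br />
 -- tableoid resolves to book_history_2016_09 for the row above<br />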
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism in Query Execution ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if I wanted to search financial transaction history by an indexed column, I can now execute the query in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
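That assumed index might be created like this (a sketch; the index name is illustrative):<br />
<br />
 CREATE INDEX idx_account_history_bid_delta ON account_history (bid, delta);<br />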
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure, and reliable mechanism for high availability and disaster recovery needs. Because it works on the whole instance, however, it is not possible to replicate only part of the primary server, nor to write on the secondary. Logical replication allows us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that look identical to the tables I'm replicating and have the same names. They can have additional columns and a few other differences. In particular, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that the origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
Version 9.6 introduced support for multiple synchronous standbys, along with the setting<br />
<br />
 synchronous_commit = 'remote_apply'<br />
<br />
Version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
 synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
 synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST matches the previous behaviour: standby priority follows the order of the list when forming the quorum. ANY means that any node in the list is able to provide the required quorum. This gives extra flexibility to complex replication setups.<br />
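To see which standbys currently count toward the quorum, you can check pg_stat_replication (a minimal sketch):<br />
<br />
 SELECT application_name, sync_priority, sync_state<br />
 FROM pg_stat_replication;<br />
 -- with ANY, candidate standbys report sync_state = 'quorum'<br />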
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
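For example, libpq can now try several hosts in order and, with the new target_session_attrs parameter, require a writable node (hostnames and database name are placeholders):<br />
<br />
 psql 'postgresql://node1,node2/libdata?target_session_attrs=read-write'<br />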
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
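This is provided by the new txid_status() function, which lets an application determine the outcome of a transaction after a crash or connection loss (a minimal sketch; the transaction ID is illustrative):<br />
<br />
 SELECT txid_current();        -- record the ID while the transaction is open<br />
 SELECT txid_status(1234567);  -- later: 'committed', 'aborted' or 'in progress'<br />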
<br />
=== Physical Replication ===<br />
<br />
* Improved performance of the replay of two-phase commits.<br />
* Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
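For example (a sketch; the target directory is a placeholder), streamed WAL segments can now be gzip-compressed:<br />
<br />
 pg_receivewal -D /archive/wal --compress=5<br />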
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
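A minimal example of the new syntax (table and column names are illustrative):<br />
<br />
 CREATE TABLE items (<br />
     id   int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,<br />
     name text NOT NULL<br />
 );<br />
 INSERT INTO items (name) VALUES ('widget');  -- id is assigned automatically<br />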
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
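A sketch of an AFTER STATEMENT trigger using a transition table (all object names here are illustrative):<br />
<br />
 CREATE FUNCTION audit_change() RETURNS trigger AS $$<br />
 BEGIN<br />
     INSERT INTO audit_log (table_name, rows_changed)<br />
     SELECT 'accounts', count(*) FROM new_rows;<br />
     RETURN NULL;<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />
<br />
 CREATE TRIGGER accounts_audit<br />
     AFTER UPDATE ON accounts<br />
     REFERENCING NEW TABLE AS new_rows<br />
     FOR EACH STATEMENT EXECUTE PROCEDURE audit_change();<br />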
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
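A minimal sketch (the xmldata table, its doc column, and the document structure are assumptions):<br />
<br />
 SELECT x.id, x.name<br />
 FROM xmldata,<br />
      XMLTABLE('/rows/row' PASSING doc<br />
               COLUMNS id   int  PATH '@id',<br />
                       name text PATH 'name') AS x;<br />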
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are; this can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which guards it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
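A minimal example (table and column names are illustrative):<br />
<br />
 CREATE STATISTICS zip_city_stats (dependencies)<br />
     ON zip, city FROM addresses;<br />
 ANALYZE addresses;  -- the extended statistics are gathered by ANALYZE<br />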
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation to 184 wait events; in particular, more than 67 I/O-related events and more than 31 latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window into which parts of the system are causing query delays, and give us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
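For example, collations can now be created from ICU locales (the collation name is illustrative):<br />
<br />
 CREATE COLLATION german_phonebook (provider = icu, locale = 'de-u-co-phonebk');<br />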
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
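A minimal sketch of checking a B-Tree index (the index name is a placeholder):<br />
<br />
 CREATE EXTENSION amcheck;<br />
 SELECT bt_index_check('my_index'::regclass);<br />
 -- completes silently if the index is consistent, raises an error otherwise<br />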
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend) or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] to get the server version. Both return a six-digit integer version number which is consistently sortable and comparable between versions 9.6 and 10.<br />
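For example, on the backend:<br />
<br />
 SELECT current_setting('server_version_num');  -- e.g. 100001 for version 10.1<br />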
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to update custom backup and transaction log management scripts, as well as replication monitoring tools.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or compile tsearch2 from source themselves and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=Table_partitioning&diff=30929Table partitioning2017-09-25T22:18:16Z<p>Jer: /* Limitations (of declarative partitioning in PostgreSQL 10) */ better wording</p>
<hr />
<div>= Background =<br />
<br />
== Status Quo ==<br />
Starting in PostgreSQL 10, we have declarative partitioning. With it, there is dedicated syntax to create range and list ''partitioned'' tables and their partitions. Although significant limitations still exist in the usage of partitioned tables, such as the inability to create indexes, row-level triggers, etc. on the partitioned parent table, a lot of manual steps are now rendered unnecessary.<br />
<br />
It is still possible to use the older methods of partitioning if you need to implement custom partitioning criteria (other than the range and list methods that declarative partitioning natively supports), or if the limitations of declarative partitioned tables are seen as a hindrance. See [http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html PostgreSQL Partitioning] for details. There are some 3rd-party plugins that simplify the (manual) tasks, triggers, etc.; see the bottom of this page. Although declarative partitioning in PostgreSQL 10 reduces a lot of manual steps, such plugins still offer features that the core system does not provide.<br />
<br />
See the various blogs out there describing both the new declarative partitioning and the older inheritance-based implementation.<br />
<br />
=== Resolved Issues ===<br />
* SELECT, UPDATE, DELETE (in 8.2) : They can be handled with constraint_exclusion.<br />
* TRUNCATE (in 8.4) : TRUNCATE for a parent table is expanded into child tables.<br />
* ANALYZE (in 9.0) : {{MessageLink|20091229201145.CF641753FB7@cvs.postgresql.org|ANALYZE to compute such stats for tables that have subclasses}}<br />
* MAX()/MIN() (in 9.1) : Smarter partition detection.<br />
* NO INHERIT constraints (in 9.2) make it possible to define a constraint only on the parent such that it will always be excluded; declarative partitioning (in upcoming 10) always excludes the parent without any additional configuration<br />
* With declarative partitioning (in upcoming 10), tuples inserted into the parent partitioned table are automatically routed to the leaf partitions<br />
<br />
=== Limitations (of declarative partitioning in PostgreSQL 10) ===<br />
''Note: Some work on the following features has already been completed and committed for PostgreSQL 11! And work will continue in this area.''<br />
<br />
* No support for hash partitioning<br />
* No support for UPDATEs that cause rows to move from one partition to another<br />
* No support for routing tuples to partitions that are foreign tables<br />
* No support for index constraints, such as UNIQUE, across the entire partition tree; indexes need to be defined on the individual leaf partitions (unique indexes span only the individual partitions)<br />
* No support for referencing partitioned parent tables in foreign key relationships, nor is there support for referencing regular tables from partitioned parent tables<br />
* No support for defining row triggers on the partitioned parent tables<br />
* No support for "catch-all" / "fallback" / "default" partition<br />
* No support for "splitting" or "merging" partitions using dedicated commands<br />
* No support for automatic creation of partitions<br />
<br />
== Overviews of Project Goals ==<br />
* [[:Image:Partitioning Requirements.pdf | Partitioning Requirements document from Simon Riggs (2008)]]<br />
* [[PgCon 2008 Developer Meeting#Partitioning_Roadmap|PGCon 2008 Developer meeting roadmap]]<br />
<br />
=== List discussions ===<br />
<br />
* [http://www.postgresql.org/message-id/1115677858.3830.131.camel@localhost.localdomain <nowiki>(2005-05) Table Partitioning, Part 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00375.php <nowiki>(2007-03) Auto creation of Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00151.php <nowiki>(2007-04) Re: Auto Partitioning Patch - WIP version 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00028.php <nowiki>(2008-01) Dynamic Partitioning using Segment Visibility Maps</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00248.php <nowiki>(2008-01) Named vs Unnamed Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00387.php <nowiki>(2008-01) Storage Model for Partitioning</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00413.php <nowiki>(2008-01) Declarative partitioning grammar</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01097.php <nowiki>(2008-10) Auto-Partitioning patch discussion</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg00897.php <nowiki>(2009-03) Partitioning feature</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-05/msg00005.php <nowiki>(2009-05) Transparent table partitioning in future version of PG?</nowiki>]<br />
* [http://archives.postgresql.org/message-id/1247564358.11347.1308.camel@ebony.2ndQuadrant <nowiki>(2009-07) Comments on automatic DML routing and explicit partitioning subcommands</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php <nowiki>(2009-10) Patch for automated partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/20091112195450.A967.52131E4D@oss.ntt.co.jp <nowiki>(2009-11) Syntax for partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/4AFADD6A.9070002@asterdata.com <nowiki>(2009-11) Partitioning support for COPY</nowiki>]<br />
* [http://www.postgresql.org/message-id/20100114181323.9A33.52131E4D@oss.ntt.co.jp <nowiki>(2010-01) Partitioning syntax</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-07/msg01519.php <nowiki>(2010-07) Scalability of the planner with non trivial number of partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-07/msg01449.php <nowiki>(2011-07) New partitioning WAS: Check constraints on partition parents only?</nowiki>]<br />
* [http://www.postgresql.org/message-id/20140829155607.GF7705@eldon.alvh.no-ip.org <nowiki>(2014-08) On partitioning</nowiki>]<br />
* [http://www.postgresql.org/message-id/54EC32B6.9070605@lab.ntt.co.jp <nowiki>(2015-02) Partitioning WIP patch</nowiki>]<br />
* [http://www.postgresql.org/message-id/55D3093C.5010800@lab.ntt.co.jp <nowiki>(2015-08) Declarative partitioning</nowiki>]<br />
* [https://www.postgresql.org/message-id/ad16e2f5-fc7c-cc2d-333a-88d4aa446f96@lab.ntt.co.jp <nowiki>(2016-08) Declarative partitioning - another take</nowiki>]<br />
<br />
== Possible Directions ==<br />
<br />
=== Oracle-Style ===<br />
Allow users to declare their intention with partitioned tables, i.e. declare what the partition key is and what range or values are covered by each partition.<br />
<br />
I think this would mean two new types of relation. One is a "meta-table" that acts like a view, in that it doesn't have an attached filenode. It would also have some kind of metadata about the partition key but no view definition; it would act like parent tables in nested table structures do now. The other is a "partition", which would live in a separate namespace from tables and would carry information about which values of the partition key it covers.<br />
<br />
Pros:<br />
<br />
* Makes it more reasonable to handle inserts automatically since the structure is explicit and doesn't require making logical deductions. <br />
* More idiot-proof, i.e. you can't set up nonsensical combinations of constraints.<br />
* Consistent with other databases and DBA expectations.<br />
<br />
Cons:<br />
<br />
* Less flexible, you can't set up arbitrary non-traditional structures such as having some data in the parent table or having extra columns in some children.<br />
<br />
Background:<br />
* [http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_7002.htm Oracle CREATE TABLE syntax]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10736/parpart.htm Partitioning in Oracle 10g]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10739/partiti.htm#i1006820 Partition management in Oracle 10g]<br />
* [http://www.oracle.com/technetwork/articles/sql/11g-partitioning-084209.html Partition management in Oracle 11g including interval partitions]<br />
* [http://dev.mysql.com/doc/refman/5.1/en/partitioning.html MySQL partitioning]<br />
<br />
<br />
=== DB2-Style ===<br />
<br />
DB2 uses modifier clauses in the CREATE TABLE statement for partitioning. It includes a native form of sharding in the same implementation.<br />
{|<br />
! Clause in the CREATE TABLE statement || DB2 feature name<br />
|-<br />
| DISTRIBUTE BY HASH || DPF - Database Partitioning Feature<br />
|-<br />
| ORGANIZE BY DIMENSION || MDC - Multidimensional Clustering<br />
|-<br />
| PARTITION BY RANGE || TP - Table partitioning<br />
|}<br />
<br />
The clauses can be used in any combination to achieve the desired effect.<br />
(cfr. https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/)<br />
<br />
- DPF splits the database into "database partitions" (we would call them shards). "Each database partition has its own set of computing resources, including CPU and storage. In a DPF environment, each table row is distributed to a database partition according to the distribution key specified in the CREATE TABLE statement. When a query is processed, the request is divided so each database partition processes the rows that it is responsible for." <br />
<br />
- MDC enables rows with similar values across multiple dimensions to be physically clustered together on disk. <br />
This clustering allows for efficient I/O for typical analytical queries. For example, all rows where Product='car', Region='East', and SaleMonthYear='Jan09' can be stored in the same storage location, known as a block.<br />
<br />
- TP is what we know as "range partitioning" or "list partitioning", and is implemented in a way very similar to what Postgres currently has: "the user can manually define each data partition, including the range of values to include in that data partition" (and MDC automatically allocates storage for it). "Each TP partition is a separate database object (unlike other tables, which are a single database object). Consequently, TP supports attaching and detaching a data partition from the TP table. A detached partition becomes a regular table. As well, each data partition can be placed in its own table space, if desired."<br />
<br />
The key point seems to be that all three features are orthogonal to one another, and can be added at table creation time as well as later on. Moreover, sharding is made a first-class citizen and directly supported by the DB. ISTM that we could leverage an evolved version of postgres_fdw (plus some code borrowed from pg_shard and/or PL/Proxy) to this effect.<br />
<br />
<br />
MQTs (materialized query tables), what we call materialized views, are also directly subject to partitioning (and apparently also to sharding).<br />
<br />
Syntax Examples:<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
STARTING '1/1/2006' ENDING '12/31/2006' <br />
EVERY 3 MONTHS<br />
)<br />
<br />
Auto-partitioning by interval is nice to have ...<br />
<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
PARTITION q4_05 STARTING MINVALUE,<br />
PARTITION q1_06 STARTING '1/1/2006',<br />
PARTITION q2_06 STARTING '4/1/2006',<br />
PARTITION q3_06 STARTING '7/1/2006',<br />
PARTITION q4_06 STARTING '10/1/2006' <br />
 ENDING '12/31/2006'<br />
)<br />
<br />
This is equivalent to "VALUES LESS THAN" (technically VALUES GREATER THAN) and includes a limit.<br />
<br />
The partition manipulation syntax (here, addition) is nice, too:<br />
ALTER TABLE orders<br />
ATTACH PARTITION q1_07<br />
STARTING '01/01/2007'<br />
ENDING '03/31/2007'<br />
FROM TABLE neworders<br />
<br />
<br />
References:<br />
* https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/<br />
* http://www.ibm.com/developerworks/data/library/techarticle/dm-0605ahuja2/<br />
<br />
=== MySQL-style ===<br />
<br />
Fairly basic, supports RANGE, LIST and HASH<br />
<br />
CREATE TABLE ti (id INT, amount DECIMAL(7,2), tr_date DATE)<br />
ENGINE=INNODB<br />
PARTITION BY HASH( MONTH(tr_date) )<br />
PARTITIONS 6;<br />
<br />
References:<br />
* http://dev.mysql.com/doc/refman/5.6/en/partitioning-overview.html<br />
<br />
=== Trigger-based ===<br />
First attempts at supporting auto-partitioning were made using triggers; a sketch follows this list.<br />
* avoid specific languages such as PL/pgSQL that require 'CREATE LANGUAGE'<br />
* the performance of a C trigger is 4 to 5 times faster than PL/pgSQL<br />
* insert/copy returns 0 rows when all rows have been routed by the trigger from the master to the child tables<br />
* chaining triggers allows tunable behavior for rows not matching any partition: add an error trigger, move them to an overflow table, or create new partitions dynamically<br />
* constraint_exclusion does not work well with prepared statements. It might be possible to convert CHECKs to One-Time Filter plan nodes if the condition is a variable.<br />
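A minimal sketch of such a routing trigger in PL/pgSQL (table and column names are illustrative; RETURN NULL is what produces the "0 rows" behaviour noted above):<br />
<br />
 CREATE FUNCTION route_insert() RETURNS trigger AS $$<br />
 BEGIN<br />
     IF NEW.logdate >= DATE '2017-01-01' AND NEW.logdate < DATE '2018-01-01' THEN<br />
         INSERT INTO child_2017 VALUES (NEW.*);<br />
     ELSE<br />
         RAISE EXCEPTION 'no partition for date %', NEW.logdate;<br />
     END IF;<br />
     RETURN NULL;  -- row is not stored in the parent, hence "0 rows affected"<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />
<br />
 CREATE TRIGGER route_ins BEFORE INSERT ON parent_table<br />
     FOR EACH ROW EXECUTE PROCEDURE route_insert();<br />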
<br />
= Active Work In Progress =<br />
<br />
<br />
== Syntax ==<br />
Syntax is proposed at "[https://commitfest-old.postgresql.org/action/patch_view?id=207 Syntax for partitioning]", [https://commitfest-old.postgresql.org/action/patch_view?id=266 second version]. The syntax resembles [[Oracle]] and [[MySQL]]. See also [[Todo#Administration]] (Simplify ability to create partitioned tables).<br />
<br />
-- create partitioned table and child partitions at once.<br />
CREATE TABLE parent (...)<br />
PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ]<br />
[ (<br />
PARTITION child<br />
{<br />
VALUES LESS THAN { ... | MAXVALUE } -- for RANGE<br />
| VALUES [ IN ] ( { ... | DEFAULT } ) -- for LIST<br />
}<br />
[ WITH ( ... ) ] [ TABLESPACE tbs ]<br />
[, ...]<br />
) ] ;<br />
<br />
-- add a partition key to a table.<br />
ALTER TABLE parent PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ] [ (...) ] ;<br />
<br />
-- create a new partition on a partitioned table.<br />
CREATE PARTITION child ON parent VALUES ... ;<br />
<br />
-- add a table as a partition.<br />
ALTER TABLE parent ATTACH PARTITION child VALUES ... ;<br />
<br />
-- Remove a partition as a normal table.<br />
ALTER TABLE parent DETACH PARTITION child ;<br />
<br />
== Internal representation ==<br />
The on-disk structure is included in the "Syntax for partitioning" patch.<br />
The on-memory structure will be proposed in a future patch.<br />
<br />
=== On-disk structure ===<br />
A new system table, "pg_partition", is added.<br />
Partition keys are stored in it.<br />
<br />
CREATE TABLE pg_catalog.pg_partition<br />
(<br />
partrelid oid NOT NULL, -- partitioned table oid<br />
partopclass oid NOT NULL, -- operator class to compare keys<br />
partkind "char" NOT NULL, -- kind of partition: RANGE or LIST<br />
partkey text, -- partition key expression<br />
<br />
PRIMARY KEY (partrelid),<br />
FOREIGN KEY (partrelid) REFERENCES pg_class (oid),<br />
FOREIGN KEY (partopclass) REFERENCES pg_opclass (oid)<br />
)<br />
WITHOUT OIDS ;<br />
<br />
A new column "inhvalues" are added into pg_inherits.<br />
Partition values for each partition are stored in it.<br />
<br />
 ALTER TABLE pg_catalog.pg_inherits ADD COLUMN inhvalues anyarray ;<br />
<br />
* RANGE partition has an upper value of the range in inhvalues.<br />
* LIST partition has an array with multiple elements in inhvalues.<br />
* An overflow partition has an empty array in inhvalues.<br />
* A normal inherited table has a NULL in inhvalues.<br />
<br />
=== On-memory structure ===<br />
A cached list of partitions, sorted by partition values, is stored in the relcache for the parent table. Changes to the partitions need to invalidate the parent's cache entries to ensure the cache is accurately maintained.<br />
<br />
== Operations ==<br />
=== INSERT ===<br />
INSERT triggers will be replaced with a specialized tuple-routing feature using the on-memory structure. Tuples will be routed in O(log N). This also solves the "0 rows affected" problem of INSERT triggers.<br />
<br />
=== SELECT, UPDATE, DELETE ===<br />
CHECK constraints continue to be used for a while.<br />
<br />
This would be improved using the on-memory structure: instead of CHECK constraints on each child table, we can use a sorted list in the parent table. Constraint exclusion can then run in O(log N) instead of the current O(N).<br />
<br />
=== VACUUM, CLUSTER, REINDEX ===<br />
We don't expand those commands for now, but they might have to be expanded like TRUNCATE.<br />
<br />
= Future improvements =<br />
These are hard to fix in 9.0, but should continue to be improved in future releases.<br />
<br />
=== Syntax ===<br />
* Support SPLIT and MERGE for existing partitions. See also [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php Kedar's patch]<br />
* Support UPDATE of partition keys and values.<br />
* Support adding a partition between existing partitions. It requires SPLIT feature.<br />
* Support sub-partitions.<br />
* Support some partition kinds for GIS types. For example, "PARTITION BY GIST" holds partition keys as a GiST tree in on-memory structure.<br />
* Support HASH partitions. Each partition could be a FOREIGN TABLE in [[SQL/MED]]. In other words, it is [[PL/Proxy]] integration.<br />
* Support CREATE TABLE AS -- CREATE TABLE tbl PARTITION BY ... AS SELECT ...;<br />
<br />
=== Executor ===<br />
* SELECT FOR SHARE/UPDATE for parent tables.<br />
* Prepared statements that use partition keys in placeholders.<br />
** An idea is to convert check constraints into One-Time_Filter [http://archives.postgresql.org/message-id/20081013172100.87A1.52131E4D@oss.ntt.co.jp]<br />
* Unique constraint over multiple partitions, when each partition has a unique index on set/superset of partition keys<br />
* Unique constraints over multiple partitions in the general case (typically called a "global index").<br />
<br />
=== Planner ===<br />
* Optimization for min/max, LIMIT + ORDER BY, GROUP BY on partition keys.<br />
* Optimization when constraint exclusion are used with stable or volatile functions. It is a very common case that the partition key is timestamp and compared with now().<br />
* Join optimization for two partitioned tables.<br />
<br />
= Third-Party Tools =<br />
<br />
=== PG Partition Manager ===<br />
* [https://github.com/keithf4/pg_partman Project Home Page]<br />
* This is an extension that automates time- and serial-based partitioning (basically it does interval partitioning, setting up the right triggers for you). <br />
* Handles initial setup, partitioning existing data, dropping unneeded child tables, and undoing partitioning.<br />
<br />
[[Category:Table partitioning]]</div>Jer
<hr />
<div>= Background =<br />
<br />
== Status Quo ==<br />
Starting in PostgreSQL 10, we have declarative partitioning. With it, there is dedicated syntax to create range and list *partitioned* tables and their partitions. Although significant limitations still exist in the usage of partitioned tables, such as the inability to create indexes, row-level triggers, etc. on the partitioned parent table, a lot of manual steps are now rendered unnecessary.<br />
<br />
It is still possible to use the older methods of partitioning if need to implement some custom partitioning criteria (other than the range and list methods that declarative partitioning natively supports), or if the limitations of declarative partitioned tables are seen as hindering. See [http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html PostgreSQL Partitioning] for details. There are some 3rd party plugins that simplify the (manual) task/triggers, etc. see bottom of this page. Although declarative partitioning in PostgreSQL 10 reduces a lot of manual steps, such 3rd party plugins still offer features that the core system does not provide.<br />
<br />
See the various blogs out there describing both the new declarative partitioning and the older inheritance-based implementation.<br />
<br />
=== Resolved Issues ===<br />
* SELECT, UPDATE, DELETE (in 8.2) : They can be handled with constraint_exclusion.<br />
* TRUNCATE (in 8.4) : TRUNCATE for a parent table is expanded into child tables.<br />
* ANALYZE (in 9.0) : {{MessageLink|20091229201145.CF641753FB7@cvs.postgresql.org|ANALYZE to compute such stats for tables that have subclasses}}<br />
* MAX()/MIN() (in 9.1) : Smarter partition detection.<br />
* NO INHERIT constraints (in 9.2) make it possible to define a constraint only on the parent such that it will always be excluded; declarative partitioning (in upcoming 10) always excludes the parent without any additional configuration<br />
* With declarative partitioning (in upcoming 10), tuples inserted into the parent partitioned table are automatically routed to the leaf partitions<br />
<br />
=== Limitations (of declarative partitioning in PostgreSQL 10) ===<br />
''Note: some of these limitations are already addressed and committed for PostgreSQL 11 and work will continue in this area''<br />
<br />
* No support for hash partitioning<br />
* No support for UPDATEs that cause rows to move from one partition to another<br />
* No support for routing tuples to partitions that are foreign tables<br />
* No support for index constraints, such as UNIQUE, across the entire partition tree; indexes need to be defined on the individual leaf partitions (unique indexes span only the individual partitions)<br />
* No support for referencing partitioned parent tables in foreign key relationships, nor is there support for referencing regular tables from partitioned parent tables<br />
* No support for defining row triggers on the partitioned parent tables<br />
* No support for "catch-all" / "fallback" / "default" partition<br />
* No support for "splitting" or "merging" partitions using dedicated commands<br />
* No support for automatic creation of partitions<br />
<br />
== Overviews of Project Goals ==<br />
* [[:Image:Partitioning Requirements.pdf | Partitioning Requirements document from Simon Riggs (2008)]]<br />
* [[PgCon 2008 Developer Meeting#Partitioning_Roadmap|PGCon 2008 Developer meeting roadmap]]<br />
<br />
=== List discussions ===<br />
<br />
* [http://www.postgresql.org/message-id/1115677858.3830.131.camel@localhost.localdomain <nowiki>(2005-05) Table Partitioning, Part 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00375.php <nowiki>(2007-03) Auto creation of Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00151.php <nowiki>(2007-04) Re: Auto Partitioning Patch - WIP version 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00028.php <nowiki>(2008-01) Dynamic Partitioning using Segment Visibility Maps</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00248.php <nowiki>(2008-01) Named vs Unnamed Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00387.php <nowiki>(2008-01) Storage Model for Partitioning</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00413.php <nowiki>(2008-01) Declarative partitioning grammar</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01097.php <nowiki>(2008-10) Auto-Partitioning patch discussion</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg00897.php <nowiki>(2009-03) Partitioning feature</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-05/msg00005.php <nowiki>(2009-05) Transparent table partitioning in future version of PG?</nowiki>]<br />
* [http://archives.postgresql.org/message-id/1247564358.11347.1308.camel@ebony.2ndQuadrant <nowiki>(2009-07) Comments on automatic DML routing and explicit partitioning subcommands</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php <nowiki>(2009-10) Patch for automated partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/20091112195450.A967.52131E4D@oss.ntt.co.jp <nowiki>(2009-11) Syntax for partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/4AFADD6A.9070002@asterdata.com <nowiki>(2009-11) Partitioning support for COPY</nowiki>]<br />
* [http://www.postgresql.org/message-id/20100114181323.9A33.52131E4D@oss.ntt.co.jp <nowiki>(2010-01) Partitioning syntax</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-07/msg01519.php <nowiki>(2010-07) Scalability of the planner with non trivial number of partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-07/msg01449.php <nowiki>(2011-07) New partitioning WAS: Check constraints on partition parents only?</nowiki>]<br />
* [http://www.postgresql.org/message-id/20140829155607.GF7705@eldon.alvh.no-ip.org <nowiki>(2014-08) On partitioning</nowiki>]<br />
* [http://www.postgresql.org/message-id/54EC32B6.9070605@lab.ntt.co.jp <nowiki>(2015-02) Partitioning WIP patch</nowiki>]<br />
* [http://www.postgresql.org/message-id/55D3093C.5010800@lab.ntt.co.jp <nowiki>(2015-08) Declarative partitioning</nowiki>]<br />
* [https://www.postgresql.org/message-id/ad16e2f5-fc7c-cc2d-333a-88d4aa446f96@lab.ntt.co.jp <nowiki>(2016-08) Declarative partitioning - another take</nowiki>]<br />
<br />
== Possible Directions ==<br />
<br />
=== Oracle-Style ===<br />
Allow users to declare their intention with partitioned tables. Ie, declare what the partition key is and what range or values are covered by each partition.<br />
<br />
I think this would mean two new types of relation. One "meta-table" that acts like a view, in that it doesn't have an attached filenode. It would also have some kind of meta data about the partition key but no view definition, it would act like parent tables in nested table structure do now. The other would be "partition" which would be a separate namespace from tables and would have attached information about what values of the partition key it covered.<br />
<br />
Pros:<br />
<br />
* Makes it more reasonable to handle inserts automatically since the structure is explicit and doesn't require making logical deductions. <br />
* More idiot-proof, ie you can't set up nonsensical combinations of constraints.<br />
* Consistent with other databases and DBA expectations.<br />
<br />
Cons:<br />
<br />
* Less flexible, you can't set up arbitrary non-traditional structures such as having some data in the parent table or having extra columns in some children.<br />
<br />
Background:<br />
* [http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_7002.htm Oracle CREATE TABLE syntax]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10736/parpart.htm Partitioning in Oracle 10g]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10739/partiti.htm#i1006820 Partition management in Oracle 10g]<br />
* [http://www.oracle.com/technetwork/articles/sql/11g-partitioning-084209.html Partition management in Oracle 11g including interval partitions]<br />
* [http://dev.mysql.com/doc/refman/5.1/en/partitioning.html MySQL partitioning]<br />
<br />
<br />
=== DB2-Style ===<br />
<br />
DB2 uses modifier clauses in the CREATE TABLE statement for partitioning. It includes a native form of sharding in the same implementation<br />
{|<br />
! Clause in the CREATE TABLE statement || DB2 feature name<br />
|-<br />
| DISTRIBUTE BY HASH || DPF - Database Partitioning Feature<br />
|-<br />
| ORGANIZE BY DIMENSION || MDC - Multidimensional Clustering<br />
|-<br />
| PARTITION BY RANGE || TP - Table partitioning<br />
|}<br />
<br />
The clauses in any combination to achieve the desired effect.<br />
(cfr. https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/)<br />
<br />
- DPF splits into "database partitions"(we would call them shards). "Each database partition has its own set of computing resources, including CPU and storage. In a DPF environment, each table row is distributed to a database partition according to the distribution key specified in the CREATE TABLE statement. When a query is processed, the request is divided so each database partition processes the rows that it is responsible for." <br />
<br />
- MDC enables rows with similar values across multiple dimensions to be physically clustered together on disk. <br />
This clustering allows for efficient I/O for typical analytical queries. For example, all rows where Product='car', Region='East', and SaleMonthYear='Jan09' can be stored in the same storage location, known as a block.<br />
<br />
- TP is what we know as "range partitioning" or "list partitioning", and is implemented in a very similar way as what Postgres currently has: "the user can manually define each data partition, including the range of values to include in that data partition." (and MDC automatically allocates storage for it). "Each TP partition is a separate database object (unlike other tables which are a single database object). Consequently, TP supports attaching and detaching a data partition from the TP table. A detached partitions becomes a regular table. As well, each data partition can be placed in its own table space, if desired."<br />
<br />
The key point seems to be that all three features are orthogonal among them, and can be added at table creation time as well as later on. Moreover, sharding is made a first-class citizen and directly supported by the DB. ISTM that we could leverage an evolved version of postgres_fdw (plus some code borrowed from pg_shard and/or PL/Proxy) to this effect.<br />
<br />
<br />
MQTs (materialized query tables) ---what we call materialized views--- are also subject to partitioning (apparently, also to sharding) directly.<br />
<br />
Syntax Examples:<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
STARTING '1/1/2006' ENDING '12/31/2006' <br />
EVERY 3 MONTHS<br />
)<br />
<br />
Auto-partitioning by interval is nice to have ...<br />
<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
PARTITION q4_05 STARTING MINVALUE,<br />
PARTITION q1_06 STARTING '1/1/2006',<br />
PARTITION q2_06 STARTING '4/1/2006',<br />
PARTITION q3_06 STARTING '7/1/2006',<br />
PARTITION q4_06 STARTING '10/1/2006' <br />
ENDING ‘12/31/2006'<br />
)<br />
<br />
This is equivalent to "VALUES LESS THAN"(technically VALUES GREATER THAN) and includes a limit<br />
<br />
The partition manipulation syntax (here, addition) is nice, too:<br />
ALTER TABLE orders<br />
ATTACH PARTITION q1_07<br />
STARTING '01/01/2007'<br />
ENDING '03/31/2007'<br />
FROM TABLE neworders<br />
<br />
<br />
References:<br />
* https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/<br />
* http://www.ibm.com/developerworks/data/library/techarticle/dm-0605ahuja2/<br />
<br />
=== MySQL-style ===<br />
<br />
Fairly basic, supports RANGE, LIST and HASH<br />
<br />
CREATE TABLE ti (id INT, amount DECIMAL(7,2), tr_date DATE)<br />
ENGINE=INNODB<br />
PARTITION BY HASH( MONTH(tr_date) )<br />
PARTITIONS 6;<br />
<br />
References:<br />
* http://dev.mysql.com/doc/refman/5.6/en/partitioning-overview.html<br />
<br />
=== Trigger-based ===<br />
First attempts to support auto-partitioning have been made using triggers.<br />
* avoid specific languages such as pgpsql that requires 'CREATE LANGUAGE'<br />
* performance of C trigger 4 to 5 times faster than pgpsql<br />
* insert/copy returns 0 rows when all rows have been routed by trigger from master to child tables<br />
* chaining triggers allow tunable behavior in case of rows not matching any partition: add an error trigger, move to an overflow table, create new partitions dynamically<br />
* constraint_exclusion does not work well with prepared statements. It might possible to convert CHECKs to One-Time Filter plan nodes if the condition is a variable.<br />
<br />
= Active Work In Progress =<br />
<br />
<br />
== Syntax ==<br />
Syntax is proposed at "[https://commitfest-old.postgresql.org/action/patch_view?id=207 Syntax for partitioning]", [https://commitfest-old.postgresql.org/action/patch_view?id=266 second version]. The syntax resembles [[Oracle]] and [[MySQL]]. See also [[Todo#Administration]] (Simplify ability to create partitioned tables).<br />
<br />
-- create partitioned table and child partitions at once.<br />
CREATE TABLE parent (...)<br />
PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ]<br />
[ (<br />
PARTITION child<br />
{<br />
VALUES LESS THAN { ... | MAXVALUE } -- for RANGE<br />
| VALUES [ IN ] ( { ... | DEFAULT } ) -- for LIST<br />
}<br />
[ WITH ( ... ) ] [ TABLESPACE tbs ]<br />
[, ...]<br />
) ] ;<br />
<br />
-- add a partition key to a table.<br />
ALTER TABLE parent PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ] [ (...) ] ;<br />
<br />
-- create a new partition on a partitioned table.<br />
CREATE PARTITION child ON parent VALUES ... ;<br />
<br />
-- add a table as a partition.<br />
ALTER TABLE parent ATTACH PARTITION child VALUES ... ;<br />
<br />
-- Remove a partition as a normal table.<br />
ALTER TABLE parent DETACH PARTITION child ;<br />
<br />
== Internal representation ==<br />
On-disk structure is included in the "Syntax for partitioning" patch.<br />
On-memory structure will be proposed in a future patch<br />
<br />
=== On-disk structure ===<br />
A new system table "pg_partition" added.<br />
Partition keys are stored in it.<br />
<br />
CREATE TABLE pg_catalog.pg_partition<br />
(<br />
partrelid oid NOT NULL, -- partitioned table oid<br />
partopclass oid NOT NULL, -- operator class to compare keys<br />
partkind "char" NOT NULL, -- kind of partition: RANGE or LIST<br />
partkey text, -- partition key expression<br />
<br />
PRIMARY KEY (partrelid),<br />
FOREIGN KEY (partrelid) REFERENCES pg_class (oid),<br />
FOREIGN KEY (partopclass) REFERENCES pg_opclass (oid)<br />
)<br />
WITHOUT OIDS ;<br />
<br />
A new column "inhvalues" is added to pg_inherits.<br />
Partition values for each partition are stored in it.<br />
<br />
ALTER TABLE pg_catalog.pg_inherits ADD COLUMN inhvalues anyarray ;<br />
<br />
* A RANGE partition has the upper bound of its range in inhvalues.<br />
* A LIST partition has an array of its list values in inhvalues.<br />
* An overflow partition has an empty array in inhvalues.<br />
* A normal inherited table has NULL in inhvalues.<br />
<br />
=== On-memory structure ===<br />
A cached list of partitions, sorted by partition values, is stored in the relcache for the parent table. Changes to the partitions must invalidate the parent's relcache entry to keep the cache accurate.<br />
<br />
== Operations ==<br />
=== INSERT ===<br />
INSERT triggers will be replaced with a specialized tuple-routing feature using the on-memory structure. Tuples will be routed in O(log N). This also solves the "0 rows affected" problem with INSERT triggers.<br />
<br />
=== SELECT, UPDATE, DELETE ===<br />
CHECK constraints continue to be used for a while.<br />
<br />
This would be improved using the on-memory structure: instead of CHECK constraints on each child table, we can use a sorted list in the parent table, so constraint exclusion runs in O(log N) instead of the current O(N).<br />
<br />
=== VACUUM, CLUSTER, REINDEX ===<br />
We don't expand those commands for now, but they might have to be expanded in the way TRUNCATE was.<br />
<br />
= Future improvements =<br />
These are hard to fix in 9.0, but should continue to be improved in future releases.<br />
<br />
=== Syntax ===<br />
* Support SPLIT and MERGE for existing partitions. See also [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php Kedar's patch]<br />
* Support UPDATE of partition keys and values.<br />
* Support adding a partition between existing partitions. It requires SPLIT feature.<br />
* Support sub-partitions.<br />
* Support some partition kinds for GIS types. For example, "PARTITION BY GIST" holds partition keys as a GiST tree in on-memory structure.<br />
* Support HASH partitions. Each partition could be a FOREIGN TABLE in [[SQL/MED]]. In other words, it is [[PL/Proxy]] integration.<br />
* Support CREATE TABLE AS -- CREATE TABLE tbl PARTITION BY ... AS SELECT ...;<br />
<br />
=== Executor ===<br />
* SELECT FOR SHARE/UPDATE for parent tables.<br />
* Prepared statements that use partition keys as placeholders.<br />
** An idea is to convert check constraints into One-Time_Filter [http://archives.postgresql.org/message-id/20081013172100.87A1.52131E4D@oss.ntt.co.jp]<br />
* Unique constraints over multiple partitions, when each partition has a unique index on a set/superset of the partition keys.<br />
* Unique constraints over multiple partitions in the general case (typically called a "global index").<br />
<br />
=== Planner ===<br />
* Optimization for min/max, LIMIT + ORDER BY, GROUP BY on partition keys.<br />
* Optimization when constraint exclusion is used with stable or volatile functions. It is very common for the partition key to be a timestamp compared with now().<br />
* Join optimization for two partitioned tables.<br />
<br />
= Third-Party Tools =<br />
<br />
=== PG Partition Manager ===<br />
* [https://github.com/keithf4/pg_partman Project Home Page]<br />
* This is an extension that automates time- and serial-based partitioning (essentially interval partitioning, setting up the right triggers for you); a usage sketch follows this list.<br />
* Handles initial setup, partitioning existing data, dropping unneeded child tables, and undoing partitioning.<br />
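<br />
A hedged usage sketch: create_parent() is pg_partman's real entry point, but its arguments have changed across versions, so the values below follow the early, trigger-based API and may differ in yours (table and column names are hypothetical):<br />
<br />
CREATE SCHEMA partman;<br />
CREATE EXTENSION pg_partman WITH SCHEMA partman;<br />
-- Partition public.log_table monthly on its created_at column.<br />
SELECT partman.create_parent('public.log_table', 'created_at', 'time-static', 'monthly');<br />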
<br />
[[Category:Table partitioning]]</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30927New in postgres 102017-09-25T22:12:52Z<p>Jer: /* What's New In PostgreSQL 10 */ general pg10 link</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Bruce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
* Magnus Hagander [https://www.hagander.net/talks/PostgreSQL_10.pdf PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) } [, ...] ) ]<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ ( ... ) ]<br />
FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado >= DATE '2017-01-01' AND fch_creado < DATE '2018-01-01')<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
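<br />
A quick way to see this at work is EXPLAIN: with quals on the partition key expression, the plan should scan only the matching partition (here, book_history_2016_09):<br />
<br />
libdata=# EXPLAIN SELECT * FROM book_history<br />
          WHERE lower(period) >= '2016-09-01' AND lower(period) < '2016-10-01';<br />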
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if we want to search financial transaction history by an indexed column, we can now execute the query in roughly one-quarter of the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
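<br />
To confirm that a query is actually parallelized, EXPLAIN it and look for a Gather or Gather Merge node with parallel scans beneath it:<br />
<br />
accounts=# EXPLAIN SELECT bid, count(*) FROM account_history<br />
           WHERE delta > 1000 GROUP BY bid;<br />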
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure mechanism and a perfect fit for high availability/disaster recovery needs. Because it works on the whole instance, however, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication allows us to tackle those use cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that look identical to the tables I'm replicating and have the same names. They can have additional columns and a few other differences. In particular, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that the origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
Blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
Version 9.6 introduced support for multiple synchronous standbys along with the remote_apply commit level:<br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
Version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST matches the previous behaviour: node priority follows the list order when forming the quorum. ANY means that any nodes in the list may provide the required quorum, which gives extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
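<br />
In practice, this means a client can list several hosts and require a writable session; libpq then tries each host in turn until it finds a primary. For example:<br />
<br />
psql 'postgresql://node1,node2/libdata?target_session_attrs=read-write'<br />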
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of two-phase commits.<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
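<br />
pg_receivewal can now gzip-compress the WAL segments it streams, via the -Z/--compress option (compression level 0-9). For example (the target directory is hypothetical):<br />
<br />
pg_receivewal -D /mnt/wal_archive -Z 5<br />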
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
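<br />
A minimal example of the new SQL-standard syntax, which replaces serial-style usage (the items table is hypothetical):<br />
<br />
CREATE TABLE items (<br />
    id   integer GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,<br />
    name text NOT NULL<br />
);<br />
<br />
INSERT INTO items (name) VALUES ('widget');  -- id is assigned automatically<br />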
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
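<br />
A minimal sketch (the accounts and audit_log tables are hypothetical): an AFTER STATEMENT trigger that summarizes each UPDATE from the new-rows transition table in a single statement:<br />
<br />
CREATE FUNCTION audit_update() RETURNS trigger AS $$<br />
BEGIN<br />
    -- new_rows is the transition table declared in the trigger below<br />
    INSERT INTO audit_log (table_name, rows_changed, changed_at)<br />
    SELECT 'accounts', count(*), now() FROM new_rows;<br />
    RETURN NULL;<br />
END;<br />
$$ LANGUAGE plpgsql;<br />
<br />
CREATE TRIGGER accounts_audit<br />
    AFTER UPDATE ON accounts<br />
    REFERENCING OLD TABLE AS old_rows NEW TABLE AS new_rows<br />
    FOR EACH STATEMENT EXECUTE PROCEDURE audit_update();<br />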
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
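<br />
A small sketch, assuming a hypothetical xmldata table with an xml column named data:<br />
<br />
SELECT x.title, x.author<br />
FROM xmldata,<br />
     XMLTABLE('/catalog/book' PASSING data<br />
              COLUMNS title  text PATH 'title',<br />
                      author text PATH 'author') AS x;<br />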
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
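<br />
To use it, store passwords as SCRAM verifiers and select the scram-sha-256 method in pg_hba.conf (existing MD5 passwords must be re-set before they can be verified with SCRAM):<br />
<br />
# postgresql.conf<br />
password_encryption = scram-sha-256<br />
<br />
# pg_hba.conf<br />
host    all    all    0.0.0.0/0    scram-sha-256<br />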
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to teach the planner about this, which protects it against such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
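<br />
For example, on a hypothetical zipcodes table where the city and zip columns are strongly correlated, you can ask ANALYZE to collect functional-dependency statistics:<br />
<br />
CREATE STATISTICS zip_stats (dependencies) ON city, zip FROM zipcodes;<br />
ANALYZE zipcodes;<br />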
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation to 184 wait events; in particular, 67+ I/O-related and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window into which parts of the system are causing query delays, and give us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, the planner now detects cases where the inner side of the join can only produce a single row for each outer-side row. During execution, this allows skipping to the next outer row early once a match is found. It can also remove the requirement for mark and restore during merge joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
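<br />
initdb now pre-creates ICU collations with names such as "de-x-icu", and you can define your own. A brief sketch (the customers table is hypothetical):<br />
<br />
CREATE COLLATION german_icu (provider = icu, locale = 'de');<br />
<br />
SELECT name FROM customers ORDER BY name COLLATE "de-x-icu";<br />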
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
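<br />
A quick usage example against a B-Tree index (the index name here is hypothetical):<br />
<br />
CREATE EXTENSION amcheck;<br />
SELECT bt_index_check('accounts_pkey'::regclass);<br />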
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
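<br />
From SQL, the same number is available directly; for example:<br />
<br />
SHOW server_version_num;   -- e.g. 100001 for release 10.1<br />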
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as their replication monitoring.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
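<br />
For example, a replication-lag probe must change from the 9.6 spelling to the 10 spelling:<br />
<br />
-- 9.6<br />
SELECT pg_xlog_location_diff(pg_current_xlog_location(), replay_location) FROM pg_stat_replication;<br />
-- 10<br />
SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) FROM pg_stat_replication;<br />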
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib-module version of the built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need either to manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or to compile tsearch2 from source and install it themselves.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jer
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
* Broce Momjian [http://momjian.us/main/writings/pgsql/features.pdf Major Features: Postgres 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation and now includes 184 wait events. In particular 67+ I/O related events were added and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window to find which parts of the system are causing query delays and gives us very accurate statistics on where we are losing performance.<br />
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30925New in postgres 102017-09-25T22:09:30Z<p>Jer: /* Latch Wait times in pg_stat_activity */ much better wait event write-up</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
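<br />
A minimal sketch of an AFTER STATEMENT trigger using a transition table; the loans table is assumed from the earlier examples, and audit_log and log_update_count are hypothetical names:<br />
<br />
 libdata=# CREATE TABLE audit_log (<br />
             tbl text, rows_changed bigint, logged_at timestamptz DEFAULT now() );<br />
 libdata=# CREATE FUNCTION log_update_count() RETURNS trigger AS $$<br />
           BEGIN<br />
             -- new_rows is the transition table declared by the trigger below<br />
             INSERT INTO audit_log (tbl, rows_changed)<br />
             SELECT TG_TABLE_NAME, count(*) FROM new_rows;<br />
             RETURN NULL;<br />
           END;<br />
           $$ LANGUAGE plpgsql;<br />
 libdata=# CREATE TRIGGER loans_audit<br />
             AFTER UPDATE ON loans<br />
             REFERENCING NEW TABLE AS new_rows<br />
             FOR EACH STATEMENT<br />
             EXECUTE PROCEDURE log_update_count();<br />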
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
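<br />
A minimal sketch, assuming a hypothetical table xmlbooks with an xml column named doc:<br />
<br />
 libdata=# SELECT x.title, x.pages<br />
           FROM xmlbooks,<br />
                XMLTABLE('/catalog/book' PASSING doc<br />
                         COLUMNS title text    PATH 'title',<br />
                                 pages integer PATH 'pages') AS x;<br />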
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
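<br />
SCRAM-SHA-256 password authentication is a more secure challenge-response scheme than the old md5 method. A hedged sketch of enabling it (the pg_hba.conf network range is hypothetical):<br />
<br />
 # postgresql.conf: hash newly set passwords with SCRAM<br />
 password_encryption = scram-sha-256<br />
 <br />
 # pg_hba.conf: require SCRAM for these connections<br />
 host    all    all    10.0.0.0/8    scram-sha-256<br />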
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
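<br />
Policies can now be declared AS RESTRICTIVE, meaning they are ANDed with the existing permissive policies instead of ORed into them. A hedged sketch on the fines table, assuming a hypothetical branch_id column and app.branch_id setting:<br />
<br />
 libdata=# ALTER TABLE fines ENABLE ROW LEVEL SECURITY;<br />
 libdata=# CREATE POLICY fines_own_branch ON fines<br />
             AS RESTRICTIVE<br />
             USING (branch_id = current_setting('app.branch_id')::int);<br />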
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values across table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to teach the planner about these correlations, protecting it against such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature represents an advance in the state of the art for SQL databases generally.<br />
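<br />
A minimal sketch of the new CREATE STATISTICS command, assuming hypothetical correlated columns city and postal_code on the patrons table:<br />
<br />
 libdata=# CREATE STATISTICS patron_geo_stats (dependencies)<br />
             ON city, postal_code FROM patrons;<br />
 libdata=# ANALYZE patrons;<br />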
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Significant Expansion of Wait Events in pg_stat_activity ===<br />
<br />
PostgreSQL 9.6 code was instrumented with a total of 69 wait events. PostgreSQL 10 expands the instrumentation to 184 wait events; in particular, 67+ I/O-related events and 31+ latch-related events were added.<br />
<br />
The wait_event_type and wait_event columns added to the pg_stat_activity view in Postgres 9.6 give us a significant new window into which parts of the system are causing query delays, and provide very accurate statistics on where we are losing performance.<br />
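<br />
A minimal sketch of checking what currently running backends are waiting on:<br />
<br />
 libdata=# SELECT pid, wait_event_type, wait_event, state<br />
           FROM pg_stat_activity<br />
           WHERE wait_event IS NOT NULL;<br />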
<br />
* Bruce Momjian [https://momjian.us/main/blogs/pgblog/2017.html#February_28_2017 Wait Event Reporting]<br />
* Robert Haas [https://www.postgresql.org/message-id/flat/CA%2BTgmoav9Q5v5ZGT3%2BwP_1tQjT6TGYXrwrDcTRrWimC%2BZY7RRA%40mail.gmail.com#CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com pgbench vs wait events]<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
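<br />
For example, in a query like the following (tables as in the earlier examples, with patron_id assumed to be the primary key of patrons), the planner can prove the inner side is unique, and the join node in the plan should report "Inner Unique: true":<br />
<br />
 libdata=# EXPLAIN SELECT l.*, p.name<br />
           FROM loans l JOIN patrons p ON p.patron_id = l.patron_id;<br />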
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
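<br />
With an ICU-enabled build, initdb imports ICU locales as collations with names ending in -x-icu, usable anywhere a collation is accepted. A minimal sketch (the titles table is hypothetical):<br />
<br />
 libdata=# SELECT title FROM titles ORDER BY title COLLATE "de-x-icu";<br />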
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
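<br />
A minimal sketch of verifying a B-Tree index (loans_pkey is a hypothetical index name):<br />
<br />
 libdata=# CREATE EXTENSION amcheck;<br />
 libdata=# SELECT bt_index_check('loans_pkey'::regclass);<br />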
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend) or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] to get the server version. Both return a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10, as the table below and the example after it illustrate.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
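<br />
A minimal sketch of fetching the sortable number from SQL (output shown for a hypothetical 10.1 server):<br />
<br />
 libdata=# SHOW server_version_num;<br />
  server_version_num<br />
 --------------------<br />
  100001<br />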
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
To avoid confusion that could lead to data loss, the abbreviation "xlog", which previously referred to the transaction log in directory names, function names, and executable parameters, has been replaced everywhere with "wal". Similarly, the word "location" in function names, where it referred to a transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to update custom backup and transaction log management scripts, as well as replication monitoring tools.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn" (see the example after the table):<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
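<br />
For example, a monitoring query changes like this between versions:<br />
<br />
 -- 9.6<br />
 SELECT pg_current_xlog_location();<br />
 -- 10<br />
 SELECT pg_current_wal_lsn();<br />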
<br />
Some system views and functions have had attribute renames (see the query after the list):<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
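<br />
A minimal sketch using the renamed pg_stat_replication columns:<br />
<br />
 libdata=# SELECT application_name, sent_lsn, replay_lsn<br />
           FROM pg_stat_replication;<br />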
<br />
Several command-line executables have been renamed, or have had parameters renamed (see the sketch after the list):<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
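<br />
A hedged sketch of the renamed tools and options in use (host and paths are hypothetical):<br />
<br />
 $ pg_receivewal -h primary.example.com -D /archive/wal<br />
 $ pg_basebackup -h primary.example.com -D /backup/base --wal-method=stream<br />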
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or compile and install tsearch2 from source themselves.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jer
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism ===<br />
<br />
(wording from Robert Haas' blog post, linked below)<br />
<br />
* Parallel Merge Join: In PostgreSQL 9.6, only hash joins and nested loops can be performed in the parallel portion of a plan. In PostgreSQL 10, merge joins can also be performed in the parallel portion of the plan.<br />
* Parallel Bitmap Heap Scan: One process scans the index and builds a data structure in shared memory indicating all of the heap pages that need to be scanned, and then all cooperating processes can perform the heap scan in parallel.<br />
* Parallel Index Scan and Index-Only Scan: It's now possible for the driving table to be scanned using an index-scan or an index-only scan.<br />
* Gather Merge: If each worker is producing sorted output, then gather those results in a way that preserves the sort order.<br />
* Subplan-Related Improvements: A table with an uncorrelated subplan can appear in the parallel portion of the plan.<br />
* Pass Query Text To Workers: The query text associated with a parallel worker will show up in pg_stat_activity.<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* Robert Haas [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Latch Wait times in pg_stat_activity ===<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30923New in postgres 102017-09-25T21:23:25Z<p>Jer: /* What's New In PostgreSQL 10 */ another general link</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
* Robert Haas [https://rhaas.blogspot.jp/2017/04/new-features-coming-in-postgresql-10.html New Features Coming in PostgreSQL 10]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism ===<br />
<br />
Some additional plan nodes can be executed in parallel, particularly Index Scans.<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = ANY 2(node1,node2,node3);<br />
synchronous_standby_names = FIRST 2(node1,node2);<br />
<br />
FIRST was the previous behaviour, and the nodes priority is following the list order in order to get a quorum. ANY now means that any node in the list is now able to provide the required quorum. This will give extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated data in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, which can cause some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about this, which proofs it against making such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
<br />
=== Latch Wait times in pg_stat_activity ===<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as monitoring replication.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superceded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or will need to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30922New in postgres 102017-09-25T21:21:57Z<p>Jer: /* Native Partitioning */ more links on partitioning</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
'''''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning'''''<br />
<br />
In 10, partitioning tables is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) }<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ (<br />
) ] FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado < DATE '2015-01-01' )<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* Hubert Lubaczewski [https://www.depesz.com/2017/02/06/waiting-for-postgresql-10-implement-table-partitioning/ Table Partitioning Examples] (depesz.com) <br />
* Keith Fiske [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
* Robert Haas [https://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html Plans for Partitioning in v11] and [https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com email on partitioning next steps] (a.k.a. important limitations in v10)<br />
<br />
=== Additional Parallelism ===<br />
<br />
Some additional plan nodes can be executed in parallel, particularly Index Scans.<br />
<br />
'''Example:'''<br />
<br />
For example, if we wanted to search financial transaction history by an indexed column, I can now execute it in one-quarter the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure and is a perfect mechanism for high availability/disaster recovery needs. As it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication will allow us to tackle those use-cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that looked identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. Particularly, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that they origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum-based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST was the previous behaviour: node priority follows the list order when establishing the quorum. ANY means that any of the listed nodes can provide the required quorum. This gives extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
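<br />
A brief sketch of the underlying libpq feature: a connection string can list several hosts, and the new target_session_attrs=read-write parameter makes libpq skip servers that only accept read-only sessions (host names here are hypothetical):<br />
<br />
psql 'postgresql://pg1.example.com:5432,pg2.example.com:5432/libdata?target_session_attrs=read-write'<br />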
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits.<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
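<br />
A minimal sketch (table and column names are hypothetical): an identity column is declared directly in CREATE TABLE and is a standards-conforming alternative to serial:<br />
<br />
CREATE TABLE patrons (<br />
    id INTEGER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,<br />
    name TEXT NOT NULL<br />
);<br />
INSERT INTO patrons (name) VALUES ('Ada') RETURNING id;<br />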
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
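<br />
A minimal sketch, assuming hypothetical accounts and audit_log tables: the REFERENCING clause names the transition tables that the AFTER STATEMENT trigger function can query:<br />
<br />
CREATE FUNCTION audit_balance_changes() RETURNS trigger AS $$<br />
BEGIN<br />
    -- old_rows and new_rows are visible here as ordinary read-only tables<br />
    INSERT INTO audit_log (account_id, old_balance, new_balance)<br />
    SELECT o.id, o.balance, n.balance<br />
    FROM old_rows o JOIN new_rows n USING (id);<br />
    RETURN NULL;<br />
END;<br />
$$ LANGUAGE plpgsql;<br />
<br />
CREATE TRIGGER accounts_audit<br />
    AFTER UPDATE ON accounts<br />
    REFERENCING OLD TABLE AS old_rows NEW TABLE AS new_rows<br />
    FOR EACH STATEMENT<br />
    EXECUTE PROCEDURE audit_balance_changes();<br />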
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
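<br />
A minimal sketch (the document structure and column paths are hypothetical) showing how XMLTABLE turns an XML value into rows:<br />
<br />
CREATE TABLE xmldata (data xml);<br />
INSERT INTO xmldata VALUES ('<books><book id="1"><title>Dune</title></book><book id="2"><title>Emma</title></book></books>');<br />
<br />
SELECT x.*<br />
FROM xmldata,<br />
     XMLTABLE('/books/book' PASSING data<br />
              COLUMNS id INT PATH '@id',<br />
                      title TEXT PATH 'title') AS x;<br />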
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to teach the planner about this, proofing it against such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
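<br />
A minimal sketch of the new command, using a hypothetical addresses table in which city and zip are strongly correlated:<br />
<br />
CREATE STATISTICS addr_stats (dependencies, ndistinct)<br />
    ON city, zip FROM addresses;<br />
ANALYZE addresses;<br />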
<br />
=== Latch Wait times in pg_stat_activity ===<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
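<br />
A brief usage sketch based on the linked documentation (the index name is hypothetical):<br />
<br />
CREATE EXTENSION amcheck;<br />
SELECT bt_index_check('idx_bookdata_title'::regclass);<br />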
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend) or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] to get the server version. Both return a six-digit integer version number which is consistently sortable and comparable between versions 9.6 and 10.<br />
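<br />
For example, on the backend (output shown for a hypothetical 10.1 server):<br />
<br />
libdata=# SHOW server_version_num;<br />
 server_version_num<br />
--------------------<br />
 100001<br />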
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as replication monitoring tools.<br />
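<br />
For example, a monitoring query that reports the current write-ahead log position must change as follows:<br />
<br />
-- 9.6<br />
SELECT pg_current_xlog_location();<br />
-- 10<br />
SELECT pg_current_wal_lsn();<br />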
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or compile and install tsearch2 themselves from source.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=Table_partitioning&diff=30921Table partitioning2017-09-25T20:51:02Z<p>Jer: Undo revision 30918 by Jer (talk) oops, referenced devel/master branch instead of 10</p>
<hr />
<div>= Background =<br />
<br />
== Status Quo ==<br />
Starting in PostgreSQL 10, we have declarative partitioning. With it, there is dedicated syntax to create range and list *partitioned* tables and their partitions. Although significant limitations still exist in the usage of partitioned tables, such as the inability to create indexes, row-level triggers, etc. on the partitioned parent table, a lot of manual steps are now rendered unnecessary.<br />
<br />
It is still possible to use the older methods of partitioning if you need to implement custom partitioning criteria (other than the range and list methods that declarative partitioning natively supports), or if the limitations of declarative partitioned tables are a hindrance. See [http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html PostgreSQL Partitioning] for details. There are some 3rd-party plugins that simplify the manual tasks (triggers, etc.); see the bottom of this page. Although declarative partitioning in PostgreSQL 10 reduces a lot of manual steps, such 3rd-party plugins still offer features that the core system does not provide.<br />
<br />
See the various blogs out there describing both the new declarative partitioning and the older inheritance-based implementation.<br />
<br />
=== Resolved Issues ===<br />
* SELECT, UPDATE, DELETE (in 8.2) : They can be handled with constraint_exclusion.<br />
* TRUNCATE (in 8.4) : TRUNCATE for a parent table is expanded into child tables.<br />
* ANALYZE (in 9.0) : {{MessageLink|20091229201145.CF641753FB7@cvs.postgresql.org|ANALYZE to compute such stats for tables that have subclasses}}<br />
* MAX()/MIN() (in 9.1) : Smarter partition detection.<br />
* NO INHERIT constraints (in 9.2) make it possible to define a constraint only on the parent such that it will always be excluded; declarative partitioning (in upcoming 10) always excludes the parent without any additional configuration<br />
* With declarative partitioning (in upcoming 10), tuples inserted into the parent partitioned table are automatically routed to the leaf partitions<br />
<br />
=== Limitations (of declarative partitioning in PostgreSQL 10) ===<br />
* No support for hash partitioning<br />
* No support for UPDATEs that cause rows to move from one partition to another<br />
* No support for routing tuples to partitions that are foreign tables<br />
* No support for index constraints, such as UNIQUE, across the entire partition tree; indexes need to be defined on the individual leaf partitions (unique indexes span only the individual partitions)<br />
* No support for referencing partitioned parent tables in foreign key relationships, nor is there support for referencing regular tables from partitioned parent tables<br />
* No support for defining row triggers on the partitioned parent tables<br />
* No support for "catch-all" / "fallback" / "default" partition<br />
* No support for "splitting" or "merging" partitions using dedicated commands<br />
* No support for automatic creation of partitions<br />
<br />
== Overviews of Project Goals ==<br />
* [[:Image:Partitioning Requirements.pdf | Partitioning Requirements document from Simon Riggs (2008)]]<br />
* [[PgCon 2008 Developer Meeting#Partitioning_Roadmap|PGCon 2008 Developer meeting roadmap]]<br />
<br />
=== List discussions ===<br />
<br />
* [http://www.postgresql.org/message-id/1115677858.3830.131.camel@localhost.localdomain <nowiki>(2005-05) Table Partitioning, Part 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00375.php <nowiki>(2007-03) Auto creation of Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00151.php <nowiki>(2007-04) Re: Auto Partitioning Patch - WIP version 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00028.php <nowiki>(2008-01) Dynamic Partitioning using Segment Visibility Maps</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00248.php <nowiki>(2008-01) Named vs Unnamed Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00387.php <nowiki>(2008-01) Storage Model for Partitioning</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00413.php <nowiki>(2008-01) Declarative partitioning grammar</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01097.php <nowiki>(2008-10) Auto-Partitioning patch discussion</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg00897.php <nowiki>(2009-03) Partitioning feature</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-05/msg00005.php <nowiki>(2009-05) Transparent table partitioning in future version of PG?</nowiki>]<br />
* [http://archives.postgresql.org/message-id/1247564358.11347.1308.camel@ebony.2ndQuadrant <nowiki>(2009-07) Comments on automatic DML routing and explicit partitioning subcommands</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php <nowiki>(2009-10) Patch for automated partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/20091112195450.A967.52131E4D@oss.ntt.co.jp <nowiki>(2009-11) Syntax for partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/4AFADD6A.9070002@asterdata.com <nowiki>(2009-11) Partitioning support for COPY</nowiki>]<br />
* [http://www.postgresql.org/message-id/20100114181323.9A33.52131E4D@oss.ntt.co.jp <nowiki>(2010-01) Partitioning syntax</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-07/msg01519.php <nowiki>(2010-07) Scalability of the planner with non trivial number of partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-07/msg01449.php <nowiki>(2011-07) New partitioning WAS: Check constraints on partition parents only?</nowiki>]<br />
* [http://www.postgresql.org/message-id/20140829155607.GF7705@eldon.alvh.no-ip.org <nowiki>(2014-08) On partitioning</nowiki>]<br />
* [http://www.postgresql.org/message-id/54EC32B6.9070605@lab.ntt.co.jp <nowiki>(2015-02) Partitioning WIP patch</nowiki>]<br />
* [http://www.postgresql.org/message-id/55D3093C.5010800@lab.ntt.co.jp <nowiki>(2015-08) Declarative partitioning</nowiki>]<br />
* [https://www.postgresql.org/message-id/ad16e2f5-fc7c-cc2d-333a-88d4aa446f96@lab.ntt.co.jp <nowiki>(2016-08) Declarative partitioning - another take</nowiki>]<br />
<br />
== Possible Directions ==<br />
<br />
=== Oracle-Style ===<br />
Allow users to declare their intention with partitioned tables, i.e., declare what the partition key is and what range or values are covered by each partition.<br />
<br />
I think this would mean two new types of relation. One is a "meta-table" that acts like a view, in that it doesn't have an attached filenode; it would also have some kind of metadata about the partition key but no view definition, and it would act like parent tables in nested table structures do now. The other is a "partition", which would live in a separate namespace from tables and would carry information about which values of the partition key it covers.<br />
<br />
Pros:<br />
<br />
* Makes it more reasonable to handle inserts automatically since the structure is explicit and doesn't require making logical deductions. <br />
* More idiot-proof, ie you can't set up nonsensical combinations of constraints.<br />
* Consistent with other databases and DBA expectations.<br />
<br />
Cons:<br />
<br />
* Less flexible: you can't set up arbitrary non-traditional structures such as having some data in the parent table or having extra columns in some children.<br />
<br />
Background:<br />
* [http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_7002.htm Oracle CREATE TABLE syntax]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10736/parpart.htm Partitioning in Oracle 10g]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10739/partiti.htm#i1006820 Partition management in Oracle 10g]<br />
* [http://www.oracle.com/technetwork/articles/sql/11g-partitioning-084209.html Partition management in Oracle 11g including interval partitions]<br />
* [http://dev.mysql.com/doc/refman/5.1/en/partitioning.html MySQL partitioning]<br />
<br />
<br />
=== DB2-Style ===<br />
<br />
DB2 uses modifier clauses in the CREATE TABLE statement for partitioning. It includes a native form of sharding in the same implementation.<br />
{|<br />
! Clause in the CREATE TABLE statement || DB2 feature name<br />
|-<br />
| DISTRIBUTE BY HASH || DPF - Database Partitioning Feature<br />
|-<br />
| ORGANIZE BY DIMENSION || MDC - Multidimensional Clustering<br />
|-<br />
| PARTITION BY RANGE || TP - Table partitioning<br />
|}<br />
<br />
The clauses can be combined in any way to achieve the desired effect<br />
(cf. https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/)<br />
<br />
- DPF splits a database into "database partitions" (we would call them shards). "Each database partition has its own set of computing resources, including CPU and storage. In a DPF environment, each table row is distributed to a database partition according to the distribution key specified in the CREATE TABLE statement. When a query is processed, the request is divided so each database partition processes the rows that it is responsible for." <br />
<br />
- MDC enables rows with similar values across multiple dimensions to be physically clustered together on disk. <br />
This clustering allows for efficient I/O for typical analytical queries. For example, all rows where Product='car', Region='East', and SaleMonthYear='Jan09' can be stored in the same storage location, known as a block.<br />
<br />
- TP is what we know as "range partitioning" or "list partitioning", and is implemented in a very similar way to what Postgres currently has: "the user can manually define each data partition, including the range of values to include in that data partition." (and MDC automatically allocates storage for it). "Each TP partition is a separate database object (unlike other tables which are a single database object). Consequently, TP supports attaching and detaching a data partition from the TP table. A detached partition becomes a regular table. As well, each data partition can be placed in its own table space, if desired."<br />
<br />
The key point seems to be that all three features are orthogonal to one another, and can be added at table creation time as well as later on. Moreover, sharding is made a first-class citizen and directly supported by the DB. ISTM that we could leverage an evolved version of postgres_fdw (plus some code borrowed from pg_shard and/or PL/Proxy) to this effect.<br />
<br />
<br />
MQTs (materialized query tables), what we call materialized views, are also subject to partitioning (and apparently also to sharding) directly.<br />
<br />
Syntax Examples:<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
STARTING '1/1/2006' ENDING '12/31/2006' <br />
EVERY 3 MONTHS<br />
)<br />
<br />
Auto-partitioning by interval is nice to have ...<br />
<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
PARTITION q4_05 STARTING MINVALUE,<br />
PARTITION q1_06 STARTING '1/1/2006',<br />
PARTITION q2_06 STARTING '4/1/2006',<br />
PARTITION q3_06 STARTING '7/1/2006',<br />
PARTITION q4_06 STARTING '10/1/2006' <br />
ENDING '12/31/2006'<br />
)<br />
<br />
This is equivalent to "VALUES LESS THAN" (technically VALUES GREATER THAN) and includes a limit.<br />
<br />
The partition manipulation syntax (here, addition) is nice, too:<br />
ALTER TABLE orders<br />
ATTACH PARTITION q1_07<br />
STARTING '01/01/2007'<br />
ENDING '03/31/2007'<br />
FROM TABLE neworders<br />
<br />
<br />
References:<br />
* https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/<br />
* http://www.ibm.com/developerworks/data/library/techarticle/dm-0605ahuja2/<br />
<br />
=== MySQL-style ===<br />
<br />
Fairly basic, supports RANGE, LIST and HASH<br />
<br />
CREATE TABLE ti (id INT, amount DECIMAL(7,2), tr_date DATE)<br />
ENGINE=INNODB<br />
PARTITION BY HASH( MONTH(tr_date) )<br />
PARTITIONS 6;<br />
<br />
References:<br />
* http://dev.mysql.com/doc/refman/5.6/en/partitioning-overview.html<br />
<br />
=== Trigger-based ===<br />
First attempts to support auto-partitioning have been made using triggers; a minimal routing trigger is sketched after this list.<br />
* avoid specific languages such as PL/pgSQL that require 'CREATE LANGUAGE'<br />
* the performance of a C trigger is 4 to 5 times faster than PL/pgSQL<br />
* insert/copy returns 0 rows when all rows have been routed by a trigger from the master to child tables<br />
* chaining triggers allows tunable behavior for rows not matching any partition: add an error trigger, move them to an overflow table, or create new partitions dynamically<br />
* constraint_exclusion does not work well with prepared statements. It might be possible to convert CHECKs to One-Time Filter plan nodes if the condition is a variable.<br />
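<br />
A minimal routing-trigger sketch in PL/pgSQL (table and column names are hypothetical); a C trigger would have the same shape but, as noted above, runs considerably faster:<br />
<br />
CREATE FUNCTION route_measurement() RETURNS trigger AS $$<br />
BEGIN<br />
    IF NEW.logdate >= DATE '2017-01-01' AND NEW.logdate < DATE '2017-02-01' THEN<br />
        INSERT INTO measurement_2017_01 VALUES (NEW.*);<br />
    ELSE<br />
        RAISE EXCEPTION 'logdate % out of range', NEW.logdate;<br />
    END IF;<br />
    RETURN NULL;  -- the row is not stored in the parent, hence "0 rows affected"<br />
END;<br />
$$ LANGUAGE plpgsql;<br />
<br />
CREATE TRIGGER measurement_route BEFORE INSERT ON measurement<br />
    FOR EACH ROW EXECUTE PROCEDURE route_measurement();<br />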
<br />
= Active Work In Progress =<br />
<br />
<br />
== Syntax ==<br />
Syntax is proposed at "[https://commitfest-old.postgresql.org/action/patch_view?id=207 Syntax for partitioning]", [https://commitfest-old.postgresql.org/action/patch_view?id=266 second version]. The syntax resembles [[Oracle]] and [[MySQL]]. See also [[Todo#Administration]] (Simplify ability to create partitioned tables).<br />
<br />
-- create partitioned table and child partitions at once.<br />
CREATE TABLE parent (...)<br />
PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ]<br />
[ (<br />
PARTITION child<br />
{<br />
VALUES LESS THAN { ... | MAXVALUE } -- for RANGE<br />
| VALUES [ IN ] ( { ... | DEFAULT } ) -- for LIST<br />
}<br />
[ WITH ( ... ) ] [ TABLESPACE tbs ]<br />
[, ...]<br />
) ] ;<br />
<br />
-- add a partition key to a table.<br />
ALTER TABLE parent PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ] [ (...) ] ;<br />
<br />
-- create a new partition on a partitioned table.<br />
CREATE PARTITION child ON parent VALUES ... ;<br />
<br />
-- add a table as a partition.<br />
ALTER TABLE parent ATTACH PARTITION child VALUES ... ;<br />
<br />
-- Remove a partition as a normal table.<br />
ALTER TABLE parent DETACH PARTITION child ;<br />
<br />
== Internal representation ==<br />
On-disk structure is included in the "Syntax for partitioning" patch.<br />
The on-memory structure will be proposed in a future patch.<br />
<br />
=== On-disk structure ===<br />
A new system table "pg_partition" is added.<br />
Partition keys are stored in it.<br />
<br />
CREATE TABLE pg_catalog.pg_partition<br />
(<br />
partrelid oid NOT NULL, -- partitioned table oid<br />
partopclass oid NOT NULL, -- operator class to compare keys<br />
partkind "char" NOT NULL, -- kind of partition: RANGE or LIST<br />
partkey text, -- partition key expression<br />
<br />
PRIMARY KEY (partrelid),<br />
FOREIGN KEY (partrelid) REFERENCES pg_class (oid),<br />
FOREIGN KEY (partopclass) REFERENCES pg_opclass (oid)<br />
)<br />
WITHOUT OIDS ;<br />
<br />
A new column "inhvalues" are added into pg_inherits.<br />
Partition values for each partition are stored in it.<br />
<br />
ALTER TABLE pg_catalog.pg_inherits ADD COLUMN inhvalues anyarray ;<br />
<br />
* RANGE partition has an upper value of the range in inhvalues.<br />
* LIST partition has an array with multiple elements in inhvalues.<br />
* An overflow partition has an empty array in inhvalues.<br />
* A normal inherited table has a NULL in inhvalues.<br />
<br />
=== On-memory structure ===<br />
A cached list of partitions is sorted by partition values and stored in the relcache for the parent table. Changes to the partitions need to invalidate the parent's cache entries to ensure the cache is accurately maintained.<br />
<br />
== Operations ==<br />
=== INSERT ===<br />
INSERT triggers will be replaced with a specialized tuple-routing feature using the on-memory structure. Tuples will be routed in O(log N). This also solves the "0 rows affected" problem of INSERT triggers.<br />
<br />
=== SELECT, UPDATE, DELETE ===<br />
CHECK constraints continue to be used for a while.<br />
<br />
It could be improved using the on-memory structure; instead of CHECK constraints on each child table, we can use a sorted list in the parent table. Constraint exclusion can then run in O(log N) time instead of the current O(N).<br />
<br />
=== VACUUM, CLUSTER, REINDEX ===<br />
We don't expand those commands for now, but they might have to be expanded like TRUNCATE.<br />
<br />
= Future improvements =<br />
These are hard to fix in 9.0, but should continue to be improved in future releases.<br />
<br />
=== Syntax ===<br />
* Support SPLIT and MERGE for existing partitions. See also [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php Kedar's patch]<br />
* Support UPDATE of partition keys and values.<br />
* Support adding a partition between existing partitions. It requires SPLIT feature.<br />
* Support sub-partitions.<br />
* Support some partition kinds for GIS types. For example, "PARTITION BY GIST" holds partition keys as a GiST tree in on-memory structure.<br />
* Support HASH partitions. Each partition could be a FOREIGN TABLE in [[SQL/MED]]. In other words, it is [[PL/Proxy]] integration.<br />
* Support CREATE TABLE AS -- CREATE TABLE tbl PARTITION BY ... AS SELECT ...;<br />
<br />
=== Executor ===<br />
* SELECT FOR SHARE/UPDATE for parent tables.<br />
* Prepared statements that use partition keys in placeholders.<br />
** An idea is to convert check constraints into One-Time_Filter [http://archives.postgresql.org/message-id/20081013172100.87A1.52131E4D@oss.ntt.co.jp]<br />
* Unique constraint over multiple partitions, when each partition has a unique index on set/superset of partition keys<br />
* Unique constraints over multiple partitions in the general case (typically called a "global index").<br />
<br />
=== Planner ===<br />
* Optimization for min/max, LIMIT + ORDER BY, GROUP BY on partition keys.<br />
* Optimization when constraint exclusion is used with stable or volatile functions. It is a very common case that the partition key is a timestamp compared with now().<br />
* Join optimization for two partitioned tables.<br />
<br />
= Third-Party Tools =<br />
<br />
=== PG Partition Manager ===<br />
* [https://github.com/keithf4/pg_partman Project Home Page]<br />
* This is an extension that automates time- and serial-based partitioning (basically it does interval partitioning, setting up the right triggers for you). <br />
* Handles initial setup, partitioning existing data, dropping unneeded child tables, and undoing partitioning.<br />
<br />
[[Category:Table partitioning]]</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30920New in postgres 102017-09-25T18:48:54Z<p>Jer: /* Native Partitioning */ link to partitioning wiki page</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
''[[Table_partitioning]]: Background and Limitations of PostgreSQL 10 Partitioning''<br />
<br />
In 10, partitioning is now an attribute of the table:<br />
<br />
CREATE TABLE table_name ( ... )<br />
[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) } [, ...] ) ]<br />
<br />
CREATE TABLE table_name<br />
PARTITION OF parent_table [ ( ... ) ]<br />
FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
CONSTRAINT ck_2017 CHECK (fch_creado >= DATE '2017-01-01' AND fch_creado < DATE '2018-01-01')<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
<br />
=== Additional Parallelism ===<br />
<br />
Some additional plan nodes can be executed in parallel, particularly Index Scans.<br />
<br />
'''Example:'''<br />
<br />
For example, if we want to search financial transaction history by an indexed column, we can now execute the query in roughly one-quarter of the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is a fast, secure mechanism that is perfect for high availability and disaster recovery needs. Because it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication allows us to tackle those use cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that look identical to the tables I'm replicating and have the same names. They can have additional columns and a few other differences. In particular, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that the origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
While version 9.6 introduced quorum-based synchronous replication, <br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST was the previous behaviour: node priority follows the list order when establishing the quorum. ANY means that any of the listed nodes can provide the required quorum. This gives extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
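<br />
A brief sketch of the underlying libpq feature: a connection string can list several hosts, and the new target_session_attrs=read-write parameter makes libpq skip servers that only accept read-only sessions (host names here are hypothetical):<br />
<br />
psql 'postgresql://pg1.example.com:5432,pg2.example.com:5432/libdata?target_session_attrs=read-write'<br />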
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits.<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
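<br />
A minimal sketch (table and column names are hypothetical): an identity column is declared directly in CREATE TABLE and is a standards-conforming alternative to serial:<br />
<br />
CREATE TABLE patrons (<br />
    id INTEGER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,<br />
    name TEXT NOT NULL<br />
);<br />
INSERT INTO patrons (name) VALUES ('Ada') RETURNING id;<br />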
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
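<br />
A minimal sketch, assuming hypothetical accounts and audit_log tables: the REFERENCING clause names the transition tables that the AFTER STATEMENT trigger function can query:<br />
<br />
CREATE FUNCTION audit_balance_changes() RETURNS trigger AS $$<br />
BEGIN<br />
    -- old_rows and new_rows are visible here as ordinary read-only tables<br />
    INSERT INTO audit_log (account_id, old_balance, new_balance)<br />
    SELECT o.id, o.balance, n.balance<br />
    FROM old_rows o JOIN new_rows n USING (id);<br />
    RETURN NULL;<br />
END;<br />
$$ LANGUAGE plpgsql;<br />
<br />
CREATE TRIGGER accounts_audit<br />
    AFTER UPDATE ON accounts<br />
    REFERENCING OLD TABLE AS old_rows NEW TABLE AS new_rows<br />
    FOR EACH STATEMENT<br />
    EXECUTE PROCEDURE audit_balance_changes();<br />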
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
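<br />
A minimal sketch (the document structure and column paths are hypothetical) showing how XMLTABLE turns an XML value into rows:<br />
<br />
CREATE TABLE xmldata (data xml);<br />
INSERT INTO xmldata VALUES ('<books><book id="1"><title>Dune</title></book><book id="2"><title>Emma</title></book></books>');<br />
<br />
SELECT x.*<br />
FROM xmldata,<br />
     XMLTABLE('/books/book' PASSING data<br />
              COLUMNS id INT PATH '@id',<br />
                      title TEXT PATH 'title') AS x;<br />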
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating a language-specific full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
=== New "monitoring" roles for permission grants ===<br />
<br />
=== Restrictive Policies for Row Level Security ===<br />
<br />
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to teach the planner about this, proofing it against such mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature in PostgreSQL represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
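<br />
A minimal sketch of the new command, using a hypothetical addresses table in which city and zip are strongly correlated:<br />
<br />
CREATE STATISTICS addr_stats (dependencies, ndistinct)<br />
    ON city, zip FROM addresses;<br />
ANALYZE addresses;<br />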
<br />
=== Latch Wait times in pg_stat_activity ===<br />
<br />
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
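<br />
A brief usage sketch based on the linked documentation (the index name is hypothetical):<br />
<br />
CREATE EXTENSION amcheck;<br />
SELECT bt_index_check('idx_bookdata_title'::regclass);<br />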
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend) or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] to get the server version. Both return a six-digit integer version number which is consistently sortable and comparable between versions 9.6 and 10.<br />
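<br />
For example, on the backend (output shown for a hypothetical 10.1 server):<br />
<br />
libdata=# SHOW server_version_num;<br />
 server_version_num<br />
--------------------<br />
 100001<br />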
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as replication monitoring tools.<br />
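<br />
For example, a monitoring query that reports the current write-ahead log position must change as follows:<br />
<br />
-- 9.6<br />
SELECT pg_current_xlog_location();<br />
-- 10<br />
SELECT pg_current_wal_lsn();<br />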
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older contrib-module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need to either manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or compile and install tsearch2 themselves from source.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jerhttps://wiki.postgresql.org/index.php?title=Table_partitioning&diff=30919Table partitioning2017-09-25T18:45:44Z<p>Jer: /* Limitations (of declarative partitioning in PostgreSQL 10) */</p>
<hr />
<div>= Background =<br />
<br />
== Status Quo ==<br />
Starting in PostgreSQL 10, we have declarative partitioning. With it, there is dedicated syntax to create range and list *partitioned* tables and their partitions. Although significant limitations still exist in the usage of partitioned tables, such as the inability to create indexes, row-level triggers, etc. on the partitioned parent table, a lot of manual steps are now rendered unnecessary.<br />
<br />
It is still possible to use the older methods of partitioning if you need to implement custom partitioning criteria (other than the range and list methods that declarative partitioning natively supports), or if the limitations of declarative partitioned tables are a hindrance. See [http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html PostgreSQL Partitioning] for details. There are some 3rd-party plugins that simplify the manual tasks (triggers, etc.); see the bottom of this page. Although declarative partitioning in PostgreSQL 10 reduces a lot of manual steps, such 3rd-party plugins still offer features that the core system does not provide.<br />
<br />
See the various blogs out there describing both the new declarative partitioning and the older inheritance-based implementation.<br />
<br />
=== Resolved Issues ===<br />
* SELECT, UPDATE, DELETE (in 8.2) : They can be handled with constraint_exclusion.<br />
* TRUNCATE (in 8.4) : TRUNCATE for a parent table is expanded into child tables.<br />
* ANALYZE (in 9.0) : {{MessageLink|20091229201145.CF641753FB7@cvs.postgresql.org|ANALYZE to compute such stats for tables that have subclasses}}<br />
* MAX()/MIN() (in 9.1) : Smarter partition detection.<br />
* NO INHERIT constraints (in 9.2) make it possible to define a constraint only on the parent such that it will always be excluded; declarative partitioning (in upcoming 10) always excludes the parent without any additional configuration<br />
* With declarative partitioning (in upcoming 10), tuples inserted into the parent partitioned table are automatically routed to the leaf partitions<br />
<br />
=== Limitations (of declarative partitioning in PostgreSQL 10) ===<br />
* No support for hash partitioning<br />
* No support for UPDATEs that cause rows to move from one partition to another<br />
* No support for routing tuples to partitions that are foreign tables<br />
* No support for index constraints, such as UNIQUE, across the entire partition tree; indexes need to be defined on the individual leaf partitions (unique indexes span only the individual partitions)<br />
* No support for referencing partitioned parent tables in foreign key relationships, nor is there support for referencing regular tables from partitioned parent tables<br />
* No support for defining row triggers on the partitioned parent tables<br />
* No support for "splitting" or "merging" partitions using dedicated commands<br />
* No support for automatic creation of partitions<br />
<br />
== Overviews of Project Goals ==<br />
* [[:Image:Partitioning Requirements.pdf | Partitioning Requirements document from Simon Riggs (2008)]]<br />
* [[PgCon 2008 Developer Meeting#Partitioning_Roadmap|PGCon 2008 Developer meeting roadmap]]<br />
<br />
=== List discussions ===<br />
<br />
* [http://www.postgresql.org/message-id/1115677858.3830.131.camel@localhost.localdomain <nowiki>(2005-05) Table Partitioning, Part 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00375.php <nowiki>(2007-03) Auto creation of Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00151.php <nowiki>(2007-04) Re: Auto Partitioning Patch - WIP version 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00028.php <nowiki>(2008-01) Dynamic Partitioning using Segment Visibility Maps</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00248.php <nowiki>(2008-01) Named vs Unnamed Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00387.php <nowiki>(2008-01) Storage Model for Partitioning</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00413.php <nowiki>(2008-01) Declarative partitioning grammar</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01097.php <nowiki>(2008-10) Auto-Partitioning patch discussion</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg00897.php <nowiki>(2009-03) Partitioning feature</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-05/msg00005.php <nowiki>(2009-05) Transparent table partitioning in future version of PG?</nowiki>]<br />
* [http://archives.postgresql.org/message-id/1247564358.11347.1308.camel@ebony.2ndQuadrant <nowiki>(2009-07) Comments on automatic DML routing and explicit partitioning subcommands</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php <nowiki>(2009-10) Patch for automated partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/20091112195450.A967.52131E4D@oss.ntt.co.jp <nowiki>(2009-11) Syntax for partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/4AFADD6A.9070002@asterdata.com <nowiki>(2009-11) Partitioning support for COPY</nowiki>]<br />
* [http://www.postgresql.org/message-id/20100114181323.9A33.52131E4D@oss.ntt.co.jp <nowiki>(2010-01) Partitioning syntax</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-07/msg01519.php <nowiki>(2010-07) Scalability of the planner with non trivial number of partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-07/msg01449.php <nowiki>(2011-07) New partitioning WAS: Check constraints on partition parents only?</nowiki>]<br />
* [http://www.postgresql.org/message-id/20140829155607.GF7705@eldon.alvh.no-ip.org <nowiki>(2014-08) On partitioning</nowiki>]<br />
* [http://www.postgresql.org/message-id/54EC32B6.9070605@lab.ntt.co.jp <nowiki>(2015-02) Partitioning WIP patch</nowiki>]<br />
* [http://www.postgresql.org/message-id/55D3093C.5010800@lab.ntt.co.jp <nowiki>(2015-08) Declarative partitioning</nowiki>]<br />
* [https://www.postgresql.org/message-id/ad16e2f5-fc7c-cc2d-333a-88d4aa446f96@lab.ntt.co.jp <nowiki>(2016-08) Declarative partitioning - another take</nowiki>]<br />
<br />
== Possible Directions ==<br />
<br />
=== Oracle-Style ===<br />
Allow users to declare their intention with partitioned tables, i.e., declare what the partition key is and what range or values are covered by each partition.<br />
<br />
I think this would mean two new types of relation. One is a "meta-table" that acts like a view, in that it doesn't have an attached filenode; it would also have some kind of metadata about the partition key but no view definition, and it would act like parent tables in nested table structures do now. The other is a "partition", which would live in a separate namespace from tables and would carry information about which values of the partition key it covers.<br />
<br />
Pros:<br />
<br />
* Makes it more reasonable to handle inserts automatically since the structure is explicit and doesn't require making logical deductions. <br />
* More idiot-proof; i.e., you can't set up nonsensical combinations of constraints.<br />
* Consistent with other databases and DBA expectations.<br />
<br />
Cons:<br />
<br />
* Less flexible: you can't set up arbitrary non-traditional structures, such as having some data in the parent table or having extra columns in some children.<br />
<br />
Background:<br />
* [http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_7002.htm Oracle CREATE TABLE syntax]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10736/parpart.htm Partitioning in Oracle 10g]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10739/partiti.htm#i1006820 Partition management in Oracle 10g]<br />
* [http://www.oracle.com/technetwork/articles/sql/11g-partitioning-084209.html Partition management in Oracle 11g including interval partitions]<br />
* [http://dev.mysql.com/doc/refman/5.1/en/partitioning.html MySQL partitioning]<br />
<br />
<br />
=== DB2-Style ===<br />
<br />
DB2 uses modifier clauses in the CREATE TABLE statement for partitioning. It includes a native form of sharding in the same implementation.<br />
{|<br />
! Clause in the CREATE TABLE statement || DB2 feature name<br />
|-<br />
| DISTRIBUTE BY HASH || DPF - Database Partitioning Feature<br />
|-<br />
| ORGANIZE BY DIMENSION || MDC - Multidimensional Clustering<br />
|-<br />
| PARTITION BY RANGE || TP - Table partitioning<br />
|}<br />
<br />
The clauses can be used in any combination to achieve the desired effect<br />
(cf. https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/).<br />
<br />
- DPF splits a database into "database partitions" (we would call them shards). "Each database partition has its own set of computing resources, including CPU and storage. In a DPF environment, each table row is distributed to a database partition according to the distribution key specified in the CREATE TABLE statement. When a query is processed, the request is divided so each database partition processes the rows that it is responsible for." <br />
<br />
- MDC enables rows with similar values across multiple dimensions to be physically clustered together on disk. <br />
This clustering allows for efficient I/O for typical analytical queries. For example, all rows where Product='car', Region='East', and SaleMonthYear='Jan09' can be stored in the same storage location, known as a block.<br />
<br />
- TP is what we know as "range partitioning" or "list partitioning", and is implemented in a very similar way to what Postgres currently has: "the user can manually define each data partition, including the range of values to include in that data partition." (and MDC automatically allocates storage for it). "Each TP partition is a separate database object (unlike other tables which are a single database object). Consequently, TP supports attaching and detaching a data partition from the TP table. A detached partition becomes a regular table. As well, each data partition can be placed in its own table space, if desired."<br />
<br />
The key point seems to be that all three features are orthogonal to one another, and can be added at table creation time as well as later on. Moreover, sharding is made a first-class citizen and directly supported by the DB. ISTM that we could leverage an evolved version of postgres_fdw (plus some code borrowed from pg_shard and/or PL/Proxy) to this effect.<br />
<br />
<br />
MQTs (materialized query tables), what we call materialized views, are also directly subject to partitioning (and apparently also to sharding).<br />
<br />
Syntax Examples:<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
STARTING '1/1/2006' ENDING '12/31/2006' <br />
EVERY 3 MONTHS<br />
)<br />
<br />
Auto-partitioning by interval would be nice to have.<br />
<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
PARTITION q4_05 STARTING MINVALUE,<br />
PARTITION q1_06 STARTING '1/1/2006',<br />
PARTITION q2_06 STARTING '4/1/2006',<br />
PARTITION q3_06 STARTING '7/1/2006',<br />
PARTITION q4_06 STARTING '10/1/2006' <br />
 ENDING '12/31/2006'<br />
)<br />
<br />
This is equivalent to "VALUES LESS THAN" (technically, VALUES GREATER THAN) and includes an upper limit.<br />
<br />
The partition manipulation syntax (here, addition) is nice, too:<br />
ALTER TABLE orders<br />
ATTACH PARTITION q1_07<br />
STARTING '01/01/2007'<br />
ENDING '03/31/2007'<br />
FROM TABLE neworders<br />
<br />
<br />
References:<br />
* https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/<br />
* http://www.ibm.com/developerworks/data/library/techarticle/dm-0605ahuja2/<br />
<br />
=== MySQL-style ===<br />
<br />
Fairly basic; supports RANGE, LIST and HASH.<br />
<br />
CREATE TABLE ti (id INT, amount DECIMAL(7,2), tr_date DATE)<br />
ENGINE=INNODB<br />
PARTITION BY HASH( MONTH(tr_date) )<br />
PARTITIONS 6;<br />
<br />
References:<br />
* http://dev.mysql.com/doc/refman/5.6/en/partitioning-overview.html<br />
<br />
=== Trigger-based ===<br />
First attempts to support auto-partitioning were made using triggers; a minimal sketch follows this list.<br />
* Avoid specific languages such as PL/pgSQL that require 'CREATE LANGUAGE'.<br />
* A C trigger performs 4 to 5 times faster than PL/pgSQL.<br />
* INSERT/COPY reports 0 rows affected when all rows have been routed by the trigger from the master table to child tables.<br />
* Chaining triggers allows tunable behavior for rows that match no partition: raise an error, move them to an overflow table, or create new partitions dynamically.<br />
* constraint_exclusion does not work well with prepared statements. It might be possible to convert CHECKs to One-Time Filter plan nodes if the condition is a variable.<br />
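<br />
A minimal sketch of such a routing trigger, written in PL/pgSQL for brevity (table and column names are hypothetical; the notes above argue for a C implementation):<br />
<br />
 CREATE FUNCTION route_insert() RETURNS trigger AS $$<br />
 BEGIN<br />
   IF NEW.logdate >= DATE '2017-01-01' AND NEW.logdate < DATE '2018-01-01' THEN<br />
     INSERT INTO child_2017 VALUES (NEW.*);<br />
   ELSE<br />
     INSERT INTO overflow VALUES (NEW.*);  -- rows matching no partition go to an overflow table<br />
   END IF;<br />
   RETURN NULL;  -- suppress the insert into the parent; this is why "0 rows affected" is reported<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />
 <br />
 CREATE TRIGGER route_insert BEFORE INSERT ON parent<br />
   FOR EACH ROW EXECUTE PROCEDURE route_insert();<br />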
<br />
= Active Work In Progress =<br />
<br />
<br />
== Syntax ==<br />
Syntax is proposed at "[https://commitfest-old.postgresql.org/action/patch_view?id=207 Syntax for partitioning]", [https://commitfest-old.postgresql.org/action/patch_view?id=266 second version]. The syntax resembles [[Oracle]] and [[MySQL]]. See also [[Todo#Administration]] (Simplify ability to create partitioned tables).<br />
<br />
-- create partitioned table and child partitions at once.<br />
CREATE TABLE parent (...)<br />
PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ]<br />
[ (<br />
PARTITION child<br />
{<br />
VALUES LESS THAN { ... | MAXVALUE } -- for RANGE<br />
| VALUES [ IN ] ( { ... | DEFAULT } ) -- for LIST<br />
}<br />
[ WITH ( ... ) ] [ TABLESPACE tbs ]<br />
[, ...]<br />
) ] ;<br />
<br />
-- add a partition key to a table.<br />
ALTER TABLE parent PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ] [ (...) ] ;<br />
<br />
-- create a new partition on a partitioned table.<br />
CREATE PARTITION child ON parent VALUES ... ;<br />
<br />
-- add a table as a partition.<br />
ALTER TABLE parent ATTACH PARTITION child VALUES ... ;<br />
<br />
-- Detach a partition, leaving it as a normal table.<br />
ALTER TABLE parent DETACH PARTITION child ;<br />
<br />
== Internal representation ==<br />
The on-disk structure is included in the "Syntax for partitioning" patch.<br />
The in-memory structure will be proposed in a future patch.<br />
<br />
=== On-disk structure ===<br />
A new system table, "pg_partition", is added.<br />
Partition keys are stored in it.<br />
<br />
CREATE TABLE pg_catalog.pg_partition<br />
(<br />
partrelid oid NOT NULL, -- partitioned table oid<br />
partopclass oid NOT NULL, -- operator class to compare keys<br />
partkind "char" NOT NULL, -- kind of partition: RANGE or LIST<br />
partkey text, -- partition key expression<br />
<br />
PRIMARY KEY (partrelid),<br />
FOREIGN KEY (partrelid) REFERENCES pg_class (oid),<br />
FOREIGN KEY (partopclass) REFERENCES pg_opclass (oid)<br />
)<br />
WITHOUT OIDS ;<br />
<br />
A new column, "inhvalues", is added to pg_inherits.<br />
The partition values for each partition are stored in it.<br />
<br />
 ALTER TABLE pg_catalog.pg_inherits ADD COLUMN inhvalues anyarray ;<br />
<br />
* A RANGE partition has the upper bound of its range in inhvalues.<br />
* A LIST partition has an array with multiple elements in inhvalues.<br />
* An overflow partition has an empty array in inhvalues.<br />
* A normal inherited table has NULL in inhvalues.<br />
<br />
=== In-memory structure ===<br />
A cached list of partitions is sorted by partition values and stored in the relcache for the parent table. Changes to the partitions would need to invalidate the parent's cache entries to keep the cache accurately maintained.<br />
<br />
== Operations ==<br />
=== INSERT ===<br />
INSERT triggers will be replaced with a specialized tuple-routing feature that uses the in-memory structure. Tuples will be routed in O(log N) time. This also solves the "0 rows affected" problem with INSERT triggers.<br />
<br />
=== SELECT, UPDATE, DELETE ===<br />
CHECK constraints will continue to be used for a while.<br />
<br />
This would be improved using the in-memory structure: instead of CHECK constraints on each child table, we can use a sorted list in the parent table. Constraint exclusion can then run in O(log N) instead of the current O(N).<br />
<br />
=== VACUUM, CLUSTER, REINDEX ===<br />
We don't expand these commands to child tables for now, but they might have to be expanded in the same way TRUNCATE was.<br />
<br />
= Future improvements =<br />
These items are hard to address in 9.0, but should continue to be improved in future releases.<br />
<br />
=== Syntax ===<br />
* Support SPLIT and MERGE for existing partitions. See also [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php Kedar's patch]<br />
* Support UPDATE of partition keys and values.<br />
* Support adding a partition between existing partitions. It requires SPLIT feature.<br />
* Support sub-partitions.<br />
* Support some partition kinds for GIS types. For example, "PARTITION BY GIST" would hold partition keys as a GiST tree in the in-memory structure.<br />
* Support HASH partitions. Each partition could be a FOREIGN TABLE in [[SQL/MED]]. In other words, it is [[PL/Proxy]] integration.<br />
* Support CREATE TABLE AS -- CREATE TABLE tbl PARTITION BY ... AS SELECT ...;<br />
<br />
=== Executor ===<br />
* SELECT FOR SHARE/UPDATE for parent tables.<br />
* Prepared statements that use partition keys as placeholders.<br />
** An idea is to convert check constraints into One-Time_Filter [http://archives.postgresql.org/message-id/20081013172100.87A1.52131E4D@oss.ntt.co.jp]<br />
* Unique constraints over multiple partitions, when each partition has a unique index on a set or superset of the partition keys.<br />
* Unique constraints over multiple partitions in the general case (typically called a "global index").<br />
<br />
=== Planner ===<br />
* Optimization for min/max, LIMIT + ORDER BY, GROUP BY on partition keys.<br />
* Optimization when constraint exclusion is used with stable or volatile functions. It is a very common case that the partition key is a timestamp compared with now().<br />
* Join optimization for two partitioned tables.<br />
<br />
= Third-Party Tools =<br />
<br />
=== PG Partition Manager ===<br />
* [https://github.com/keithf4/pg_partman Project Home Page]<br />
* This is an extension that automates time- and serial-based partitioning (basically interval partitioning, setting up the right triggers for you).<br />
* Handles initial setup, partitioning existing data, dropping unneeded child tables, and undoing partitioning.<br />
<br />
[[Category:Table partitioning]]</div>Jerhttps://wiki.postgresql.org/index.php?title=Table_partitioning&diff=30918Table partitioning2017-09-25T18:44:20Z<p>Jer: /* Limitations (of declarative partitioning in PostgreSQL 10) */ according to pg10 docs we do have default partitions</p>
<hr />
<div>= Background =<br />
<br />
== Status Quo ==<br />
Starting in PostgreSQL 10, we have declarative partitioning. With it, there is dedicated syntax to create range and list *partitioned* tables and their partitions. Although significant limitations still exist, such as the inability to create indexes or row-level triggers on the partitioned parent table, many manual steps are now unnecessary.<br />
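<br />
A minimal example of the new declarative syntax (table and column names are illustrative):<br />
<br />
 CREATE TABLE measurement (<br />
     logdate date NOT NULL,<br />
     peaktemp int<br />
 ) PARTITION BY RANGE (logdate);<br />
 <br />
 CREATE TABLE measurement_y2017 PARTITION OF measurement<br />
     FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');<br />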
<br />
It is still possible to use the older methods of partitioning if you need to implement custom partitioning criteria (other than the range and list methods that declarative partitioning natively supports), or if the limitations of declarative partitioned tables are a hindrance. See [http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html PostgreSQL Partitioning] for details. There are some 3rd-party plugins that simplify the manual tasks, triggers, etc.; see the bottom of this page. Although declarative partitioning in PostgreSQL 10 removes many manual steps, such plugins still offer features that the core system does not provide.<br />
<br />
See the various blogs out there describing both the new declarative partitioning and the older inheritance-based implementation.<br />
<br />
=== Resolved Issues ===<br />
* SELECT, UPDATE, DELETE (in 8.2) : They can be handled with constraint_exclusion.<br />
* TRUNCATE (in 8.4) : TRUNCATE for a parent table is expanded into child tables.<br />
* ANALYZE (in 9.0) : {{MessageLink|20091229201145.CF641753FB7@cvs.postgresql.org|ANALYZE to compute such stats for tables that have subclasses}}<br />
* MAX()/MIN() (in 9.1) : Smarter partition detection.<br />
* NO INHERIT constraints (in 9.2) make it possible to define a constraint only on the parent such that it will always be excluded; declarative partitioning (in upcoming 10) always excludes the parent without any additional configuration.<br />
* With declarative partitioning (in upcoming 10), tuples inserted into the parent partitioned table are automatically routed to the leaf partitions.<br />
<br />
=== Limitations (of declarative partitioning in PostgreSQL 10) ===<br />
* No support for hash partitioning<br />
* No support for UPDATEs that cause rows to move from one partition to another<br />
* No support for routing tuples to partitions that are foreign tables<br />
* No support for index constraints, such as UNIQUE, across the entire partition tree; indexes need to be defined on the individual leaf partitions (unique indexes span only the individual partitions)<br />
* No support for referencing partitioned parent tables in foreign key relationships, nor is there support for referencing regular tables from partitioned parent tables<br />
* No support for defining row triggers on the partitioned parent tables<br />
* No support for "splitting" or "merging" partitions using dedicated commands<br />
<br />
== Overviews of Project Goals ==<br />
* [[:Image:Partitioning Requirements.pdf | Partitioning Requirements document from Simon Riggs (2008)]]<br />
* [[PgCon 2008 Developer Meeting#Partitioning_Roadmap|PGCon 2008 Developer meeting roadmap]]<br />
<br />
=== List discussions ===<br />
<br />
* [http://www.postgresql.org/message-id/1115677858.3830.131.camel@localhost.localdomain <nowiki>(2005-05) Table Partitioning, Part 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-03/msg00375.php <nowiki>(2007-03) Auto creation of Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2007-04/msg00151.php <nowiki>(2007-04) Re: Auto Partitioning Patch - WIP version 1</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00028.php <nowiki>(2008-01) Dynamic Partitioning using Segment Visibility Maps</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00248.php <nowiki>(2008-01) Named vs Unnamed Partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00387.php <nowiki>(2008-01) Storage Model for Partitioning</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-01/msg00413.php <nowiki>(2008-01) Declarative partitioning grammar</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2008-10/msg01097.php <nowiki>(2008-10) Auto-Partitioning patch discussion</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-03/msg00897.php <nowiki>(2009-03) Partitioning feature</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-performance/2009-05/msg00005.php <nowiki>(2009-05) Transparent table partitioning in future version of PG?</nowiki>]<br />
* [http://archives.postgresql.org/message-id/1247564358.11347.1308.camel@ebony.2ndQuadrant <nowiki>(2009-07) Comments on automatic DML routing and explicit partitioning subcommands</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php <nowiki>(2009-10) Patch for automated partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/20091112195450.A967.52131E4D@oss.ntt.co.jp <nowiki>(2009-11) Syntax for partitioning</nowiki>]<br />
* [http://archives.postgresql.org/message-id/4AFADD6A.9070002@asterdata.com <nowiki>(2009-11) Partitioning support for COPY</nowiki>]<br />
* [http://www.postgresql.org/message-id/20100114181323.9A33.52131E4D@oss.ntt.co.jp <nowiki>(2010-01) Partitioning syntax</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-07/msg01519.php <nowiki>(2010-07) Scalability of the planner with non trivial number of partitions</nowiki>]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-07/msg01449.php <nowiki>(2011-07) New partitioning WAS: Check constraints on partition parents only?</nowiki>]<br />
* [http://www.postgresql.org/message-id/20140829155607.GF7705@eldon.alvh.no-ip.org <nowiki>(2014-08) On partitioning</nowiki>]<br />
* [http://www.postgresql.org/message-id/54EC32B6.9070605@lab.ntt.co.jp <nowiki>(2015-02) Partitioning WIP patch</nowiki>]<br />
* [http://www.postgresql.org/message-id/55D3093C.5010800@lab.ntt.co.jp <nowiki>(2015-08) Declarative partitioning</nowiki>]<br />
* [https://www.postgresql.org/message-id/ad16e2f5-fc7c-cc2d-333a-88d4aa446f96@lab.ntt.co.jp <nowiki>(2016-08) Declarative partitioning - another take</nowiki>]<br />
<br />
== Possible Directions ==<br />
<br />
=== Oracle-Style ===<br />
Allow users to declare their intention with partitioned tables; that is, declare what the partition key is and what range or values are covered by each partition.<br />
<br />
I think this would mean two new types of relation. One is a "meta-table" that acts like a view, in that it doesn't have an attached filenode; it would carry some kind of metadata about the partition key but no view definition, and it would act like parent tables in nested table structures do now. The other is a "partition", which would live in a separate namespace from tables and would carry information about what values of the partition key it covers.<br />
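<br />
For illustration, a minimal sketch of the Oracle-style declaration (table and partition names are hypothetical):<br />
<br />
 CREATE TABLE orders (id INT, shipdate DATE)<br />
 PARTITION BY RANGE (shipdate)<br />
 ( PARTITION q1_06 VALUES LESS THAN (DATE '2006-04-01'),<br />
   PARTITION q2_06 VALUES LESS THAN (DATE '2006-07-01'),<br />
   PARTITION later VALUES LESS THAN (MAXVALUE) );<br />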
<br />
Pros:<br />
<br />
* Makes it more reasonable to handle inserts automatically since the structure is explicit and doesn't require making logical deductions. <br />
* More idiot-proof; i.e., you can't set up nonsensical combinations of constraints.<br />
* Consistent with other databases and DBA expectations.<br />
<br />
Cons:<br />
<br />
* Less flexible: you can't set up arbitrary non-traditional structures, such as having some data in the parent table or having extra columns in some children.<br />
<br />
Background:<br />
* [http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_7002.htm Oracle CREATE TABLE syntax]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10736/parpart.htm Partitioning in Oracle 10g]<br />
* [http://download-east.oracle.com/docs/cd/B13789_01/server.101/b10739/partiti.htm#i1006820 Partition management in Oracle 10g]<br />
* [http://www.oracle.com/technetwork/articles/sql/11g-partitioning-084209.html Partition management in Oracle 11g including interval partitions]<br />
* [http://dev.mysql.com/doc/refman/5.1/en/partitioning.html MySQL partitioning]<br />
<br />
<br />
=== DB2-Style ===<br />
<br />
DB2 uses modifier clauses in the CREATE TABLE statement for partitioning. It includes a native form of sharding in the same implementation.<br />
{|<br />
! Clause in the CREATE TABLE statement || DB2 feature name<br />
|-<br />
| DISTRIBUTE BY HASH || DPF - Database Partitioning Feature<br />
|-<br />
| ORGANIZE BY DIMENSION || MDC - Multidimensional Clustering<br />
|-<br />
| PARTITION BY RANGE || TP - Table partitioning<br />
|}<br />
<br />
The clauses can be used in any combination to achieve the desired effect<br />
(cf. https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/).<br />
<br />
- DPF splits a database into "database partitions" (we would call them shards). "Each database partition has its own set of computing resources, including CPU and storage. In a DPF environment, each table row is distributed to a database partition according to the distribution key specified in the CREATE TABLE statement. When a query is processed, the request is divided so each database partition processes the rows that it is responsible for." <br />
<br />
- MDC enables rows with similar values across multiple dimensions to be physically clustered together on disk. <br />
This clustering allows for efficient I/O for typical analytical queries. For example, all rows where Product='car', Region='East', and SaleMonthYear='Jan09' can be stored in the same storage location, known as a block.<br />
<br />
- TP is what we know as "range partitioning" or "list partitioning", and is implemented in a very similar way to what Postgres currently has: "the user can manually define each data partition, including the range of values to include in that data partition." (and MDC automatically allocates storage for it). "Each TP partition is a separate database object (unlike other tables which are a single database object). Consequently, TP supports attaching and detaching a data partition from the TP table. A detached partition becomes a regular table. As well, each data partition can be placed in its own table space, if desired."<br />
<br />
The key point seems to be that all three features are orthogonal to one another, and can be added at table creation time as well as later on. Moreover, sharding is made a first-class citizen and directly supported by the DB. ISTM that we could leverage an evolved version of postgres_fdw (plus some code borrowed from pg_shard and/or PL/Proxy) to this effect.<br />
<br />
<br />
MQTs (materialized query tables), what we call materialized views, are also directly subject to partitioning (and apparently also to sharding).<br />
<br />
Syntax Examples:<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
STARTING '1/1/2006' ENDING '12/31/2006' <br />
EVERY 3 MONTHS<br />
)<br />
<br />
Auto-partitioning by interval would be nice to have.<br />
<br />
<br />
CREATE TABLE orders(id INT, shipdate DATE, …)<br />
PARTITION BY RANGE(shipdate)<br />
(<br />
PARTITION q4_05 STARTING MINVALUE,<br />
PARTITION q1_06 STARTING '1/1/2006',<br />
PARTITION q2_06 STARTING '4/1/2006',<br />
PARTITION q3_06 STARTING '7/1/2006',<br />
PARTITION q4_06 STARTING '10/1/2006' <br />
 ENDING '12/31/2006'<br />
)<br />
<br />
This is equivalent to "VALUES LESS THAN" (technically, VALUES GREATER THAN) and includes an upper limit.<br />
<br />
The partition manipulation syntax (here, addition) is nice, too:<br />
ALTER TABLE orders<br />
ATTACH PARTITION q1_07<br />
STARTING '01/01/2007'<br />
ENDING '03/31/2007'<br />
FROM TABLE neworders<br />
<br />
<br />
References:<br />
* https://www.ibm.com/developerworks/data/library/techarticle/dm-0608mcinerney/<br />
* http://www.ibm.com/developerworks/data/library/techarticle/dm-0605ahuja2/<br />
<br />
=== MySQL-style ===<br />
<br />
Fairly basic; supports RANGE, LIST and HASH.<br />
<br />
CREATE TABLE ti (id INT, amount DECIMAL(7,2), tr_date DATE)<br />
ENGINE=INNODB<br />
PARTITION BY HASH( MONTH(tr_date) )<br />
PARTITIONS 6;<br />
<br />
References:<br />
* http://dev.mysql.com/doc/refman/5.6/en/partitioning-overview.html<br />
<br />
=== Trigger-based ===<br />
First attempts to support auto-partitioning were made using triggers; a minimal sketch follows this list.<br />
* Avoid specific languages such as PL/pgSQL that require 'CREATE LANGUAGE'.<br />
* A C trigger performs 4 to 5 times faster than PL/pgSQL.<br />
* INSERT/COPY reports 0 rows affected when all rows have been routed by the trigger from the master table to child tables.<br />
* Chaining triggers allows tunable behavior for rows that match no partition: raise an error, move them to an overflow table, or create new partitions dynamically.<br />
* constraint_exclusion does not work well with prepared statements. It might be possible to convert CHECKs to One-Time Filter plan nodes if the condition is a variable.<br />
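<br />
A minimal sketch of such a routing trigger, written in PL/pgSQL for brevity (table and column names are hypothetical; the notes above argue for a C implementation):<br />
<br />
 CREATE FUNCTION route_insert() RETURNS trigger AS $$<br />
 BEGIN<br />
   IF NEW.logdate >= DATE '2017-01-01' AND NEW.logdate < DATE '2018-01-01' THEN<br />
     INSERT INTO child_2017 VALUES (NEW.*);<br />
   ELSE<br />
     INSERT INTO overflow VALUES (NEW.*);  -- rows matching no partition go to an overflow table<br />
   END IF;<br />
   RETURN NULL;  -- suppress the insert into the parent; this is why "0 rows affected" is reported<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />
 <br />
 CREATE TRIGGER route_insert BEFORE INSERT ON parent<br />
   FOR EACH ROW EXECUTE PROCEDURE route_insert();<br />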
<br />
= Active Work In Progress =<br />
<br />
<br />
== Syntax ==<br />
Syntax is proposed at "[https://commitfest-old.postgresql.org/action/patch_view?id=207 Syntax for partitioning]", [https://commitfest-old.postgresql.org/action/patch_view?id=266 second version]. The syntax resembles [[Oracle]] and [[MySQL]]. See also [[Todo#Administration]] (Simplify ability to create partitioned tables).<br />
<br />
-- create partitioned table and child partitions at once.<br />
CREATE TABLE parent (...)<br />
PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ]<br />
[ (<br />
PARTITION child<br />
{<br />
VALUES LESS THAN { ... | MAXVALUE } -- for RANGE<br />
| VALUES [ IN ] ( { ... | DEFAULT } ) -- for LIST<br />
}<br />
[ WITH ( ... ) ] [ TABLESPACE tbs ]<br />
[, ...]<br />
) ] ;<br />
<br />
-- add a partition key to a table.<br />
ALTER TABLE parent PARTITION BY [ RANGE | LIST ] ( key ) [ opclass ] [ (...) ] ;<br />
<br />
-- create a new partition on a partitioned table.<br />
CREATE PARTITION child ON parent VALUES ... ;<br />
<br />
-- add a table as a partition.<br />
ALTER TABLE parent ATTACH PARTITION child VALUES ... ;<br />
<br />
-- Detach a partition, leaving it as a normal table.<br />
ALTER TABLE parent DETACH PARTITION child ;<br />
<br />
== Internal representation ==<br />
The on-disk structure is included in the "Syntax for partitioning" patch.<br />
The in-memory structure will be proposed in a future patch.<br />
<br />
=== On-disk structure ===<br />
A new system table, "pg_partition", is added.<br />
Partition keys are stored in it.<br />
<br />
CREATE TABLE pg_catalog.pg_partition<br />
(<br />
partrelid oid NOT NULL, -- partitioned table oid<br />
partopclass oid NOT NULL, -- operator class to compare keys<br />
partkind "char" NOT NULL, -- kind of partition: RANGE or LIST<br />
partkey text, -- partition key expression<br />
<br />
PRIMARY KEY (partrelid),<br />
FOREIGN KEY (partrelid) REFERENCES pg_class (oid),<br />
FOREIGN KEY (partopclass) REFERENCES pg_opclass (oid)<br />
)<br />
WITHOUT OIDS ;<br />
<br />
A new column, "inhvalues", is added to pg_inherits.<br />
The partition values for each partition are stored in it.<br />
<br />
 ALTER TABLE pg_catalog.pg_inherits ADD COLUMN inhvalues anyarray ;<br />
<br />
* A RANGE partition has the upper bound of its range in inhvalues.<br />
* A LIST partition has an array with multiple elements in inhvalues.<br />
* An overflow partition has an empty array in inhvalues.<br />
* A normal inherited table has NULL in inhvalues.<br />
<br />
=== In-memory structure ===<br />
A cached list of partitions is sorted by partition values and stored in the relcache for the parent table. Changes to the partitions would need to invalidate the parent's cache entries to keep the cache accurately maintained.<br />
<br />
== Operations ==<br />
=== INSERT ===<br />
INSERT triggers will be replaced with a specialized tuple-routing feature that uses the in-memory structure. Tuples will be routed in O(log N) time. This also solves the "0 rows affected" problem with INSERT triggers.<br />
<br />
=== SELECT, UPDATE, DELETE ===<br />
CHECK constraints will continue to be used for a while.<br />
<br />
This would be improved using the in-memory structure: instead of CHECK constraints on each child table, we can use a sorted list in the parent table. Constraint exclusion can then run in O(log N) instead of the current O(N).<br />
<br />
=== VACUUM, CLUSTER, REINDEX ===<br />
We don't expand these commands to child tables for now, but they might have to be expanded in the same way TRUNCATE was.<br />
<br />
= Future improvements =<br />
These items are hard to address in 9.0, but should continue to be improved in future releases.<br />
<br />
=== Syntax ===<br />
* Support SPLIT and MERGE for existing partitions. See also [http://archives.postgresql.org/pgsql-hackers/2009-10/msg01831.php Kedar's patch]<br />
* Support UPDATE of partition keys and values.<br />
* Support adding a partition between existing partitions. It requires SPLIT feature.<br />
* Support sub-partitions.<br />
* Support some partition kinds for GIS types. For example, "PARTITION BY GIST" would hold partition keys as a GiST tree in the in-memory structure.<br />
* Support HASH partitions. Each partition could be a FOREIGN TABLE in [[SQL/MED]]. In other words, it is [[PL/Proxy]] integration.<br />
* Support CREATE TABLE AS -- CREATE TABLE tbl PARTITION BY ... AS SELECT ...;<br />
<br />
=== Executor ===<br />
* SELECT FOR SHARE/UPDATE for parent tables.<br />
* Prepared statements that use partition keys as placeholders.<br />
** An idea is to convert check constraints into One-Time_Filter [http://archives.postgresql.org/message-id/20081013172100.87A1.52131E4D@oss.ntt.co.jp]<br />
* Unique constraints over multiple partitions, when each partition has a unique index on a set or superset of the partition keys.<br />
* Unique constraints over multiple partitions in the general case (typically called a "global index").<br />
<br />
=== Planner ===<br />
* Optimization for min/max, LIMIT + ORDER BY, GROUP BY on partition keys.<br />
* Optimization when constraint exclusion is used with stable or volatile functions. It is a very common case that the partition key is a timestamp compared with now().<br />
* Join optimization for two partitioned tables.<br />
<br />
= Third-Party Tools =<br />
<br />
=== PG Partition Manager ===<br />
* [https://github.com/keithf4/pg_partman Project Home Page]<br />
* This is an extension that automates time- and serial-based partitioning (basically interval partitioning, setting up the right triggers for you).<br />
* Handles initial setup, partitioning existing data, dropping unneeded child tables, and undoing partitioning.<br />
<br />
[[Category:Table partitioning]]</div>Jerhttps://wiki.postgresql.org/index.php?title=User:Jer&diff=30917User:Jer2017-09-25T18:07:03Z<p>Jer: Created page with "Jeremy Schneider - [https://about.me/jeremy_schneider about.me] I started with Oracle's database engine but these days I'm focused on PostgreSQL. You can also find me on [ht..."</p>
<hr />
<div>Jeremy Schneider - [https://about.me/jeremy_schneider about.me]<br />
<br />
I started with Oracle's database engine but these days I'm focused on PostgreSQL.<br />
<br />
You can also find me on [https://twitter.com/jer_s twitter] and [https://www.ardentperf.com my blog].</div>Jerhttps://wiki.postgresql.org/index.php?title=New_in_postgres_10&diff=30916New in postgres 102017-09-25T17:25:41Z<p>Jer: /* What's New In PostgreSQL 10 */ add useful links about new features in general</p>
<hr />
<div>= What's New In PostgreSQL 10 =<br />
<br />
General Links:<br />
* [https://www.postgresql.org/docs/10/static/release-10.html Release Notes]<br />
* [http://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL_10_New_Features_en_20170522-1.pdf PostgreSQL 10 New Features With Examples] - HPE.com<br />
* [[PostgreSQL10_Roadmap]]<br />
<br />
== Big Data ==<br />
<br />
=== Native Partitioning ===<br />
<br />
In 10, partitioning is now a declared attribute of the table:<br />
<br />
 CREATE TABLE table_name ( ... )<br />
 [ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) } [, ...] ) ]<br />
<br />
 CREATE TABLE table_name<br />
 PARTITION OF parent_table [ ( ... ) ]<br />
 FOR VALUES partition_bound_spec<br />
<br />
'''Example'''<br />
<br />
Before:<br />
CREATE TABLE padre (<br />
id SERIAL,<br />
pais INTEGER,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
);<br />
<br />
CREATE TABLE hija_2017 (<br />
CONSTRAINT pk_2017 PRIMARY KEY (id),<br />
   CONSTRAINT ck_2017 CHECK (fch_creado >= DATE '2017-01-01' AND fch_creado < DATE '2018-01-01')<br />
) INHERITS (padre);<br />
CREATE INDEX idx_2017 ON hija_2017 (fch_creado);<br />
<br />
Today:<br />
CREATE TABLE padre (<br />
id SERIAL NOT NULL,<br />
nombre TEXT NOT NULL,<br />
fch_creado TIMESTAMPTZ NOT NULL<br />
)<br />
PARTITION BY RANGE ( id );<br />
<br />
CREATE TABLE hijo_0<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (unbounded) TO (10);<br />
<br />
CREATE TABLE hijo_1<br />
PARTITION OF padre (id, PRIMARY KEY (id), UNIQUE (nombre))<br />
FOR VALUES FROM (10) TO (unbounded);<br />
<br />
This means that users no longer need to create triggers for routing data; it's all handled by the system.<br />
<br />
'''Another Example:'''<br />
<br />
For example, we might decide to partition the `book_history` table, probably a good idea since that table is liable to accumulate data forever. Since it's a log table, we'll range partition it, with one partition per month.<br />
<br />
First, we create a "master" partition table, which will hold no data but forms a template for the rest of the partitions:<br />
<br />
libdata=# CREATE TABLE book_history (<br />
book_id INTEGER NOT NULL,<br />
status BOOK_STATUS NOT NULL,<br />
period TSTZRANGE NOT NULL )<br />
PARTITION BY RANGE ( lower (period) );<br />
<br />
Then we create several partitions, one per month:<br />
<br />
libdata=# CREATE TABLE book_history_2016_09<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-09-01 00:00:00') TO ('2016-10-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_08<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-08-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
CREATE TABLE<br />
libdata=# CREATE TABLE book_history_2016_07<br />
PARTITION OF book_history<br />
FOR VALUES FROM ('2016-07-01 00:00:00') TO ('2016-09-01 00:00:00');<br />
ERROR: partition "book_history_2016_07" would overlap partition "book_history_2016_08"<br />
<br />
As you can see, the system even prevents accidental overlap. New rows will automatically be stored in the correct partition, and SELECT queries will search the appropriate partitions.<br />
<br />
* [https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 commit]<br />
* [https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#ddl-partitioning-declarative Documentation]<br />
* [https://www.keithf4.com/postgresql-10-built-in-partitioning/ Built-in Partitioning]<br />
<br />
=== Additional Parallelism ===<br />
<br />
Some additional plan nodes can be executed in parallel, particularly Index Scans.<br />
<br />
'''Example:'''<br />
<br />
For example, if we want to search financial transaction history by an indexed column, we can now execute the query in one-quarter of the time by using four parallel workers:<br />
<br />
accounts=# \timing<br />
Timing is on.<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 group by bid;<br />
...<br />
Time: 324.903 ms<br />
<br />
accounts=# set max_parallel_workers_per_gather=4;<br />
SET<br />
Time: 0.822 ms<br />
accounts=# SELECT bid, count(*) FROM account_history<br />
WHERE delta > 1000 GROUP BY bid;<br />
...<br />
Time: 72.864 ms<br />
<br />
(this assumes an index on bid, delta)<br />
<br />
Links:<br />
<br />
* [http://rhaas.blogspot.com.ar/2017/03/parallel-query-v2.html Parallel Query v2]<br />
<br />
=== Additional FDW Push-Down ===<br />
<br />
=== Faster Analytics Queries ===<br />
<br />
== Replication and Scaling ==<br />
<br />
=== Logical Replication ===<br />
<br />
Streaming replication is fast and secure, and it is a perfect mechanism for high availability/disaster recovery needs. Since it works on the whole instance, replicating only part of the primary server is not possible, nor is it possible to write on the secondary. Logical replication allows us to tackle those use cases.<br />
<br />
'''Example:'''<br />
<br />
Suppose I decide I want to replicate just the fines and loans tables from my public library database to the billing system so that they can process amounts owed. I would create a publication from those two tables with this command:<br />
<br />
libdata=# CREATE PUBLICATION financials FOR TABLE ONLY loans, ONLY fines;<br />
CREATE PUBLICATION<br />
<br />
Then, in the billing database, I would create two tables that look identical to the tables I'm replicating, and have the same names. They can have additional columns and a few other differences. In particular, since I'm not copying the patrons or books tables, I'll want to drop some foreign keys that the origin database has. I also need to create any special data types or other database artifacts required for those tables. Often the easiest way to do this is selective use of the `pg_dump` and `pg_restore` backup utilities:<br />
<br />
origin# pg_dump libdata -Fc -f /netshare/libdata.dump<br />
<br />
replica# pg_restore -d libdata -s -t loans -t fines /netshare/libdata.dump<br />
<br />
Following that, I can start a Subscription to those two tables:<br />
<br />
libdata=# CREATE SUBSCRIPTION financials<br />
CONNECTION 'dbname=libdata user=postgres host=172.17.0.2'<br />
PUBLICATION financials;<br />
NOTICE: synchronized table states<br />
NOTICE: created replication slot "financials" on publisher<br />
CREATE SUBSCRIPTION<br />
<br />
This will first copy a snapshot of the data currently in the tables, and then start catching up from the transaction log. Once it's caught up, you can check status in pg_stat_subscription:<br />
<br />
libdata=# SELECT * FROM pg_stat_subscription;<br />
-[ RECORD 1 ]---------+---------------------<br />
subid | 16475<br />
subname | financials<br />
pid | 167<br />
relid |<br />
received_lsn | 0/1FBEAF0<br />
last_msg_send_time | 2017-06-07 00:59:44<br />
last_msg_receipt_time | 2017-06-07 00:59:44<br />
latest_end_lsn | 0/1FBEAF0<br />
latest_end_time | 2017-06-07 00:59:44<br />
<br />
Blogs:<br />
<br />
* [https://blog.2ndquadrant.com/logical-replication-postgresql-10/ Logical Replication in PostgreSQL 10]<br />
<br />
=== Quorum Commit for Synchronous Replication ===<br />
Version 9.6 introduced the remote_apply synchronous commit level,<br />
<br />
synchronous_commit = 'remote_apply'<br />
<br />
and version 10 improves the synchronous_standby_names GUC by adding the FIRST and ANY keywords:<br />
<br />
 synchronous_standby_names = 'ANY 2 (node1, node2, node3)'<br />
 synchronous_standby_names = 'FIRST 2 (node1, node2)'<br />
<br />
FIRST matches the previous behaviour: standby priority follows the list order when choosing which nodes must confirm. ANY means that any of the listed nodes can provide the required quorum. This gives extra flexibility to complex replication setups.<br />
<br />
=== Connection "Failover" in libpq ===<br />
<br />
[http://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/ Implement failover on libpq connect level]<br />
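<br />
With this change, a client can list several hosts and ask for a read-write server; an illustrative connection string (host and database names are hypothetical):<br />
<br />
 psql 'postgresql://node1:5432,node2:5432/mydb?target_session_attrs=read-write'<br />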
<br />
=== Traceable Commit ===<br />
<br />
[https://blog.2ndquadrant.com/traceable-commit-postgresql-10/ Traceable commit for PostgreSQL 10]<br />
<br />
=== Physical Replication ===<br />
<br />
Improved performance of the replay of 2-phase commits<br />
<br />
Improved performance of replay when access exclusive locks are held on objects on the standby server. This can significantly improve performance in cases where temporary tables are being used.<br />
<br />
== Administration ==<br />
<br />
=== Compression support for pg_receivewal ===<br />
<br />
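pg_receivewal (the renamed pg_receivexlog; see below) can now compress the WAL it streams; an illustrative invocation (the target directory is hypothetical):<br />
<br />
 pg_receivewal -D /archive/wal --compress=9<br />
<br />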
== SQL features ==<br />
<br />
=== Identity Columns ===<br />
<br />
[https://blog.2ndquadrant.com/postgresql-10-identity-columns/ PostgreSQL 10 identity columns explained]<br />
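<br />
An illustrative sketch of the SQL-standard syntax (table and column names are hypothetical):<br />
<br />
 CREATE TABLE items (<br />
     id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,<br />
     payload text<br />
 );<br />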
<br />
=== Crash Safe, Replicable Hash Indexes ===<br />
<br />
=== Transition Tables for Triggers ===<br />
<br />
This feature makes AFTER STATEMENT triggers both useful and performant by<br />
exposing, as appropriate, the old and new rows to queries. Before this feature,<br />
AFTER STATEMENT triggers had no direct access to these, and the workarounds were<br />
byzantine and had poor performance. Much trigger logic can now be written as<br />
AFTER STATEMENT, avoiding the need to do the expensive context switches at each<br />
row that FOR EACH ROW triggers require.<br />
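<br />
A minimal sketch of an AFTER STATEMENT trigger using a transition table (all object names are illustrative):<br />
<br />
 CREATE TABLE audit_log (action text, row_count bigint, at timestamptz DEFAULT now());<br />
 <br />
 CREATE FUNCTION log_update_count() RETURNS trigger AS $$<br />
 BEGIN<br />
   -- newtab exposes every row changed by the triggering statement<br />
   INSERT INTO audit_log (action, row_count) SELECT 'UPDATE', count(*) FROM newtab;<br />
   RETURN NULL;<br />
 END;<br />
 $$ LANGUAGE plpgsql;<br />
 <br />
 CREATE TRIGGER track_updates<br />
   AFTER UPDATE ON accounts<br />
   REFERENCING NEW TABLE AS newtab<br />
   FOR EACH STATEMENT EXECUTE PROCEDURE log_update_count();<br />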
<br />
== XML and JSON == <br />
<br />
=== XMLTable ===<br />
<br />
[https://blog.2ndquadrant.com/xmltable-intro/ XMLTABLE] is a SQL-standard feature that allows transforming an XML document to table format,<br />
making it much easier to process XML data in the database.<br />
Coupled with foreign tables pointing to external XML data, this can greatly simplify ETL processing.<br />
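<br />
A minimal sketch, assuming a table xmldata with an xml column named data:<br />
<br />
 SELECT xt.*<br />
 FROM xmldata,<br />
      XMLTABLE('/rows/row' PASSING data<br />
               COLUMNS id int PATH '@id',<br />
                       name text PATH 'name') AS xt;<br />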
<br />
=== Full Text Search support for JSON and JSONB ===<br />
<br />
You can now create Full Text Indexes on JSON and JSONB columns.<br />
<br />
This involves converting the JSONB field to a `tsvector`, then creating an specific language full-text index on it:<br />
<br />
libdata=# CREATE INDEX bookdata_fts ON bookdata<br />
USING gin (( to_tsvector('english',bookdata) ));<br />
CREATE INDEX<br />
<br />
Once that's set up, you can do full-text searching against all of the values in your JSON documents:<br />
<br />
libdata=# SELECT bookdata -> 'title'<br />
FROM bookdata<br />
WHERE to_tsvector('english',bookdata) @@ to_tsquery('duke'); <br />
------------------------------------------<br />
"The Tattooed Duke"<br />
"She Tempts the Duke"<br />
"The Duke Is Mine"<br />
"What I Did For a Duke"<br />
<br />
== Security ==<br />
<br />
=== SCRAM Authentication ===<br />
<br />
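To use it, both password_encryption and the pg_hba.conf authentication method are set to scram-sha-256; an illustrative configuration sketch:<br />
<br />
 # postgresql.conf<br />
 password_encryption = scram-sha-256<br />
 <br />
 # pg_hba.conf<br />
 host    all    all    0.0.0.0/0    scram-sha-256<br />
<br />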
=== New "monitoring" roles for permission grants ===<br />
<br />
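For example, a monitoring agent no longer needs superuser; an illustrative grant (the receiving role is hypothetical):<br />
<br />
 GRANT pg_monitor TO monitoring_agent;<br />
<br />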
=== Restrictive Policies for Row Level Security ===<br />
<br />
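Restrictive policies are combined with AND rather than OR; a minimal sketch (table, policy, and column names are illustrative):<br />
<br />
 CREATE POLICY own_rows_only ON accounts<br />
     AS RESTRICTIVE<br />
     USING (owner = current_user);<br />
<br />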
== Performance ==<br />
<br />
=== Cross-column Statistics ===<br />
<br />
Real-world data frequently contains correlated values in table columns, which can easily fool the query planner into thinking WHERE clauses are more selective than they really are, causing some queries to become very slow. [https://www.postgresql.org/docs/devel/static/sql-createstatistics.html Multivariate statistics objects] can be used to let the planner learn about such correlations, which protects it against making those mistakes. [https://www.postgresql.org/docs/devel/static/planner-stats.html#planner-stats-extended This manual section] explains the feature in more detail, and [https://www.postgresql.org/docs/devel/static/multivariate-statistics-examples.html this section] shows some examples. This feature represents an advance in the state of the art for all SQL databases.<br />
<br />
[https://blog.2ndquadrant.com/pg-phriday-crazy-correlated-column-crusade/ PG Phriday: Crazy Correlated Column Crusade]<br />
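<br />
A minimal sketch of creating an extended statistics object (table and column names are illustrative):<br />
<br />
 CREATE STATISTICS zip_city_stats (dependencies) ON zip, city FROM addresses;<br />
 ANALYZE addresses;  -- collects the multivariate statistics<br />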
<br />
=== Latch Wait times in pg_stat_activity ===<br />
<br />
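Wait information, including latch waits, is exposed through pg_stat_activity; for example:<br />
<br />
 SELECT pid, wait_event_type, wait_event FROM pg_stat_activity;<br />
<br />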
=== Query Planner Improvements ===<br />
<br />
In join planning, detect cases where the inner side of the join can only produce a single row for each outer side row. During execution this allows early skipping to the next outer row once a match is found. This can also remove the requirement for mark and restore during Merge Joins, which can significantly improve performance in some cases.<br />
<br />
== Other Features ==<br />
<br />
=== ICU Collation Support ===<br />
<br />
[https://blog.2ndquadrant.com/icu-support-postgresql-10/ More robust collations with ICU support in PostgreSQL 10]<br />
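<br />
Assuming PostgreSQL was built with ICU support, an illustrative collation definition:<br />
<br />
 CREATE COLLATION german (provider = icu, locale = 'de-DE');<br />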
<br />
=== amcheck B-Tree consistency/corruption checking tool ===<br />
<br />
[https://www.postgresql.org/docs/10/static/amcheck.html PostgreSQL 10 amcheck documentation]<br />
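<br />
A minimal usage sketch against a system index:<br />
<br />
 CREATE EXTENSION amcheck;<br />
 SELECT bt_index_check('pg_class_oid_index'::regclass);<br />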
<br />
== Backwards-Incompatible Changes ==<br />
<br />
Version 10 has a number of backwards-incompatible changes which may affect system administration, particularly around backup automation. Users should specifically test for the incompatibilities before upgrading in production.<br />
<br />
=== Change in Version Numbering ===<br />
<br />
As of Version 10, PostgreSQL no longer uses three-part version numbers, but is shifting to two-part version numbers. This means that version 10.1 will be the first patch update to PostgreSQL 10, ''instead of'' a new major version. Scripts and tools which detect PostgreSQL version may be affected.<br />
<br />
The community strongly recommends that tools use either the GUC [https://www.postgresql.org/docs/9.2/static/runtime-config-preset.html server_version_num] (on the backend), or the libpq status function [https://www.postgresql.org/docs/9.2/static/libpq-status.html PQserverVersion] in libpq to get the server version. This returns a six-digit integer version number which will be consistently sortable and comparable between versions 9.6 and 10.<br />
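<br />
For example:<br />
<br />
 SHOW server_version_num;  -- returns 100001 on version 10.1<br />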
<br />
{| class="wikitable" style="text-align: center;"<br />
! Version String<br />
! Major Version<br />
! Update Number<br />
! version_num<br />
|-<br />
|9.6.0<br />
|9.6<br />
|0<br />
|090600<br />
|-<br />
|9.6.3<br />
|9.6<br />
|3<br />
|090603<br />
|-<br />
|10.0<br />
|10<br />
|0<br />
|100000<br />
|-<br />
|10.1<br />
|10<br />
|1<br />
|100001<br />
|}<br />
<br />
* [http://www.databasesoup.com/2016/05/changing-postgresql-version-numbering.html Changing Postgres Version Numbering]<br />
<br />
=== Renaming of "xlog" to "wal" Globally (and location/lsn) ===<br />
<br />
In order to avoid confusion leading to data loss, everywhere we previously used the abbreviation "xlog" to refer to the transaction log, including directories, functions, and parameters for executables, we now use "wal". Similarly, the word "location" in function names, where used to refer to transaction log location, has been replaced with "lsn".<br />
<br />
This will require many users to reprogram custom backup and transaction log management scripts, as well as to update replication monitoring.<br />
<br />
Two directories have been renamed:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Directory <br />
! 10 Directory<br />
|-<br />
| pg_xlog || pg_wal<br />
|-<br />
| pg_clog || pg_xact<br />
|}<br />
<br />
Additionally, depending on where your installation packages come from, the default activity log location may have been renamed from "pg_log" to just "log".<br />
<br />
Many administrative functions have been renamed to use "wal" and "lsn":<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
! 9.6 Function Name <br />
! 10 Function Name <br />
|-<br />
| pg_current_xlog_flush_location || pg_current_wal_flush_lsn<br />
|-<br />
| pg_current_xlog_insert_location || pg_current_wal_insert_lsn<br />
|-<br />
| pg_current_xlog_location || pg_current_wal_lsn<br />
|-<br />
| pg_is_xlog_replay_paused || pg_is_wal_replay_paused<br />
|-<br />
| pg_last_xlog_receive_location || pg_last_wal_receive_lsn<br />
|-<br />
| pg_last_xlog_replay_location || pg_last_wal_replay_lsn<br />
|-<br />
| pg_switch_xlog || pg_switch_wal<br />
|-<br />
| pg_xlog_location_diff || pg_wal_lsn_diff<br />
|-<br />
| pg_xlog_replay_pause || pg_wal_replay_pause<br />
|-<br />
| pg_xlog_replay_resume || pg_wal_replay_resume<br />
|-<br />
| pg_xlogfile_name || pg_walfile_name<br />
|-<br />
| pg_xlogfile_name_offset || pg_walfile_name_offset<br />
|}<br />
<br />
Some system views and functions have had attribute renames:<br />
* pg_stat_replication:<br />
** write_location -> write_lsn<br />
** sent_location -> sent_lsn<br />
** flush_location -> flush_lsn<br />
** replay_location -> replay_lsn<br />
* pg_create_logical_replication_slot: wal_position -> lsn<br />
* pg_create_physical_replication_slot: wal_position -> lsn<br />
* pg_logical_slot_get_changes: location -> lsn<br />
* pg_logical_slot_peek_changes: location -> lsn<br />
<br />
Several command-line executables have had parameters renamed:<br />
<br />
* pg_receivexlog has been renamed to pg_receivewal.<br />
* pg_resetxlog has been renamed to pg_resetwal.<br />
* pg_xlogdump has been renamed to pg_waldump.<br />
* initdb and pg_basebackup have a --waldir option rather than --xlogdir.<br />
* pg_basebackup now has --wal-method rather than --xlog-method.<br />
<br />
=== Drop Support for FE/BE 1.0 Protocol ===<br />
<br />
PostgreSQL's original [https://www.postgresql.org/docs/current/static/protocol.html client/server protocol], version 1.0, will no longer be supported as of PostgreSQL 10. Since version 1.0 was superseded by version 2.0 in 1998, it is unlikely that any existing clients still use it.<br />
<br />
=== Change Defaults around Replication and pg_basebackup ===<br />
<br />
=== Drop Support for Floating Point Timestamps ===<br />
<br />
=== Remove contrib/tsearch2 ===<br />
<br />
Tsearch2, the older, contrib-module version of our built-in full text search, has been removed from contrib and will no longer be built as part of PostgreSQL packages. Users who have been continuously upgrading since before version 8.3 will need either to manually modify their databases to use the built-in tsearch objects before upgrading to PostgreSQL 10, or to compile tsearch2 themselves from scratch and install it.<br />
<br />
=== Drop pg_dump Support for Databases Older than 8.0 ===<br />
<br />
Databases running on PostgreSQL version 7.4 and earlier will not be supported by 10's pg_dump or pg_dumpall. If you need to convert a database that old, use version 9.6 or earlier to upgrade it in two stages.</div>Jer