May 2015 Fsync Permissions Bug

From PostgreSQL wiki

May 2015 fsync Permissions Bug FAQ

On May 22, 2015, the PostgreSQL project released a set of updates to all supported versions of PostgreSQL. One of the fixes included in this batch of updates forced fsyncing of all PostgreSQL files on a restart after a crash. This fix was added to prevent certain kinds of data corruption which can occur if a system hosting a database has several failures in a row.

Unfortunately, this fix breaks some users' PostgreSQL setups because of file permissions: PostgreSQL can refuse to restart after an unexpected shutdown, or when restoring from a binary backup (PITR).

This issue is slated to be fixed in the 2015-06-04 Update Release.

Who is affected by this bug?

Users who:

  1. applied the 9.4.2, 9.3.7, 9.2.11, 9.1.16 and/or 9.0.20 PostgreSQL updates
  2. have one or more files or directories, or symlinks to one or more files or directories, under the postgres data directory (PGDATA) that are not owned by, or not writable by, the "postgres" user (or other installation owner).

Note that condition 2 is common to SSL-enabled Debian and Ubuntu installations of PostgreSQL 9.1, 9.0, and earlier, but may affect other users as well. Most users on other platforms are not affected, as all files and links under PGDATA are owned by the "postgres" user by default.
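One way to check for condition 2 ahead of time is to list everything under PGDATA, following symlinks, that is not owned by the installation owner. The following is a minimal sketch, not an official tool: the function name list_foreign_owned is hypothetical, and the path and user in the usage line are the Ubuntu 9.1 defaults used elsewhere on this page, so adjust them for your installation.

```shell
# Sketch only: list entries under PGDATA (following symlinks) whose
# owner is not the installation owner. The function name is hypothetical.
list_foreign_owned() {
    pgdata="$1"   # data directory, e.g. /var/lib/postgresql/9.1/main
    owner="$2"    # installation owner (user name or numeric uid), e.g. postgres
    # -L makes find follow symlinks, so a link pointing into /etc/ssl
    # is judged by the ownership of the file it points to.
    find -L "$pgdata" ! -user "$owner" -print
}

# Usage (as root); prints nothing if every entry is owned by postgres:
# list_foreign_owned /var/lib/postgresql/9.1/main postgres
```

Any path this prints is a candidate for the failure described below. Ownership is only part of the condition, so an empty result does not by itself prove that every file is writable.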

What are the symptoms?

If you experience the bug, PostgreSQL will refuse to restart after a crash, or a restore from binary backup, with an error message similar to the following:

* Starting PostgreSQL 9.1 database server
* The PostgreSQL server failed to start. Please check the log output:
2015-05-26 03:27:20 UTC [331-1] LOG:  database system was interrupted; last known up at 2015-05-21 19:56:58 UTC
2015-05-26 03:27:20 UTC [331-2] FATAL:  could not open file "/etc/ssl/certs/ssl-cert-snakeoil.pem": Permission denied
2015-05-26 03:27:20 UTC [330-1] LOG:  startup process (PID 331) exited with exit code 1
2015-05-26 03:27:20 UTC [330-2] LOG:  aborting startup due to startup process failure

For more information, see the original bug report.
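To check whether a failed start matches the log excerpt above, the server log can be searched for the FATAL line and the offending path pulled out of it. This is a rough sketch: the function name fsync_permission_failures is hypothetical, and the log location in the usage line is an assumption based on the usual Debian/Ubuntu layout, so it may differ on your system.

```shell
# Sketch only: print each path named in a
#   FATAL:  could not open file "...": Permission denied
# line of the given log file. The function name is hypothetical.
fsync_permission_failures() {
    logfile="$1"
    grep -E 'FATAL: +could not open file "[^"]*": Permission denied' "$logfile" \
        | sed -E 's/.*could not open file "([^"]*)": Permission denied.*/\1/'
}

# Usage (log path is an assumption; adjust for your installation):
# fsync_permission_failures /var/log/postgresql/postgresql-9.1-main.log
```

Each path printed is a file the startup process could not open, and is therefore one to fix with the workaround below.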

I've hit this bug and I can't restart Postgres. What do I do?

As a temporary workaround, make every file symlinked from PGDATA writable by the Postgres user. For example, on Ubuntu with PostgreSQL 9.1, the following should work; it replaces the symlinked SSL certificate and key with local copies owned by postgres:

WARNING: Make sure these configuration files are symbolic links before executing these commands! If someone has customized the server.crt or server.key file, following these steps will erase those customized files. It's a good idea to back up server.crt and server.key before removing them.
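The check and backup recommended in this warning could be scripted roughly as follows, before running the commands below. This is a sketch under assumptions: the helper name check_or_backup and the .bak suffix are made up for illustration, and readlink is assumed to be available, as it is on Debian and Ubuntu.

```shell
# Sketch only: for each file given, report where a symlink points, or
# save a copy of a regular file so a customized cert or key is not lost.
# The helper name and the .bak suffix are hypothetical.
check_or_backup() {
    for f in "$@"; do
        if [ -L "$f" ]; then
            echo "$f is a symlink to $(readlink "$f")"
        elif [ -f "$f" ]; then
            cp -p "$f" "$f.bak"
            echo "$f is a regular file; saved copy as $f.bak"
        else
            echo "$f not found"
        fi
    done
}

# Usage (as root, from the PGDATA directory):
# check_or_backup server.crt server.key
```

If either file turns out to be a regular file, it has probably been customized, and the removal and copy steps below should not be applied to it as written.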

(as root)
# go to PGDATA directory
cd /var/lib/postgresql/9.1/main 
ls -l server.crt server.key

# confirm both of those files are symbolic links
# to files in /etc/ssl before going further

# remove symlinks to SSL certs
rm server.crt
rm server.key 

# copy the SSL certs to the local directory
cp /etc/ssl/certs/ssl-cert-snakeoil.pem server.crt
cp /etc/ssl/private/ssl-cert-snakeoil.key server.key

# set permissions on ssl certs
# and postgres ownership on everything else
# just in case
chown postgres *
chmod 640 server.crt server.key

service postgresql start

You will need to adapt the above example to your specific circumstances, but that should give you a general idea of what to do. The requirement is that the postgres user must have write access to everything in PGDATA or symlinked from PGDATA.
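The write-access requirement can be checked directly by running a test as the postgres user itself. The sketch below walks PGDATA, following symlinks, and prints every path the current user cannot write to. The function name list_unwritable is hypothetical, and file names containing newlines are not handled. Run it as postgres rather than root, since root passes the write test on almost everything.

```shell
# Sketch only: print every path under the given directory (following
# symlinks) that the user running the script cannot write to.
# Dangling symlinks are reported as well, since their target is missing.
list_unwritable() {
    find -L "$1" -print | while IFS= read -r p; do
        [ -w "$p" ] || printf '%s\n' "$p"
    done
}

# Usage (as the postgres user); no output means the requirement is met:
# list_unwritable /var/lib/postgresql/9.1/main
```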

Should I not apply the updates?

The 9.4.2 and 9.3.7 updates fix a serious bug which causes unrecoverable data loss under some circumstances. As such, the PostgreSQL project considers a temporary workaround involving file permissions to be a less serious risk than the fixed bugs, and recommends applying the updates once you've verified and changed file permissions, if required.

Users who are not at risk from the fsync permissions issue are also advised to apply the update at their next scheduled downtime.

Will you be fixing it soon?

The PostgreSQL project expects to release another update very soon which addresses the file permissions issue. The expected release date for this update is June 4.