Committing checklist

From PostgreSQL wiki
Jump to navigationJump to search

This document is an attempt to list common checks that PostgreSQL project Committers may want to adopt as part of a checklist of things to check before pushing. There are certain classic mistakes that even experienced committers have been known to make occasionally. In the real world, many mistakes happen when a step is skipped over during a routine process, perhaps caused by a seemingly insignificant last minute change. It's important to learn from these mistakes.

This checklist isn't intended as something that committers will adopt wholesale. Rather, it is intended as a starting point for creating your own semi-customized checklist. Since your final checklist is supposed to be used more or less mechanically, it shouldn't ever be too long, and should be organized into sections to make it easier to skip items where irrelevant. In short, if it's worth adopting something as a standard practice that you return to again and again, it's probably also worth writing that down, to formalize it. Use discretion when deciding what makes sense for you.

Basic checks

  • Double-check release build compiler warnings.
  • make check-world.
    • You may want to speed this up by using the following recipe:
make -j16 -s install;make -Otarget -j10 -s check-world  && echo "quick make-check world success" || echo "quick make-check world failure"
  • Consider the need for a catversion bump.
  • Don't assume that you haven't broken the doc build if you make even a trivial doc change.
    • Removing a GUC can break instances in the release notes where they're referenced.
    • Even grep can miss this, since references to the GUC will have dashes rather than underscores, plus possibly other variations.
  • Validate *printf calls for trailing newlines.
  • Spellcheck the patch.
  • Verify that long lines are not better broken into several shorter lines:
git diff origin/master | grep -E '^(\+|diff)' | sed 's/^+//' | expand -t4 | awk "length > 78 || /^diff/"
  • Run pgindent, pgperltidy, and reformat-dat-files on changed files; keep the changes minimal.
  • Run pgperlcritic on modified Perl files.
  • Update version numbers, if needed:
CATALOG_VERSION_NO, PG_CONTROL_VERSION, XLOG_PAGE_MAGIC, PGSTAT_FILE_FORMAT_ID
  • Update function/other OIDs, if needed;

Regression test checks

  • When adding core regression test files, make sure that they're added to both serial and parallel schedules.

(But release 14 and later have only the parallel schedule.)

  • Look for alternative output files for any regression test you're updating the output of.
    • Some tests have alternative output files to work around portability issues.
    • Most of the time it works to just apply the same patch to the other variants as the delta you're observing for the output file that's relevant to your own platform.
    • Occasionally you may have to just see what the buildfarm says.

Git checks

Basic

  • Do a dry run before really pushing by using --dry-run.
  • Look at "git status"; anything missing?
  • Author and committer timestamps should match.

This can be an issue if you're in the habit of rebasing, or apply a patch with "git am". Make sure that your setup displays both in "git log", by specifying "--pretty=fuller", or changing the git format config. The easiest way to make both timestamps match is to amend the commit like so:

git commit --amend --reset-author

If you have "autosetuprebase = always" in your git config, then a last minute "git pull" could cause a rebase, which could cause author and committer timestamps to diverge a bit. In practice, small differences between author and committer timestamp are not considered to be a problem.

Discussion: https://postgr.es/m/XXXXXXXXXXX
Back-patch depth?
What should the release notes say?
Credit any reviewer.
  • When making references to other commits, it's a good idea to use the first 9 chars of the commit SHA. Fewer than 9 means there will be no hyperlink in the HTTP interface. More than 9 is not required.
  • Note compatibility issues in commit message, so that they'll get picked up later, when release notes are written.
  • Check merge with master (not applicable to commits).
  • If you're using a dedicated ssh key with a passphrase, you may find it useful to deliberately disable it when you're done pushing:
$ ssh-add -d ~/.ssh/id_rsa_postgres

Backpatching and git

Commit messages for multiple branches should be identical when back-patching, in order to have tooling recognize the redundancy for purposes of compiling release notes, and other things of that nature.

  • Easiest way to get commit metadata consistent is to not worry about commit messages outside of the master branch at first. Commit message on backbranches could initially be something like "pending 9.6".
  • Perform the following procedure on each back branch when you're done, by checking out each individual branch in gitmaster local clone, and doing this for master branch commit which has good commit message:
git commit --amend --reset-author -C <commit>

You now have the same commit message on each branch. This means that the src/tools/git_changelog utility script will present the commits from each affected local branch together, as one logical change. (This script is used as a starting point when writing back branch release notes. Note that the concept of "one logical change" is not a standard git concept.)

  • Use git push origin : --dry-run to dry-run pushing all branches at once. Once satisfied, remove --dry-run to actually push. --dry-run is doubly important if you push each branch individually.

Maintaining ABI compatibility while backpatching

Avoid breaking ABI compatibility. It's unacceptable for extensions built against an earlier point release to break in a more recent point release.

  • You can only really change the signature of a function with local linkage, perhaps with a few rare exceptions.
  • You cannot modify any struct definition in src/include/*. If any new members must be added to a struct, put them at the end in backbranches. It's okay to have a different struct layout in master. Even then, extensions that allocate the struct can break via a dependency on its size.
  • Move new enum values to the end.

See this message for more considerations on ABI preservation.

GUC checks

  • When adding a new GUC, postgresql.conf.sample needs to be updated, too.
  • Is the GUC group the right one?

Advanced smoke tests

  • Valgrind memcheck + "make installcheck".
  • CLOBBER_CACHE_ALWAYS.
  • When doing anything that touches WAL-logging, consider creating a replica, and making sure that wal_consistency_checking=all passes on replica while master runs "make installcheck". WAL_DEBUG makes any bug that this throws up easier to isolate.
  • "#define COPY_PARSE_PLAN_TREES" and "#define WRITE_READ_PARSE_PLAN_TREES" can catch omissions or other mistakes when "src/backend/nodes/*" were changed.
  • check for unaligned access with things from c.h like -fsanitize=alignment
  • sqlsmith (for grammar changes, and ??)

Policies

The following list contains frequently cited, but not well documented, project policies with respect to treatment of certain types of coding/testing/patching patterns. Likely this ought to be moved at some point to postgresql.org, but for the moment we will capture knowledge here. In no particular order:

  • Null pointer dereference crashes with no other discernible consequences are not considered security issues and should be fixed as routine bugs.
  • All backend-side global variables should be marked with PGDLLIMPORT, so that they are available to extensions on Windows.
  • Do not leave global objects (roles, subscriptions, tablespaces) behind after a regression test run, since that would clutter the installation in case of using installcheck mode. However, this rule doesn't apply to TAP tests, which always create temporary installations.
  • Transient roles created by regression test cases should be named regress_something, to reduce the risks of running such cases against installed servers. Likewise for subscriptions and tablespaces. For databases, the rule is different for historical reasons: the DB name must include regression. Compile with -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS to get automatic checking of this rule (and note that some buildfarm animals do that).
  • We want to keep recently-out-of-support (currently, back to 9.2) branches buildable on modern systems. That extends to fixing build failures, regression-test failures, and easy-to-fix compiler warnings. But do not change the external behavior of out-of-support branches. The point of this rule is to allow old branches to be built on modern systems for purposes of back-testing psql, pg_dump, etc against old servers; it would do little good to test against behavior that's different from the last released version of the branch. The point of allowing warning fixes is so that people building these branches needn't expend brain cells re-verifying that it's safe to ignore the warnings.
  • Internal errors with SQLSTATE XX000 (e.g., plain elog) should not be triggerable from SQL. Always add an errcode() call if you think the error could be reachable in normal operation.
  • Refer to src/test/perl/README as a guide for writing TAP tests.
  • All "*.h" files should pass src/tools/pginclude/headerscheck, in both regular and --cplusplus modes. See the README file beside it for details.
  • Ignore '\r' in text files (such as configuration files) unconditionally, because people sometimes try to use files with DOS-style newlines on Unix machines, where the C library won't hide that from us.
  • Every "*.c" file should start by including postgres.h, postgres_fe.h, or c.h as appropriate; and then there is no need for any "*.h" file to explicitly include any of these. The core reason for this policy is to make it easy to verify that pg_config_os.h is included before any system headers such as <stdio.h>; without that, we have portability issues on some platforms due to variation in largefile options across different modules in the backend. Also, if "*.h" files were responsible for choosing which of these key headers to include, "*.h" files that need to be includable in either frontend or backend compiles would be in trouble.
  • By convention, "*.c" files should include system and external-library headers after postgres.h/postgres_fe.h/c.h and before other Postgres headers.
  • Do not add the output of EXPLAIN statements to the regression tests without using COSTS OFF or filtering the output through a function that will hide exact costs and rowcounts. Failing to do this usually results in platform-dependent failures.
  • Datatype I/O functions must not be marked volatile. (Presently, this is checked by the regression tests for core datatypes, but not for contrib.)

Release freezes

Please be aware of upcoming release dates, and avoid committing into branches on which a release is scheduled, starting at noon UTC on the Saturday before the release date and extending until the release tags have been applied (which usually happens before midnight UTC on the Tuesday). This "quiet time" allows for full buildfarm testing before tarballs are made, and for correcting any packager trouble reports after the initial wrap. Security and release-note patches of course break this rule; otherwise, don't do it without consulting the release team.

Releases of supported back branches normally happen according to this schedule. Dates for beta, RC, and GA releases of a new branch are announced by the Release Management Team. Note that there is no freeze on the master branch unless a beta release is due to be made from it.