Cfbot

Frequently Asked Questions

What is this?

In short, Cfbot is a bridge between traditional mailing list-based open source software development and a modern continuous integration system.

PostgreSQL is developed on the pgsql-hackers mailing list. Email threads that contain patches proposed for commit are registered in a Commitfest entry, analogous in some ways to a "pull request". Cfbot checks all registered patches for bitrot and triggers continuous integration testing by maintaining its own set of branches on GitHub.

Can I have CI without Cfbot?

Yes! As you can see here, any GitHub account that has Cirrus CI enabled (free for open source projects) will trigger CI testing on every commit (see the instructions in src/tools/ci/README). Other public git hosts and CI systems could also be supported, if someone does the work to propose the control files to make that happen. Cfbot is just a way to do the same for email threads and to tabulate the results in one place, which makes it easier to keep track of the state of 200-300 active patches at a time.

How does this relate to the "build farm"?

The build farm tests PostgreSQL on ~10 OSes in a wide variety of configurations, on machines maintained by anyone who wants to add one. It covers the master branch and the currently supported release branches, but only *after* patches are committed and become part of PostgreSQL. We strive to keep the build farm green, not least because it's a shared resource, so when it's broken, it affects all developers. It's usually where we first learn whether we've broken something on the rare closed source OSes, AIX and Solaris, that cloud-based CI doesn't support (though we could add them, if someone who owns such a system were prepared to set up a persistent worker and burn CPU all day long).

Cfbot tests patches on 4 OSes (could be expanded to ~7 by adding more open source OSes) when they're proposed, before they've made it into the tree. So it comes earlier in the development life cycle of a patch, and it is not a shared resource.

Before even proposing a patch, the same testing can be done in your own GitHub account, as mentioned above.

Who is it for?

Individual developers can look at their personal patch list to watch out for bitrot and failures.

Reviewers, Commitfest managers and committers can look at the whole result table to find things to investigate.

Why not use Github pull requests?

This suggestion has so far not had much success in the community, and is a bigger topic than the author of this humble tool can address. Cfbot is an attempt to give similar benefits while maintaining the traditional email-based workflow and Commitfest system.

Which threads does it look for patches in?

The most recently active thread associated with Commitfest submissions in non-final states in the current Commitfest is scanned for patches, and the test results are reported on the main page. Threads registered in the following Commitfest are scanned too, and they show up on the next page. Individual contributor pages, visible by clicking on a name, show submissions from the current and next Commitfests, mixed together.

Which attachments are considered to be patches?

First, it ignores emails unless they have a .diff[.gz], .patch[.gz], .tar.gz, .tar.bz2 or .tgz attachment. It can be useful to post an alternative or unrelated patch with a .txt ending if it is not intended to replace the main proposed patch. (Code: interesting emails are noticed here.)

When it sees an interesting email, it first decompresses and expands the attachments as required, and then applies the .diff and .patch files after sorting them by name, so that 0001, 0002 etc prefixes are respected. Everything else is ignored, so if you send an email with attachments 0001-foo.patch, 0002-bar.patch and 0001-baz.txt, only the first two are applied. This can be useful for extra material that might be a patch but shouldn't be considered part of the submission. (Code: interesting patches are applied here.)
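The selection rules above can be illustrated with a minimal Python sketch. This is not Cfbot's actual code: the function names `is_interesting_email` and `patches_to_apply` are hypothetical, and the sketch assumes that the file names are already available as plain strings (after any decompression or archive expansion).

```python
import re

# Attachment endings that make an email "interesting", taken from the text
# above: .diff[.gz], .patch[.gz], .tar.gz, .tar.bz2, .tgz
INTERESTING = re.compile(r"\.(diff|patch)(\.gz)?$|\.tar\.gz$|\.tar\.bz2$|\.tgz$")

# Only files with these endings are applied; everything else is ignored.
APPLICABLE = (".diff", ".patch")


def is_interesting_email(attachment_names):
    """Return True if at least one attachment has a recognised ending."""
    return any(INTERESTING.search(name) for name in attachment_names)


def patches_to_apply(file_names):
    """Return the .diff/.patch files sorted by name, so that 0001, 0002
    etc prefixes determine the order of application."""
    return sorted(name for name in file_names if name.endswith(APPLICABLE))


names = ["0002-bar.patch", "0001-baz.txt", "0001-foo.patch"]
print(is_interesting_email(names))  # True
print(patches_to_apply(names))      # ['0001-foo.patch', '0002-bar.patch']
```

The .txt file is skipped in both steps, which matches the advice to use a .txt ending for material that should not be treated as part of the submission.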

Each email should contain a self-sufficient set of patches to be applied to the current master branch; there is no way to say "this is incremental on top of the patches already attached to some other email".

Patches are applied with GNU patch for now. It might be better to use git am at some point, but the goal was to support existing practices, and many contributors post raw diffs rather than git format-patch files.

When are patches tested?

Whenever a new patch is posted as an attachment (within a few minutes typically), and also periodically to check against changes in the master branch. The period is currently 48 hours, but this is subject to change depending on resource consumption. (Code: deciding which patch to process next happens here.)
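As a rough illustration of this rule, the following Python sketch decides whether a submission is due for testing. It is an assumption-laden sketch rather than Cfbot's real scheduling logic: the function `needs_test` and its parameters are hypothetical, and only the 48 hour period comes from the text above.

```python
from datetime import datetime, timedelta

# Retest period from the text above; subject to change.
RETEST_PERIOD = timedelta(hours=48)


def needs_test(last_tested, latest_patch_posted, now):
    """Decide whether a submission should be tested (hypothetical helper).

    last_tested is None if the submission has never been tested.
    """
    if last_tested is None:
        return True  # never tested before
    if latest_patch_posted > last_tested:
        return True  # a new patch was posted since the last run
    # Otherwise retest periodically to catch bitrot against master.
    return now - last_tested >= RETEST_PERIOD


now = datetime(2024, 1, 10, 12, 0)
print(needs_test(datetime(2024, 1, 10, 0, 0), datetime(2024, 1, 9), now))  # False
print(needs_test(datetime(2024, 1, 8, 12, 0), datetime(2024, 1, 1), now))  # True
```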

What do the icons mean?

A hollow checkmark (tick) or cross means that the result has not changed recently. A solid checkmark or cross means that the result is a change; for example, it was previously failing, and now passing, or vice versa. The "alt" text (visible in most browsers by hovering with a mouse) shows the name and status of a "task". If you click on the icon, Cirrus shows the output and artifacts such as logs and regression test diffs.
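The icon rule can be summarised in a few lines of Python. This sketch is illustrative only: `icon_for` is a hypothetical name, "changed recently" is simplified to "differs from the immediately preceding result", and a task with no previous result is assumed to get a hollow icon.

```python
def icon_for(previous_passed, current_passed):
    """Describe the icon for a task result (hypothetical helper).

    previous_passed is None when there is no earlier result to compare with;
    in that case this sketch assumes the result counts as unchanged.
    """
    shape = "checkmark" if current_passed else "cross"
    changed = previous_passed is not None and previous_passed != current_passed
    style = "solid" if changed else "hollow"
    return f"{style} {shape}"


print(icon_for(False, True))  # solid checkmark
print(icon_for(True, True))   # hollow checkmark
```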

Why are there multiple icons for each patch?

These are the results of separate "tasks" (Cirrus CI terminology). Broadly, they correspond to different operating systems that we test on, concurrently. The tasks are not defined by Cfbot (though in earlier versions they were, before .cirrus.yml was promoted to being part of the PostgreSQL source tree for everyone to use and maintain) and may change, and can even be expanded as part of an individual patch submission, but here's a list as of the time of writing:

  • SanityCheck: a quick smoke test to avoid starting all the other longer tasks if they'll certainly fail
  • FreeBSD
  • Linux
  • Windows
  • macOS
  • CompilerWarnings: check for warnings with both GCC and Clang, with asserts and without, and try to build the documentation.

Why Cirrus CI?

There are many competitors in the CI space and Cfbot did previously use a couple of others, but at the time of writing Cirrus is the only system we can find with these properties:

  • allows test results to be viewed by anyone without some kind of account
    • users who don't want to open an account on a commercial service benefit from CI merely by sending patches to our mailing list
    • URLs can be shared in our mailing lists for all to read, as part of public discussions
  • has resource limits high enough to handle testing ~300 branches every couple of days
  • has support for a good range of operating systems and raw virtual machines, not just Linux and not just containers
  • has super responsive, proactive and helpful support staff

What are the future plans for Cfbot?

In no particular order:

  • Integrate with commitfest.postgresql.org (ie show results there)
  • Become webhook based (commitfest pokes cfbot with new patch, cfbot pushes branch, cirrus pokes cfbot with results, cfbot pokes commitfest with results), rather than the current polling/webscraping/cron system
  • Migrate from Thomas's little cloud machine to postgresql.org infrastructure (also means migrating from FreeBSD to Debian)
  • Use a separate Cirrus task to apply patches, instead of the current scheme (applying inside a FreeBSD jail, a bit too magical and won't work on Debian)

How do I change/improve the tests?

The tests themselves are part of the PostgreSQL tree. See the documentation.

To improve the way CI invokes the tests, modify .cirrus.yml, test in your own GitHub account, and propose the changes on the pgsql-hackers list.

Who do I contact?

If you would like to discuss CI for PostgreSQL in general, use the pgsql-hackers mailing list. For issues relating to the way Cfbot makes branches or collects results, please contact thomas.munro-at-gmail.com, or alternatively discuss on the mailing list and CC Thomas.

See also