PostgreSQL Buildfarm Howto

From PostgreSQL wiki

Revision as of 16:25, 11 August 2014 by Tgl (Talk | contribs)

PostgreSQL BuildFarm is a distributed build system designed to detect build failures on a large collection of platforms and configurations. The software is written in Perl. If you're not comfortable with Perl then you probably don't want to run this, even though the only file you should ever need to adjust is the config file (which is also Perl).

Get the Software

Download the software from the buildfarm server. Unpack it and put it somewhere. You can put the config file in a different place from the run_build.pl script if you want to, but the simplest thing is to put it in the same place. Decide which user you will run the script as - it must be a user who can run the PostgreSQL server programs (on Unix that means it must *not* be root). Do everything else here as that user.

Other Prerequisites

Git
Must be version 1.6 or later.
All tools required for building Postgres from a Git checkout
GNU make, bison, flex, etc
See the Postgres documentation
ccache
This isn't absolutely necessary, but it greatly reduces the amount of CPU your buildfarm member will consume ... at the price of more disk space usage
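
A quick sanity check of the prerequisites above could look something like the following shell sketch. The function names `version_at_least` and `missing_tools` are made up for this example and are not part of the buildfarm client. The tool list in the closing comment is an assumption based on the list above, and GNU make may be installed as `gmake` on some systems.

```shell
#!/bin/sh
# Succeed if dotted version $1 is at least dotted version $2.
# Only the first two numeric components (major.minor) are compared.
version_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Print the name of every listed tool that is not found on the current PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example use (run as the buildfarm user, with the PATH cron will see):
#   missing_tools git make bison flex perl ccache
#   version_at_least "$(git --version | awk '{print $3}')" 1.6 || echo "git too old"
```

Anything printed by `missing_tools` needs to be installed or added to the PATH before going further.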

Choose a setup for a base git mirror that all your branches will pull from.

Most buildfarm members run on more than one branch, and if yours does it's good practice to set up a mirror on the buildfarm machine and then just clone that for each branch. The official publicly available git repository is hosted at git.postgresql.org, and there is a mirror on github; the clone URLs for both appear in the examples below. Either should be suitable for cloning.

The simplest way to set up a mirror is simply to have the buildfarm script create and maintain it for you. If you do that, the mirror will be updated at the start of a run when it checks to see if any changes have occurred that might require a new build. To do that, all you need to do is set the following two options in your config file:

 git_keep_mirror => 'true',
 git_ignore_mirror_failure => 'true',

If you would rather clone the github mirror for your local mirror instead of the authoritative community repo (doing so can keep load off the community server, which is a good thing), then set the config variable to point to it like this:

 scmrepo => 'git://github.com/postgres/postgres.git',

The mirror will be placed in your build root, above the branch directories.

You can also opt to create and maintain a git mirror yourself, something like this:

 git clone --mirror git://git.postgresql.org/git/postgresql.git pgsql-base.git

When that is done, add an entry to your crontab to keep it up to date, something like:

 20,50 * * * * cd /path/to/pgsql-base.git && git fetch -q

One downside of doing this is that your mirror will only be as up to date as the last time you ran the cron update.

To have your buildfarm installation use a local mirror you maintain yourself, set the config variable:

 scmrepo => '/path/to/pgsql-base.git',

Of course, in this case you don't set the git_keep_mirror option.

Create a directory where builds will run.

This should be dedicated to the use of the build farm. Make sure there's plenty of space - on my machine each branch can use up to about 700MB during a build. You can use the directory where the script lives, or a subdirectory of it, or a completely different directory.

If you're using ccache, the cache directory can use up to 1GB by default. You can reduce that if you like (see the ccache documentation), but it's good to allow at least 100MB per active branch.
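
To turn those figures into a rough disk budget, a small calculation like this may help. It is only a sketch: `estimate_disk_mb` is a made-up helper, the defaults are the approximate numbers quoted above (about 700MB per branch and a 1GB ccache limit), and it assumes the worst case where every branch directory holds its peak usage at the same time.

```shell
#!/bin/sh
# Print an estimated upper bound on disk usage, in MB.
#   $1 = number of branches built
#   $2 = peak MB used per branch (default 700, the rough figure quoted above)
#   $3 = ccache size limit in MB (default 1024, i.e. the 1GB default)
estimate_disk_mb() {
    branches=$1
    per_branch_mb=${2:-700}
    ccache_mb=${3:-1024}
    echo $(( branches * per_branch_mb + ccache_mb ))
}

# Example: five branches with the default figures.
#   estimate_disk_mb 5
```

Real usage depends on platform and configure options, so treat the result as a planning number rather than a measurement.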

Edit the build-farm.conf file

Notable things you probably need to set include the following:

 %conf

scmrepo
Set this to indicate the path to your Git mirror
scm_url
If you are not using the Community git repository, or want to point the changesets at a different server, set this URL to indicate where to find a given Git commit on the web. For instance, for the github mirror, this value should be: http://github.com/postgres/postgres.git/commit/ - don't forget the trailing "/".

Once you have registered your Buildfarm animal you will need to set these, but for initial testing just leave them as-is:

animal
This will need to be set to the animal name you were given by the Buildfarm coordinators
secret
This must be the password indicated by the Buildfarm coordinators

Adjust other config variables "make", "config_opts", and (if you don't use ccache) "config_env" to suit your environment, and to choose which optional Postgres configuration options you want to build with.

You should not need to adjust other variables.

You may verify that you didn't screw things up too badly by running "perl -cw build-farm.conf". That verifies that the configuration is still legitimate Perl.

Change the shebang line in the run_build script.

If the path to your perl installation isn't "/usr/bin/perl", edit the #! line in run_build.pl so it is correct. This is the ONLY line in that file you should ever need to edit.

Check that required perl modules are present.

Run "perl -cw run_build.pl". If you get errors about missing perl modules you will need to install them. Most of the required modules are standard modules in any perl distribution. The rest are all standard CPAN modules, and available either from there or from your OS distribution. When you don't get an error any more, run the same test on run_web_txn.pl, and also on run_branches.pl if you plan to use that (see below). When all is clear you are ready to start testing.
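
The same check can be run over all the scripts in one go. This is a minimal sketch, not part of the buildfarm client: `check_perl_syntax` is a hypothetical helper, and the `PERL` variable is an assumption added here so that a non-default perl can be selected.

```shell
#!/bin/sh
# Run "perl -cw" on each file given and report the ones that fail to compile.
# Perl's own diagnostics (including missing-module errors) go to stderr.
# Set PERL to use an interpreter other than the first "perl" on the PATH.
check_perl_syntax() {
    failed=0
    for script in "$@"; do
        if ! "${PERL:-perl}" -cw "$script"; then
            echo "syntax check failed: $script"
            failed=1
        fi
    done
    return "$failed"
}

# Example use, from the directory holding the scripts:
#   check_perl_syntax build-farm.conf run_build.pl run_web_txn.pl run_branches.pl
```

The function returns non-zero if any file failed, so it can also be used as a guard before installing a cron job.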

Run in test mode.

With a PATH that matches what you will have when running from cron, run the script in no-send, no-status, verbose mode. Something like this:

 ./run_build.pl --nosend --nostatus --verbose

and watch the fun begin. If this results in failures because it can't find some executables (especially gmake and git), you might need to change the config file again, this time adding a setting to "build_env", something like:

 PATH => "/usr/local/bin:$ENV{PATH}",

Also, if you put the config file somewhere else, you will need to use the --config=/path/to/build-farm.conf option.

If trying to diagnose problems, interesting summary information may be found in the file web-txn.data, which is found in a build-specific directory, of the form $build_root/$CURRENT_BRANCH/$animal.lastrun-logs/web-txn.data

If particular steps of a build failed, logs for those steps may be found in that same directory.
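
For convenience, the log location can be assembled from its three parts. The helper below is only an illustration: `web_txn_path` is a made-up name, and the build root, branch and animal values in the example are placeholders to be replaced with your own.

```shell
#!/bin/sh
# Print the path of web-txn.data for a given build root, branch and animal,
# following the $build_root/$CURRENT_BRANCH/$animal.lastrun-logs/ layout.
web_txn_path() {
    printf '%s/%s/%s.lastrun-logs/web-txn.data\n' "$1" "$2" "$3"
}

# Example use (placeholder values):
#   less "$(web_txn_path /home/bf/buildroot HEAD myanimal)"
#   ls "$(dirname "$(web_txn_path /home/bf/buildroot HEAD myanimal)")"
```

The second example lists the per-step logs that sit alongside web-txn.data.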

Test running from cron

When you have that running, it's time to try with cron. Put a line in your crontab that looks something like this:

 43 * * * * cd /location/of/run_build.pl/ && ./run_build.pl --nosend --verbose

Again, add the --config option if needed. Notice that this time we didn't specify --nostatus. That means that (after the first run) the script won't do any build work unless the Git repo has changed. Check that your cron job runs (it should email you the results, unless you tell it to send them elsewhere).

You can, and probably should, drop the --verbose option once things are working.

The frequency with which the cron job is launched is up to you, though we do suggest that active branches get built at least once a day. The build script will automatically exit if it finds a previous invocation still running, so you do not need to worry about scheduling jobs too close together. Think of the cron frequency as how often the buildfarm animal will wake up to see if there have been changes in the Git repo.

Choose which branches you want to build

By default run_build.pl builds the HEAD branch. If you want to build some other branch, you can do so by specifying the name on the command line, e.g.

 run_build.pl REL9_4_STABLE

The old way to build multiple branches was to create a cron job for each active branch, along the lines of:

 6 * * * * cd /home/andrew/buildfarm && ./run_build.pl --nosend
 30 4 * * * cd /home/andrew/buildfarm && ./run_build.pl --nosend REL9_4_STABLE

But there's a better way ...

Using run_branches.pl

There is a wrapper script that makes running multiple branches much easier. To build all the branches that are currently being maintained by the project, uncomment this line in the config file:

 # $conf{branches_to_build} = 'ALL'; # or [qw( HEAD RELx_y_STABLE etc )]

and instead of running run_build.pl, use run_branches.pl with the --run-all option. This script accepts all the options that run_build.pl does, and passes them through. So now your crontab could just look like this:

 6 * * * * cd /home/andrew/buildfarm && ./run_branches.pl --run-all

One of the advantages of this approach is that you don't need to manually retire a branch when the Postgres project ends support for it, nor to add one when there's a new stable branch. The script contacts the server to get a list of branches that we're currently interested in, and then builds them. This is now the recommended method of running a buildfarm member.

If you don't want to build every one of the back branches, you can also use HEAD_PLUS_LATEST, or HEAD_PLUS_LATESTn for any n, in the $conf{branches_to_build} setting.

Register your new buildfarm member.

Once this is all running happily, you can register to upload your results to the central server. Registration can be done on the buildfarm server at http://www.pgbuildfarm.org/cgi-bin/register-form.pl. When you receive your approval by email, you will edit the "animal" and "secret" lines in your config file, remove the --nosend flags, and you are done.

Please also join the pgbuildfarm-members mailing list, a low-traffic list for owners of buildfarm members: http://pgfoundry.org/mail/?group_id=1000040

Bugs

Please file bug reports concerning the buildfarm script (but not Postgres itself) on the tracker at pgFoundry.

What if you can't use git for some reason?

You can still run in CVS mode against a git-cvs gateway. There is one available for the master (aka HEAD) and REL9_0_STABLE branches.

You will need to be on release 4.2 or later of the buildfarm client, and have the following settings in your config file:

 scm => 'cvs',
 scmrepo => ':pserver:anonymous@git.postgresql.org:/postgresql.git',
 use_git_cvsserver => 'true',

You will also need to do this, once:

 cvs -d :pserver:anonymous@git.postgresql.org:/postgresql.git login

An empty password will do.

Using this mode of running is a fall-back. If you can use git you should.

Running on Windows

There are three build environments for Windows: Cygwin, MinGW/MSys, and Microsoft Visual C++. The buildfarm can run with each of these environments. This section discusses requirements for the buildfarm, rather than requirements for building on Windows, which are covered elsewhere.

Cygwin

There is almost nothing extra to be done for Cygwin. You need to make sure that cygserver is running, and you should set MAX_CONNECTIONS=>3 and CYGWIN=>'server' in the build_env stanza of the buildfarm config. Other than that it should be just like running on Unix.

MinGW/MSys

For MinGW/MSys, you need both the MSys DTK version of perl installed, and a native Windows perl - I have only tested with ActiveState perl, which I have found to be rock solid. You need to run the main buildfarm script using the MSys DTK perl, and the web transaction script using native perl. That means you need to change the first line of the run_web_txn.pl script so it reads something like:

 #!/c/perl/bin/perl

You should make sure that the PATH is set in your config file to put the native perl ahead of the MSys DTK perl. It's a good idea to have a runbf.bat file that you can call from the Windows scheduler. Mine looks like this:

 @echo off
 setlocal
 c:
 cd \msys\1.0\bin
 c:\msys\1.0\bin\sh.exe --login -c "cd bf && ./run_build.pl --verbose %1 >> bftask.out 2>&1"

Set up a non-privileged Windows user to run these jobs as. Set up the buildfarm as above as that user. Then create scheduler jobs that call runbf.bat with an optional branch name argument.

Microsoft Visual C++

For MSVC you need to edit the config file more extensively. Make sure the 'using_msvc' setting is on. Also, there is a section of the file specially for MSVC builds. As with MinGW, you need a native Windows perl installed. It appears that Windows Git does not like to clone local repositories specified with forward slashes (this is pretty horrible - almost all Windows programs are quite happy with forward slashes). Make sure you specify the repository using backslashes or weird things will happen. Again, you will need a runbf.bat file for the Windows scheduler. Mine looks like this:

 @echo off
 c:
 cd \prog\bf
 c:\perl\bin\perl run_build.pl --verbose %1 %2 %3 %4 >> bfout.txt

You will also need a tar command capable of bundling up the logs to send to the server. The best one I have found for use on Windows is bsdtar, part of the libarchive collection at http://sourceforge.net/projects/gnuwin32/files/. This is also a good place to get many of the libraries you need for optional pieces of MSVC and MinGW builds.

Running multiple buildfarm members on a single machine

Sometimes you might want to run more than one buildfarm member on a single machine. Possible reasons for doing this include testing different compilers, and running with different build options. For example, on one FreeBSD machine I have two members; one does a normal build and the other does a build with -DCLOBBER_CACHE_ALWAYS set. Or on a Windows machine one might want to test both the 32 bit and 64 bit mingw-w64 compilers.

The simplest way to do this is to do it all in the same location. Get one member working, then copy the config file to a file named after the other member and change the animal name, the password, and whatever else in the config needs to differ from the first one. The members can share a git mirror and build root. There are locking provisions that prevent instances of the buildfarm scripts from tripping over each other. If you are using ccache, you should ensure that each member gets a separate ccache location. The best way to do that is probably to put the member name into the ccache directory name.
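
As a sketch of how this might be organised, the helpers below derive a per-member ccache directory and a per-member command line. Both function names are invented for this example, and the `<animal>.conf` file naming is an assumption; the `--config` and `--run-all` options are the ones described earlier on this page.

```shell
#!/bin/sh
# Print a ccache directory for one member: a subdirectory of the given
# base directory named after the animal, so members never share a cache.
member_ccache_dir() {
    printf '%s/%s\n' "$1" "$2"
}

# Print the command that builds all branches for one member, assuming its
# config file was saved as <animal>.conf next to the scripts.
member_run_command() {
    printf './run_branches.pl --run-all --config=%s.conf\n' "$1"
}

# Example use (placeholder animal names):
#   member_ccache_dir /home/bf/ccache firstanimal
#   member_run_command secondanimal
```

The directory printed by `member_ccache_dir` would then be the value given for the ccache location in that member's config file, and each member gets its own crontab line built from `member_run_command`.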
