Working with Git

From PostgreSQL wiki

Revision as of 18:45, 19 May 2012 by Boshomi (Talk | contribs)


This page collects various wisdom on working with the PostgreSQL Git repository. There are also Other Git Repositories you might work with, most notably the official GitHub mirror, which you can fork on that site.


Getting Started

A simple way to get started might look like this:

git clone git://git.postgresql.org/git/postgresql.git
cd postgresql
git checkout -b my-cool-feature
$EDITOR
git commit -a
git diff master my-cool-feature | filterdiff --format=context > ../my-cool-feature.patch

Note that git checkout -b my-cool-feature creates a new branch and checks it out at the same time. Typically, you would develop each feature in a separate branch.

See the documentation and tutorials at http://git.or.cz/ for an introduction to Git. For a more detailed lesson, check out http://progit.org and maybe get a hardcopy to help support the site.

You may wish to put the patterns listed at GitExclude in your .git/info/exclude file. (Now that the master repository has been converted to git, the standard .gitignore files should cover all build products, so you don't need most of what is listed in that reference. You might still want to exclude *~ though.)

PostgreSQL developers have traditionally preferred context diffs (diff -c) over unified diffs (diff -u). At least some major committers, whom you will probably have to run your patch by, strongly prefer context diffs. Bizarrely, Git doesn't easily produce context diffs. So some fiddling is needed: either use the filterdiff utility as above (from the patchutils distribution) or use the method described in Context diffs with Git below.

Keeping your local master branch synchronized

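First, add the origin as a remote. You only need to do this once, and not at all if you created the repository with git clone as above, since cloning already sets up origin: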

git remote add origin git://git.postgresql.org/git/postgresql.git

Next, fetch from your public git repository:

git fetch origin master

Merge any new patches from your public repository:

git merge FETCH_HEAD

Merge in any changes from the main branch:

git fetch origin master
git merge FETCH_HEAD

Now check that it still compiles, passes regression, etc. Make sure you've invoked ./configure, and then:

make check
make maintainer-clean

Assuming all that's good, do a dry run.

git push --dry-run origin master

If that's happy, push it out to your public repository.

git push origin master

If not, fix any merge failures, do another dry run, and push.

Tracking Other Branches

Let's say you're happy tracking master, but you'd really like to track one of the other branches at git.postgresql.org:

git remote add super-fun-branch git://git.postgresql.org/super-fun-branch.git
git fetch super-fun-branch
git checkout super-fun-branch # this will stage your remote branch for a local checkout
git checkout -b super-fun-branch-name # the name can be whatever you choose

Now you have a local branch within your local git repo tracking a different branch's history. Most importantly, you can now push to that repo if you have to, without making a separate clone to track the history. It's pretty much impossible to not share some common history with the master branch.

Using Back Branches

Since the git repository contains branches for each of the major versions of PostgreSQL, it's easy to work on the latest code from an older version instead of the current one. Here's how you might list the possibilities and check out an older version:

 git branch -r
 git checkout -b REL8_3_STABLE origin/REL8_3_STABLE

Note that if you've already checked out and used a later version, you might need to clean up some of the files left behind by it. To get rid of as many of those as possible, it's suggested to run:

 make maintainer-clean

You might still need to delete some files left behind after that before git will allow you to do the checkout (src/interfaces/ecpg/preproc/preproc.y can be a problem with the specific example above).
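To see which files are still lying around before deleting anything by hand, a dry run of git clean can help. The following is a minimal sketch, not part of the original instructions: it builds a throwaway repository (the demo directory and file names are made up) so that it can be run anywhere, and it only lists files rather than removing them.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q demo
cd demo
git config user.name "Example Hacker"
git config user.email "hacker@example.org"

# Track a source file and ignore object files, as a build tree would.
echo '*.o' > .gitignore
echo 'int main(void) { return 0; }' > main.c
git add .gitignore main.c
git commit -q -m "initial"

# Simulate a build product left behind by an earlier build.
touch main.o

# -n: dry run, -d: include directories, -x: include ignored files.
leftovers=$(git clean -n -d -x)
echo "$leftovers"
```

In a real checkout you would run only the git clean -n -d -x line. Dropping -n makes git clean actually delete the listed files, so review the list first.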

Testing a patch

This is a typical setup to review a patch text file, as typically sent by e-mail:

git checkout -b feature-to-review
patch -p1 < feature.patch

If the patch fails to apply, there will be file.rej files left behind showing the part that didn't apply. If your directory tree is clean of build information, you can easily find these later using:

git status
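If you would rather know whether a patch applies before touching the tree, git apply --check is one option. This is a sketch under assumptions: the scratch repository, notes.txt and feature.patch are invented for illustration, and git apply only understands unified diffs, so a context diff still has to go through patch as shown above.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q demo
cd demo
git config user.name "Example Hacker"
git config user.email "hacker@example.org"

# Commit a file, then record an edit to it as a unified diff.
printf 'one\ntwo\nthree\n' > notes.txt
git add notes.txt
git commit -q -m "initial"
printf 'one\n2\nthree\n' > notes.txt
git diff --no-ext-diff > ../feature.patch
git checkout -q -- notes.txt      # back to the committed version

# Review branch, as in the text above.
git checkout -q -b feature-to-review

# --check reports whether the patch would apply; it changes nothing.
if git apply --check ../feature.patch; then
    check_result=applies
    git apply ../feature.patch
else
    check_result=conflicts
fi
echo "patch $check_result"
```

Unlike patch, git apply rejects the whole patch by default when any hunk fails, so no .rej files are left behind unless you ask for them with --reject.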

Context diffs with Git

Copy git-external-diff into libexec/git-core/ of your git installation and configure git to use that wrapper with:

git config [--global] diff.external git-external-diff

--global makes the configuration global for your user - otherwise it is just configured for the current repository.

For every command which displays diffs in some way, you can use the parameter "--[no-]ext-diff" to enable or disable the external diff command.

For the git diff command, --ext-diff is enabled by default; for any other command, such as git log -p or git format-patch, it is not!

This method should work on all platforms supported by git.


If you don't want to configure the external wrapper permanently, or you want to override it, you can also invoke git like this:

export GIT_EXTERNAL_DIFF=git-external-diff
git diff --[no-]ext-diff

Options to diff

When using the above wrapper, you can override the options passed to diff by setting the environment variable DIFF_OPTS; it defaults to -pcd. Also, the PostgreSQL source uses 4-column tab spacing (see also [1]).

export DIFF_OPTS=-pcd
export LESS=-x4
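To make the mechanism less opaque, here is a rough sketch of what such a wrapper does. It is not the actual git-external-diff script: the name context-diff-wrapper is hypothetical, and the fallback is plain -c rather than -pcd, chosen here only for portability. Git invokes an external diff program with seven arguments (path, old file, old hash, old mode, new file, new hash, new mode), and the wrapper hands the two files to diff.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"

# Hypothetical stand-in for git-external-diff. Git passes seven arguments:
#   path old-file old-hex old-mode new-file new-hex new-mode
cat > context-diff-wrapper <<'EOF'
#!/bin/sh
# DIFF_OPTS overrides the diff options; this sketch falls back to plain -c.
diff ${DIFF_OPTS:--c} "$2" "$5"
# diff exits 1 when the files differ, which git would treat as a failure.
exit 0
EOF
chmod +x context-diff-wrapper

git init -q demo
cd demo
git config user.name "Example Hacker"
git config user.email "hacker@example.org"
printf 'one\ntwo\nthree\n' > notes.txt
git add notes.txt
git commit -q -m "initial"
printf 'one\n2\nthree\n' > notes.txt

# Use the wrapper for this one command only.
diff_output=$(GIT_EXTERNAL_DIFF="$scratch/context-diff-wrapper" git diff)
printf '%s\n' "$diff_output"
```

The output is a context diff, with changed lines marked by a leading "! ". The file names in the header are the temporary files git hands to the wrapper, which is one of the details a production wrapper would want to tidy up.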

Patch cleanup

Patch diff submission works best when the author does a round of self-review of the actual patch--not just the code, but the physical diff file produced. Creating Clean Patches covers practices commonly used to produce better patch diff output.

Publishing Your Work

If you develop a feature over a longer period of time, you want to allow for intermediate review. The traditional approach to that has been emailing huge patches around. The more advanced approach that we want to try (see also Peter Eisentraut's blog entry) is that you push your Git branches to a private area on git.postgresql.org, where others can pull your work, operate on it using the familiar Git tools, and perhaps even send you improvements as Git-formatted patches. See the git.postgresql.org site for instructions on how to sign up, and how to use the repository. You may eventually still need to send a patch via e-mail as part of officially Submitting a Patch.

Pushing New Branches

If you create a new branch, generally for a new feature test, you'll need to push it to git.postgresql.org.

 git push origin new_feature_branch

Note that if you have a completely blank repository (such as a new repo for a pgfoundry project), not even the "master" branch will exist there yet, so it needs to be pushed as well.
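The blank-repository case can be tried locally. In this sketch a bare repository on disk (blank.git, a made-up name) stands in for the freshly created hosted repository; against the real server only the two git push lines would be needed.

```shell
set -eu
export GIT_AUTHOR_NAME="Example Hacker" GIT_AUTHOR_EMAIL="hacker@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
scratch=$(mktemp -d)
cd "$scratch"

# Stand-in for a freshly created, empty hosted repository.
git init -q --bare blank.git

# Local working repository with one commit on master.
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
echo hello > README
git add README
git commit -q -m "initial"
git remote add origin "$scratch/blank.git"

# The remote has no branches yet, so master is pushed explicitly.
git push -q origin master

# A new feature branch is pushed the same way.
git checkout -q -b new_feature_branch
git push -q origin new_feature_branch

remote_branches=$(git ls-remote --heads origin)
echo "$remote_branches"
```

git ls-remote --heads lists the branches that now exist on the remote, which is a quick way to confirm that both pushes arrived.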

If you are working with the postgresql core code, do NOT casually make up your own branches and push them, without clearing it on the pgsql-hackers list first. Generally, you want to use your private repo area instead.

Removing a Branch

Once your feature has been committed to the PostgreSQL repository, you can usually remove your local feature branch. This works as follows:

# switch to a different branch
git checkout master
git branch -D my-cool-feature
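Note that -D deletes the branch unconditionally. A more cautious variant, sketched below on a throwaway repository, checks with git branch --merged first and uses the lowercase -d, which refuses to delete a branch containing commits that are not in the current branch. If your patch was applied upstream as a separate commit rather than by merging your branch, -d will refuse, and -D as shown above is then what you want.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Hacker"
git config user.email "hacker@example.org"
echo base > notes.txt
git add notes.txt
git commit -q -m "initial"

# A feature branch whose single commit is then merged into master.
git checkout -q -b my-cool-feature
echo feature >> notes.txt
git commit -q -a -m "cool feature"
git checkout -q master
git merge -q my-cool-feature

# List branches already contained in master, then delete with the safe -d.
git branch --merged master
git branch -d my-cool-feature
remaining=$(git branch --list my-cool-feature)
```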

Working with users/foo/postgres.git

One option while requesting a project at git.postgresql.org is to have a clone of the main postgresql repository.

That is a very nice feature, but how do you sync the upstream code?

One method is to make a local clone of your own repository and add a new remote to handle the syncing:

# clone your repository and enter the working copy
git clone ssh://git@git.postgresql.org/users/foo/postgres.git my_postgres
cd my_postgres
# add a new remote
git remote add pgmaster git://git.postgresql.org/git/postgresql.git
# track some old versions
git checkout -b REL8_3_STABLE origin/REL8_3_STABLE
git checkout -b REL8_4_STABLE origin/REL8_4_STABLE
# change the remote of master and our old versions tracked
git config branch.REL8_3_STABLE.remote pgmaster
git config branch.REL8_4_STABLE.remote pgmaster
git config branch.master.remote pgmaster
# pull from postgres official git for each branch
# and finally push to origin
git checkout master
git pull
git push origin
git checkout REL8_3_STABLE
git pull
git push origin
git checkout REL8_4_STABLE
git pull
git push origin


This way, each branch is easy to keep in sync: you pull from the official repository and push to your own.

Create your own branch and work as usual. Users who have a local clone of the postgresql.git can add your branch in their repository and happily merge, just as you do.
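From the other developer's side, that might look roughly like the following. This is a sketch with local directories standing in for the real repositories: official plays the role of postgresql.git, yours plays users/foo/postgres.git, and the remote name foo and branch name my-cool-feature are arbitrary choices.

```shell
set -eu
export GIT_AUTHOR_NAME="Example Hacker" GIT_AUTHOR_EMAIL="hacker@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
scratch=$(mktemp -d)
cd "$scratch"

# Stand-in for the official repository.
git init -q official
(cd official && echo base > notes.txt &&
    git add notes.txt && git commit -q -m "initial")

# Stand-in for users/foo/postgres.git, with a feature branch in it.
git clone -q official yours
(cd yours && git checkout -q -b my-cool-feature &&
    echo feature >> notes.txt && git commit -q -a -m "cool feature")

# Another developer's clone of the official repository.
git clone -q official reviewer
cd reviewer
git remote add foo "$scratch/yours"   # "foo" is an arbitrary remote name
git fetch -q foo
git merge -q foo/my-cool-feature

merged_tip=$(git rev-parse HEAD)
feature_tip=$(git rev-parse foo/my-cool-feature)
```

Because both clones share the official history, the fetch only transfers the commits that are unique to the feature branch.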

Using the Web Interface

Try the web interface at http://git.postgresql.org/. It offers browsing, "blame" functionality, snapshots, and other advanced features, and it is much faster than CVSweb. Even if you don't care for Git or version control systems, you will probably enjoy the web interface.

RSS Feeds

The Git service provides RSS feeds that report about commits to the repositories. Some people may find this to be an alternative to subscribing to the pgsql-committers mailing list. The URL for the RSS feed from the PostgreSQL repository is http://git.postgresql.org/?p=postgresql.git;a=rss. Other options are available; they can be found via the home page of the web interface.

PostgreSQL Style

The PostgreSQL source uses 4-character tabs, which makes the output from git diff look odd. You can fix that by putting this into your .git/config file:

 [core]
   pager = less -x4
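The same entry can be written with git config instead of editing the file by hand. The sketch below uses a scratch repository (the demo directory is made up) only so that it is self-contained; in practice you would run the git config line inside your PostgreSQL clone.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q demo
cd demo

# Writes the [core] pager entry into this repository's .git/config.
git config core.pager 'less -x4'

# Read the value back to confirm it was stored.
pager_setting=$(git config --get core.pager)
echo "$pager_setting"
```

Adding --global to the git config call would apply the setting to all of your repositories, which may not be what you want if other projects use different tab widths.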

Continuing the "rsync the CVSROOT" workflow

Aidan van Dyk published a nice tutorial on how to keep several branches using a single copy of historical objects. This is roughly equivalent to keeping several checkouts of an rsync'ed copy of CVSROOT, which is what some committers were used to doing with CVS.
