Working with Git

From PostgreSQL wiki

This page collects advice on working with the PostgreSQL Git repository. There are also Other Git Repositories you might work with, most notably the official GitHub mirror, which you can fork on that site.

Getting Started

A simple way to get started might look like this:

git clone https://git.postgresql.org/git/postgresql.git
cd postgresql
git checkout -b my-cool-feature
$EDITOR
git commit -a
git diff --patience master my-cool-feature > ../my-cool-feature.patch

Note that git checkout -b my-cool-feature creates a new branch and checks it out at the same time. Typically, you would develop each feature in a separate branch.
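
The same branch-and-diff cycle can be tried without touching a real clone. The following is a minimal sketch that uses a scratch repository in place of the PostgreSQL clone; the file name, commit messages, and identity are made up for illustration.

```shell
#!/bin/sh
# Sketch only: a scratch repository stands in for the PostgreSQL clone.
set -eu
scratch=$(mktemp -d)
git init -q "$scratch/demo"
cd "$scratch/demo"
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is named master
git config user.name "Example Hacker"       # hypothetical identity
git config user.email "hacker@example.org"

echo "original line" > README
git add README
git commit -q -m "Initial commit"

# One branch per feature, created and checked out in a single step.
git checkout -q -b my-cool-feature
echo "new feature line" >> README
git commit -q -a -m "Add my cool feature"

# Everything on the feature branch that is not on master, as a patch file.
git diff --patience master my-cool-feature > ../my-cool-feature.patch

# Count the added lines in the patch that carry the new text.
grep -c '^+new feature line' ../my-cool-feature.patch
```

Because the diff is taken between the two branch names, it covers every commit on the feature branch, not just the most recent one.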

See the documentation and tutorials at https://git-scm.com/doc/ext for a more detailed Git introduction. For an even more detailed lesson, check out the Pro Git book and maybe get a hardcopy to help support the site.

You may wish to add the patterns listed on the GitExclude page to your .git/info/exclude file. Now that the master repository has been converted to Git, the standard .gitignore files should cover all build products, so you don't need most of what is listed in that reference. You might still want to exclude *~, tags, and /cscope.out, though.
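
As a rough sketch, the three patterns mentioned above could be added like this. A scratch repository is used here so the effect can be checked safely; in practice you would run only the `cat` command from the top of your PostgreSQL clone.

```shell
#!/bin/sh
# Sketch only: a scratch repository stands in for the PostgreSQL clone.
set -eu
scratch=$(mktemp -d)
git init -q "$scratch/demo"
cd "$scratch/demo"

# .git/info/exclude is local to this clone; it is never committed or pushed.
mkdir -p .git/info
cat >> .git/info/exclude <<'EOF'
*~
tags
/cscope.out
EOF

# Simulate an editor backup and index files left in the tree.
touch notes.c~ tags cscope.out

# With the excludes in place, git lists none of them as untracked.
git status --porcelain
```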

Keeping your local master branch synchronized

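First, add the origin as a remote. You only need to do this once, and not at all if you cloned as shown in Getting Started, because git clone sets up origin for you: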

git remote add origin https://git.postgresql.org/git/postgresql.git

Next, fetch from your public git repository:

git fetch origin master

Merge any new patches from your public repository:

git merge FETCH_HEAD

Merge in any changes from the main branch:

git fetch origin master
git merge FETCH_HEAD

Now check that it still compiles, passes the regression tests, etc. Make sure you've run ./configure, and then:

make check
make maintainer-clean

Assuming all that's good, do a dry run.

git push --dry-run origin master

If that's happy, push it out to your public repository.

git push origin master

If not, fix any merge failures, do another dry run, and push.
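
The fetch-and-merge step can be wrapped in a small helper. This is only a sketch: the function name sync_master is made up, and --ff-only is an added assumption that your local master carries no commits of its own. Drop that flag if it does.

```shell
# Hypothetical helper: bring the local master up to date with a remote.
# $1 is the remote name; it defaults to "origin" as used in the text above.
sync_master() {
    remote=${1:-origin}
    git checkout -q master
    git fetch -q "$remote" master
    # --ff-only (an assumption, see above) refuses to create a merge commit,
    # so an unexpected local change is reported instead of merged.
    git merge -q --ff-only FETCH_HEAD
}
```

Run `sync_master origin` from inside your clone, then continue with make check and the dry-run push as described above.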

Tracking Other Branches

Let's say you're happy tracking master, but you'd really like to track one of the other branches available at git.postgresql.org:

git remote add super-fun-branch https://git.postgresql.org/super-fun-branch.git
git fetch super-fun-branch
git checkout super-fun-branch # this will stage your remote branch for a local checkout
git checkout -b super-fun-branch-name # the name can be whatever you choose

Now you have a local branch within your local Git repository tracking a different branch's history. Most importantly, you can now push to that repository if you have to, without making a separate clone to track its history. It's pretty much impossible not to share some common history with the master branch.

Using Back Branches

Since the Git repository contains branches for each of the major versions of PostgreSQL, it's easy to work on the latest code from an older version instead of the current one. Here's how you might list the possibilities and check out an older version:

 git branch -r
 git checkout -b REL_15_STABLE origin/REL_15_STABLE

Note that if you've already checked out and used a later version, you might need to clean up some of the files left behind by it. To get rid of as many of those as possible, run:

 make maintainer-clean

You might still need to delete some leftover files after that before git will allow you to do the checkout (src/interfaces/ecpg/preproc/preproc.y can be a problem with the specific example above).
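
For scripted use, the listing and checkout steps might be wrapped as follows. This is a sketch under assumptions: the function names are made up, and the pattern REL*_STABLE is a guess at the naming scheme based on the branch names that appear on this page.

```shell
# Hypothetical helper: list the remote-tracking stable branches of origin.
list_back_branches() {
    git for-each-ref --format='%(refname:short)' 'refs/remotes/origin/REL*_STABLE'
}

# Hypothetical helper: switch to a back branch, creating the local
# tracking branch on first use.  $1 is a name such as REL_15_STABLE.
checkout_back_branch() {
    if git show-ref --verify --quiet "refs/heads/$1"; then
        git checkout -q "$1"                   # local branch already exists
    else
        git checkout -q -b "$1" "origin/$1"    # create it from the remote branch
    fi
}
```

Calling `checkout_back_branch REL_15_STABLE` a second time simply switches to the existing local branch instead of failing because the name is taken.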

Testing a patch

This is a typical setup to review a patch text file, as typically sent by e-mail:

git checkout -b feature-to-review
patch -p1 < feature.patch

If the patch fails to apply, there will be file.rej files left behind showing the part that didn't apply. If your directory tree is clean of build information, you can easily find these later using:

git status
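
If you prefer to find out whether a patch applies before anything in the tree is modified, git apply --check can serve as a dry run. The sketch below uses git apply in place of patch -p1, which is a substitution: git apply is stricter about context than patch, so a failure here does not necessarily mean patch -p1 would fail too. The function names are made up.

```shell
# Hypothetical helper: apply a patch only if it applies cleanly.
# $1 is the path to the patch file (feature.patch in the text above).
try_patch() {
    if git apply --check "$1" 2>/dev/null; then
        git apply "$1"
    else
        echo "patch does not apply cleanly: $1" >&2
        return 1
    fi
}

# Hypothetical helper: list untracked .rej files left behind by an
# earlier, partly failed patch -p1 run.
list_rejects() {
    git status --porcelain --untracked-files=all |
        sed -n 's/^?? \(.*\.rej\)$/\1/p'
}
```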

Patch cleanup

Patch diff submission works best when the author does a round of self-review of the actual patch--not just the code, but the physical diff file produced. Creating Clean Patches covers practices commonly used to produce better patch diff output.

Publishing Your Work

If you develop a feature over a longer period of time, you want to allow for intermediate review. The traditional approach to that has been emailing huge patches around. The more advanced approach that we want to try (see also Peter Eisentraut's blog entry) is that you push your Git branches to a private area on git.postgresql.org, where others can pull your work, operate on it using the familiar Git tools, and perhaps even send you improvements as Git-formatted patches. See the git.postgresql.org site for instructions on how to sign up, and how to use the repository. You may need to eventually create a patch via e-mail as part of officially Submitting a Patch.

Pushing New Branches

If you create a new branch, typically to test a new feature, you'll need to push it to git.postgresql.org.

 git push origin new_feature_branch

Note that if your repository is completely empty, not even the "master" branch exists yet, so it will need to be pushed as well.

If you are working with the PostgreSQL core code, do NOT casually make up your own branches and push them without clearing it on the pgsql-hackers list first. Generally, you want to use your private repo area instead.

Removing a Branch

Once your feature has been committed to the PostgreSQL repository, you can usually remove your local feature branch. This works as follows:

# switch to a different branch
git checkout master
git branch -D my-cool-feature

Using git hooks

Git hooks are scripts that run when certain events such as a commit or a push happen. They are placed in your .git/hooks directory. Here is a sample script that checks, when you commit, whether your code has been properly indented, and optionally re-indents it for you (if you set the PGAUTOINDENT environment variable to yes). To use this, place it in .git/hooks/pre-commit and make it executable using chmod +x .git/hooks/pre-commit.

#!/bin/sh
set -u
: ${PGAUTOINDENT:=no}

# the files in the commit
if git diff --cached --name-only --diff-filter=ACMR | grep src/tools/pgindent/typedefs.list > /dev/null; then
    # if typedefs.list is changed, we need to re-run pgindent on all files
    files='src contrib'
else
    files=$(git diff --cached --name-only --diff-filter=ACMR)
fi

check_indent () {
  # no need to filter files - pgindent ignores everything that isn't a
  # .c or .h file

  if [ "$PGAUTOINDENT" = yes ] ; then
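    TEMPFILE=$(mktemp)
    # single quotes defer expansion until the trap fires; the inner double
    # quotes protect a temp path that contains spaces
    trap 'rm -f "$TEMPFILE"' EXIT
    # $files is deliberately unquoted so it splits into one argument per path
    if ! src/tools/pgindent/pgindent --check --diff $files > "$TEMPFILE"; then
      patch -p0 < "$TEMPFILE"
      echo "Commit abandoned. Rerun git add+commit to adopt pgindent changes"
      exit 1
    fi
  elif ! src/tools/pgindent/pgindent --check $files; then
    echo 'You need a pgindent run, e.g:'
    printf 'src/tools/pgindent/pgindent '
    # quote $files here: unquoted, a multi-word value breaks the test command
    if [ "$files" = 'src contrib' ]; then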
        echo $files
    else
        echo '$(git diff --name-only --diff-filter=ACMR)'
    fi
    exit 1
  fi
  
}

# nothing to do if there are no files
test -z "$files" && exit 0
check_indent
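
The script above blocks a commit by exiting with status 1. To see that mechanism in isolation, here is a small sketch in a scratch repository. The hook body is a stand-in, not pgindent, and the ALLOW_COMMIT variable is made up for the demonstration, playing a role similar to PGAUTOINDENT.

```shell
#!/bin/sh
# Sketch only: shows that a pre-commit hook exiting non-zero aborts the commit.
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Hacker"       # hypothetical identity
git config user.email "hacker@example.org"

mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Stand-in check: refuse the commit unless ALLOW_COMMIT=yes is set.
[ "${ALLOW_COMMIT:-no}" = yes ] || { echo "pre-commit: rejected" >&2; exit 1; }
EOF
chmod +x .git/hooks/pre-commit

echo "int x;" > demo.c
git add demo.c

# First attempt: the hook exits 1, so git creates no commit.
if git commit -q -m "first try" 2>/dev/null; then
    first=committed
else
    first=blocked
fi

# Second attempt: the environment variable reaches the hook, which now passes.
ALLOW_COMMIT=yes git commit -q -m "second try"
second=$(git log --format=%s -1)

echo "first attempt: $first, last commit: $second"
```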

Working with the users/foo/postgres.git

When requesting a project at git.postgresql.org, one option is to start with a clone of the main PostgreSQL repository.

That is a very nice feature, but how do you keep it in sync with the upstream code?

One method is to clone your own repository and add a new remote to handle the syncing:

# clone your repository
git clone ssh://git@git.postgresql.org/users/foo/postgres.git my_postgres
# the remaining commands must run inside the new clone
cd my_postgres
# add a new remote
git remote add pgmaster https://git.postgresql.org/git/postgresql.git
# track some old versions
git checkout -b REL8_3_STABLE origin/REL8_3_STABLE
git checkout -b REL8_4_STABLE origin/REL8_4_STABLE
# change the remote of master and our old versions tracked
git config branch.REL8_3_STABLE.remote pgmaster
git config branch.REL8_4_STABLE.remote pgmaster
git config branch.master.remote pgmaster
# pull from postgres official git for each branch
# and finally push to origin
git checkout master
git pull
git push origin
git checkout REL8_3_STABLE
git pull
git push origin
git checkout REL8_4_STABLE
git pull
git push origin


This way, each branch is easy to keep in sync with PostgreSQL: you pull from the official repository and push to your own.

Create your own branch and work as usual. Users who have a local clone of the postgresql.git can add your branch in their repository and happily merge, just as you do.
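
The repeated checkout, pull, and push sequence above can be folded into a loop. This is a sketch, not the method the wiki prescribes: the function name is made up, the remote and branch are passed to git pull explicitly so it does not depend on the branch.*.remote settings, and --ff-only is an added assumption that these branches carry no local commits.

```shell
# Hypothetical helper: for each named branch, pull from the official
# repository (remote "pgmaster") and push the result to your own ("origin").
sync_branches() {
    for branch in "$@"; do
        git checkout -q "$branch"
        # --ff-only (an assumption) stops if the branch has diverged
        # instead of creating a merge commit.
        git pull -q --ff-only pgmaster "$branch"
        git push -q origin "$branch"
    done
}
```

With the remotes set up as above, `sync_branches master REL8_3_STABLE REL8_4_STABLE` would cover the three branches from the example.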

Using the Web Interface

Try the web interface at https://git.postgresql.org/. It offers browsing, "blame" functionality, snapshots, and other advanced features, and it is much faster than CVSweb. Even if you don't care for Git or version control systems, you will probably enjoy the web interface.

RSS Feeds

The Git service provides RSS feeds that report about commits to the repositories. Some people may find this to be an alternative to subscribing to the pgsql-committers mailing list. The URL for the RSS feed from the PostgreSQL repository is https://git.postgresql.org/gitweb/?p=postgresql.git;a=rss. Other options are available; they can be found via the home page of the web interface.

PostgreSQL Style

The PostgreSQL source uses 4-character tabs, making the output from git diff look odd. You can fix that by putting this into your .git/config file:

 [core]
   pager = less -x4
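
The same entry can be written with git config instead of editing the file by hand. The sketch below runs in a scratch repository so that it is safe to try; in your real clone only the two git config lines are needed.

```shell
#!/bin/sh
# Sketch only: a scratch repository stands in for your PostgreSQL clone.
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Writes the pager entry into the [core] section of .git/config.
git config core.pager 'less -x4'

# Print the stored value to confirm the setting.
git config --get core.pager
```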

Continuing the "rsync the CVSROOT" workflow

Aidan van Dyk published a nice tutorial on how to keep several branches using a single copy of historical objects. This is roughly equivalent to keeping several checkouts of an rsync'ed copy of CVSROOT, which is what some committers were used to doing with CVS.