So, you want to be a developer?

From PostgreSQL wiki

by Selena Deckelmann

This document is a guide for brand new developers who want to contribute to PostgreSQL but are unsure how to get started or what "the right way" to get involved is. Feedback is welcome, as are additional links to important documents, examples, tutorials and personal stories about contributing to the project.

We also have a Developer FAQ.

How to get started

Overview

Contributing to core PostgreSQL requires a few basic development tools: git, a C development environment, and Perl. Most modern Linux and BSD operating systems come with "-devel" packages usable for your development needs. At a very high level, you will:

  • Get the basic tools installed and working (git, a C development environment, and perl)
  • Clone our git source code repository
  • Compile PostgreSQL and successfully run the regression test suite

Now, you should be ready to start hacking on code!

Source code

Source code can be found at http://git.postgresql.org/gitweb?p=postgresql.git;a=summary

Once you have git installed, you can check this code out locally with the command:

 git clone https://git.postgresql.org/git/postgresql.git

While there are release tarballs available, you should use a git clone to work on code with the community.
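
With a clone in place, the compile and test steps from the overview could look roughly like the sketch below. It assumes the traditional configure-and-make build, a clone in $HOME/postgresql, and a private install prefix of $HOME/pg-dev. The PGSRC and PGPREFIX variables and the build_and_check function are names made up for this example, not part of PostgreSQL. The --enable-debug and --enable-cassert configure options are ones often chosen for development builds.

```shell
#!/bin/sh
# Sketch only: PGSRC, PGPREFIX and build_and_check are hypothetical names.
PGSRC="${PGSRC:-$HOME/postgresql}"    # where the git clone lives (assumed)
PGPREFIX="${PGPREFIX:-$HOME/pg-dev}"  # private install location (assumed)

# Configure, compile, and run the core regression tests.
# The body runs in a subshell, so the caller's working directory is untouched.
build_and_check() (
    cd "$PGSRC" || exit 1
    ./configure --prefix="$PGPREFIX" --enable-debug --enable-cassert || exit 1
    make || exit 1
    make check    # runs the regression tests against a temporary installation
)
```

Calling build_and_check from a shell that has loaded this snippet runs the three steps in order and stops at the first one that fails.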

Hacking PostgreSQL Resources

There are several resources on the net about how to go about hacking on PostgreSQL; these are just a few:

Neil Conway and Gavin Sherry's original "Introduction to Hacking PostgreSQL": http://www.neilconway.org/talks/hacking/ ; presented at PgCon 2007 (http://www.pgcon.org/2007/schedule/events/8.en.html) and the PostgreSQL 10-year Anniversary Summit

Stephen Frost's 2013 PgCon Talk "Hacking PostgreSQL" (http://www.pgcon.org/2013/schedule/events/545.en.html) slides are here: http://snowman.net/~sfrost/hackingpg-pgcon13_20130506.pdf and his 2011 PgCon Talk "Review of Patch Reviewing" (http://www.pgcon.org/2011/schedule/events/368.en.html) slides are here: http://www.pgcon.org/2011/schedule/attachments/189_pg_patch_review_20110516.odp

Andrew Dunstan's "How to be a Happy Hacker", video here: http://www.youtube.com/watch?v=yFDyM29tB6k

Fabrízio Mello and Dickson Guedes's "Hacking PostgreSQL" YouTube channel (PT-BR): https://www.youtube.com/channel/UCjq4gJg4tYy0NqEEo3t60IA

Style Guide

Working with our source code involves some general rules. These are documented in our core documentation: http://www.postgresql.org/docs/current/static/source.html

At a high level, we use 4-space tabbed indenting and strict ANSI C comment formatting, and our variable and function naming convention is to match the surrounding code. For example, if you see that variables use a CamelCase style, match that. If they use underscores, or are lowercase, match that. Readability and consistency within a section of code are more important than universal consistency. If a section of code is being substantially reworked, developers will sometimes rework private function names and variable names to match current convention. However, projects that simply rename variables to match current notions of coding style will be rejected.

Bug fixing

Bugs are posted to the mailing list: pgsql-bugs@postgresql.org

You can see an archive of reported bugs at: https://www.postgresql.org/list/pgsql-bugs/

Typically, a bug is posted via our bug reporting form, and then members of this list respond. Of course, not every issue posted to this list is actually a bug. A good way to learn more about how this process works is to subscribe to the list and observe for a while before jumping in. Our software is quite complex, and development work spans a couple of decades. Over that history, many changes and ideas have been suggested and attempted; some failed and some succeeded. Please don't be discouraged if your initial ideas are rejected or significantly refactored, or if long-time contributors provide critical feedback on ideas or code. If contributors are responding, it is likely that they are trying to provide direction for your work and are suggesting that you try a different approach rather than give up.

Another source of bug reports is the pgsql-general@postgresql.org mailing list. Subscribing and responding to issues posted to this list is a great way to become familiar with the common problems everyday users of PostgreSQL face. Many members of our development community are on this list and respond regularly to user issues. Reading the archives and attempting to respond to issues as they come up is a significant and useful contribution to our community.

Generally speaking, bug fixes are back-ported to affected branches whenever possible.

TODOs

It's worth checking if the feature of interest is found in the TODO list on our wiki: http://wiki.postgresql.org/wiki/TODO.

The entries there often have additional information about the feature and may point to reasons why it hasn't been implemented yet. We have attempted to organize issues and link in relevant discussions from our mailing list archives. Please read background information if it is available before attempting to resolve a TODO. If no background information is available, it is appropriate to post a question to pgsql-hackers@postgresql.org to request additional information and to ask about the status of any ongoing work on the problem.

Searching the PostgreSQL archives

Starting a project should always begin with a search of our PostgreSQL mailing list archives. You can start at: https://www.postgresql.org/list/pgsql-hackers/.

Our project's policy is to discuss as much ongoing code work as possible in public, including any in-progress patches. You may find significant and useful, but uncommitted, code in our archives that can either inform you about current or past work, or reduce the amount of work needed to accomplish a task. There are also some changes to our core project that were rejected, but are perfectly reasonable solutions to problems. Bottom line: searching our archives is a critical skill any member of our community must learn to be effective.

Brand new features

If you have a brand new idea for PostgreSQL, and you've already looked through our archives, scanned the TODO list and reviewed the code relevant to the change you'd like to make, it's time to dive into the pgsql-hackers@postgresql.org mailing list.

This is a very active list, with 20-100 or more messages posted a day. If you are working on a project, it is prudent to subscribe to the mailing list for at least the duration of the project. The list is quite large, and is made up of contributors and observers from the last 15 years of development effort.

Bruce Momjian created a presentation on how to get your patch accepted by the PostgreSQL community: http://momjian.us/main/writings/pgsql/patch.pdf

Your initial post for a new project to our mailing list should include:

  • A description of the problem to be solved, or feature to be implemented.
  • Links to relevant standards documentation.
  • A short description of the areas of source code to be modified.
  • Intended timeline for implementation.
  • Links to relevant previous discussions on PostgreSQL mailing lists about the problem or feature.
  • CC any members of the development community you'll be directly working with on the project.
  • Link to a wiki page on wiki.postgresql.org for ongoing status updates.

Your best chance of success in implementing a new feature is getting early involvement from members of the development community. It is entirely appropriate and necessary to initiate conversations about features on the pgsql-hackers mailing list, and request feedback in public from those developers who have worked on relevant or similar features in the past. We encourage this communication, and most active developers are willing and interested in providing mentorship in public for work that you undertake.

New features are always committed to 'master' (the development branch in the git repository). It is the project's policy not to add features to released major versions.

Commitfest and timing

The Commitfest process was designed to keep track of incoming patches, help synchronize development and commit effort, make the review process more obvious and transparent, and to encourage new people to participate in the development of PostgreSQL.

Developers are required to submit patches to the pgsql-hackers@postgresql.org mailing list before they will be reviewed. Once the email with the patch has been archived on the postgresql.org site, the patch can be linked into the Commitfest application (https://commitfest.postgresql.org). Commitfests are scheduled to start on the 15th of the month, and occur about every two months. We have had about five commitfests per year since the process was created.
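
A patch has to exist as a file before it can be attached to an email. One common way to produce it, assuming the work was committed on a local topic branch created from master, is git format-patch, which writes one file per commit. The make_patch helper and the default "patches" directory below are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch only: make_patch and the default "patches" directory are hypothetical.
# Usage: make_patch [BASE] [OUTDIR]
# Writes one .patch file per commit that is on the current branch but not on
# BASE, and prints the names of the files it created.
make_patch() {
    base="${1:-master}"
    outdir="${2:-patches}"
    git format-patch -o "$outdir" "$base"
}
```

Run from inside the clone with the topic branch checked out, "make_patch master ../patches" leaves numbered .patch files in ../patches, ready to attach to a message.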

Not all patches are required to go through the commitfest process, although most patches of substantial size, or those requiring detailed code review, will.

For the last couple of years, getting a major feature into a major dot release has generally required getting the patch into the review queue sometime between July and December. Feature freeze may happen in February, and new features will not be accepted until the new major release is complete. (A description and commentary on this is available at: http://rhaas.blogspot.com/2010/07/concurrent-development.html)

More information about Commitfests is at: http://wiki.postgresql.org/wiki/CommitFest

Participating in the development community

Information about the mailing lists is available on the Mailing Lists page, also reproduced here for your convenience.

Mailing List Culture

The PostgreSQL community exists world-wide on our mailing lists. As you dive into our community, you will encounter people with wildly varying levels of expertise in databases, software development and system administration. Excellent technical and professional advice is given freely on the mailing lists, but there is no guarantee or expectation that anyone can solve any particular problem. Flaming or personal attacks are not tolerated on our mailing lists, IRC or related forums connected to the postgresql.org site.

Above all, the PostgreSQL community expects each person to treat others with respect and to give them the benefit of the doubt when it comes to terse or critical language. The Robustness Principle applies to participation in our community: Be conservative in what you send; be liberal in what you accept.

That said, our community is known for its aggressive and technical discussion style. For those unfamiliar with our community, our discussions can come across as insulting or overly critical. Please keep in mind that as a new contributor, you are encountering a new culture. Every culture has different rules about appropriate behavior, social norms, and expectations. Much like when learning a new language or visiting a new, unfamiliar country, your experiences while joining the PostgreSQL community will undoubtedly include an "adjustment cycle". That can and likely will include high and low moments, friendly or otherwise.

As with any encounter with unfamiliar culture, you must take some time to get acquainted. Take extra time to communicate clearly. Ask for clarification if you're confused or a response doesn't make sense to you. Be careful to avoid personal attacks if someone makes a mistake. If there's one universal constant, it is that everyone makes mistakes.

Remember that we are a learning community, and with few exceptions, people are communicating with the intention of learning, sharing and refining ideas.

Email etiquette mechanics

Signatures that include "confidentiality notices" are useless in the context of PostgreSQL mailing lists. All messages to our lists are archived publicly, are immediately available worldwide and will not be removed from our archives. Please remove the notices from your email to our lists, particularly when posting code that you wish to be contributed or shared with our community.

When replying, please be respectful and use appropriate quoting. See the Mailing List Etiquette FAQ for details about what constitutes appropriate quoting when replying to mailing lists.

Our mailing lists are generally set to "reply to sender", but the preferred way to participate in threads is to "reply all". That means that you'll include both the email address of the sender and the mailing list in your response. Also, please do not send HTML-enriched email to the mailing lists.

Finally, our community generally does not "top post" in response to mailing list threads (see Wikipedia: Top Posting for a definition of top posting).

Using the discussion lists

You can send an email directly to any of the mailing lists, without subscribing first. Any replies in the thread should be sent to the list, with the correspondents CC'd.

If you wish to receive the mail traffic sent to a list, you can join using the subscribe form. You should receive an email in response from the mailing list manager software that handles the lists. If you wish to change the various settings associated with your subscription or unsubscribe, you can do so using the web interface.

If you follow discussion through the web interface instead of subscribing, you will at some point wish to reply to a message sent to the list. Do not simply copy the message body and paste it into a message with a similar subject as a way to join the conversation. The mailing list relies on the "In-Reply-To" mail header in order to associate individual messages to their thread. If you don't know how to add this header manually, you should instead make use of the "raw" link provided on every message view to download the message as a file (in mbox format), then import it into your favorite email client and use the usual "Reply All" way of responding to mailing list messages.
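
To illustrate what the threading header carries, here is a minimal sketch that reads one message saved from the "raw" link and prints the headers a reply to it would need. reply_headers is a hypothetical helper, not a tool provided by the project. It ignores folded (multi-line) headers and, for simplicity, does not carry over the parent's own References chain, which a real mail client would do.

```shell
#!/bin/sh
# Sketch only: reply_headers is a hypothetical helper.
# Usage: reply_headers FILE
# Prints In-Reply-To and References headers pointing at the message in FILE.
reply_headers() {
    # Scan only the header block (up to the first blank line) and take the
    # first Message-ID; the header name is matched case-insensitively.
    msgid=$(awk '
        /^$/ { exit }
        tolower($0) ~ /^message-id:/ {
            sub(/^[^:]*:[ \t]*/, "")
            print
            exit
        }' "$1")
    [ -n "$msgid" ] || return 1
    printf 'In-Reply-To: %s\n' "$msgid"
    printf 'References: %s\n' "$msgid"
}
```

Importing the downloaded file into an email client and using "Reply All" remains the simpler route, because the client fills in these headers for you.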

Overview of discussion lists

We have two primary lists related to usage and development of PostgreSQL: pgsql-general@postgresql.org and pgsql-hackers@postgresql.org. pgsql-general is the correct place to start if you are having a problem with your PostgreSQL installation, need help with installation, are a software developer using PostgreSQL or have a general question about the project. pgsql-hackers is the correct place to go if you have a patch to submit, would like to learn more about how to develop PostgreSQL itself, or are interested in database internals. We also have the pgsql-novice@postgresql.org list if you would like to try posting a question to a smaller list, with a group of people who are there specifically to answer very basic questions.

If you are primarily interested in performance tuning, benchmarking or case studies from existing users regarding performance, pgsql-performance@postgresql.org is a great list to join.

If you're interested in contributing to website maintenance or editing, or system administration of PostgreSQL infrastructure, join the pgsql-www@postgresql.org mailing list.

If you have something to contribute to the PostgreSQL documentation, join the pgsql-docs@postgresql.org mailing list. The documentation is always in need of copy editors, testers and example generation.

If you're interested in staffing booths at conferences, giving talks at conferences, starting a user group or participating in a user group, join the pgsql-advocacy@postgresql.org mailing list. We are always in need of booth volunteers, speakers, case study writers and bloggers.

If you think you've found a bug in PostgreSQL and are new to our project, we suggest you ask about it on the pgsql-general list first, then read our Bug Submission Guidelines, and then go to our Bug Reporting form.

We also have User Group mailing lists, language-specific lists and some other specific projects with their own communities. You can find a comprehensive list of these at: http://www.postgresql.org/community/lists/

Wiki

Our wiki is active and frequently updated at: http://wiki.postgresql.org. We encourage contributors to add to the material there, and to make corrections to any errors found.

Projects related to PostgreSQL

There are hundreds of projects that depend on, relate to or extend PostgreSQL. You can find a partial list of those projects on the External Projects doc page, PGXN or the software catalogue. Projects are written in a variety of languages, supported by international teams, and are generally fun to hack on. Spend some time exploring the ecosystem of projects around PostgreSQL to get a better feel for the variety and scope of ways that our database is used worldwide.

Our philosophy about conversations/code in public

The PostgreSQL project believes that public code review is the way to achieve our excellent quality of code. Therefore, patches for PostgreSQL must be discussed and submitted in public, and all patches are reviewed publicly. One exception to this policy is that security vulnerabilities may be disclosed to a private mailing list before fixes are published to help prevent exploitation of vulnerable users.

Related to that, conversations about code, design decisions and user experience occur on the mailing lists. We try to steer all project conversations to the mailing lists so that there is a record of the thought process behind decisions, and so that all the participants and observers of our lists can learn from them.

Resources on contributing to PostgreSQL

Acknowledgments

Thanks to Dave Page for feedback, editing and lots of questions.