PGXN

From PostgreSQL wiki

Revision as of 19:04, 28 April 2013 by Jjanes (Talk | contribs)

Not to be confused with PGXS

This document outlines the specification for the creation of PGXN, the PostgreSQL Extension Network (formerly known as "PGAN"). The purpose of PGXN is to provide a central repository and distribution system for open-source PostgreSQL extension libraries. In its first iteration it will consist of four basic parts:

  1. An upload server, where users will be able to create logins and upload extension distributions.
  2. An rsync-able directory of extension distributions and metadata.
  3. A site for searching through extension releases and perusing extension documentation.
  4. A command-line client for downloading, testing, and installing extensions.

This document outlines the details for the structure of this first iteration of PGXN.

Upload Server

The models for the upload server are PAUSE and JAUSE. The interface will be very simple: a site to log in to (upload.pgxn.org), using existing postgresql.org logins if possible. From there, users can upload release packages and add or delete release managers.

Once a package is uploaded, it is considered released and its name registered to the uploading user. The upload server will add its metadata to its database and add the package and updated metadata to the rsync-able directory.

Extensions are unique by name across PGXN. If user "theory" uploads an extension named "pgTAP", no other user will be able to upload an extension with that name unless "theory" adds them as a release manager via the upload server.
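
The name-ownership rule can be illustrated with a minimal in-memory sketch. The class and method names here (Registry, release, add_manager, NameTaken) are hypothetical and not part of the specification, and exact-match name comparison is an assumption:

```python
class NameTaken(Exception):
    """Raised when a user may not act on an extension name."""


class Registry:
    """Hypothetical in-memory stand-in for the upload server's name registry."""

    def __init__(self):
        # Maps an extension name to the set of users allowed to release it.
        self._managers = {}

    def release(self, user, extension):
        # The first uploader registers the name; later uploads must come
        # from a user already in the set of release managers.
        managers = self._managers.setdefault(extension, {user})
        if user not in managers:
            raise NameTaken(f"{extension!r} is registered to another user")

    def add_manager(self, current_manager, extension, new_manager):
        # Only an existing release manager may add another one.
        managers = self._managers.get(extension, set())
        if current_manager not in managers:
            raise NameTaken(f"{current_manager!r} does not manage {extension!r}")
        managers.add(new_manager)
```

With this sketch, a second user's release of "pgTAP" is rejected until "theory" calls add_manager for them.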

The upload server will have a Web UI for easy extension release, but will also provide a complete Web API so that release management can be scripted.

Distribution Layout

Distributions may be uploaded to the upload server as Gzip-, Bzip2-, or Zip-compressed packages. The upload server will unpack each package, validate its contents, and repack it with Zip compression into a file ending in .pgz. A distribution package is required to have a file named META.json in its root directory. All other files and directories are optional, but may be required for added functionality on the search site or in the command-line client.
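
As a rough sketch of the unpack, validate, and repack step, the following uses only the Python standard library. The function names are hypothetical, and treating a single wrapping top-level directory as the distribution root is an assumption rather than something the specification states:

```python
import io
import tarfile
import zipfile


def read_members(data: bytes) -> dict:
    """Return {path: bytes} for the regular files in a zip, tar.gz, or tar.bz2 upload."""
    buf = io.BytesIO(data)
    if zipfile.is_zipfile(buf):
        with zipfile.ZipFile(buf) as zf:
            return {i.filename: zf.read(i) for i in zf.infolist() if not i.is_dir()}
    buf.seek(0)
    # Mode "r:*" lets tarfile detect gzip or bzip2 compression by itself.
    with tarfile.open(fileobj=buf, mode="r:*") as tf:
        return {m.name: tf.extractfile(m).read() for m in tf.getmembers() if m.isfile()}


def repack(data: bytes) -> bytes:
    """Validate an uploaded archive and return it re-compressed as Zip (.pgz content)."""
    files = read_members(data)
    tops = {path.split("/", 1)[0] for path in files}
    # Assumption: an archive may wrap everything in one top-level directory,
    # which is then treated as the distribution root.
    if len(tops) == 1 and all("/" in path for path in files):
        files = {path.split("/", 1)[1]: body for path, body in files.items()}
    if "META.json" not in files:
        raise ValueError("distribution has no META.json in its root directory")
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(files):
            zf.writestr(path, files[path])
    return out.getvalue()
```

The bytes returned by repack would be written to a file with the .pgz suffix. A real server would also need to guard against hostile archives (path traversal, oversized members), which this sketch does not attempt.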

PGXN Meta

META.json is a JSON-formatted document containing metadata about the distribution, following the CPAN Meta spec, which has the advantage of having been developed over the last 10 years. This meta file describes the extension and its dependencies. Although most or all of the structure defined by the CPAN Meta spec will be supported, the required keys will be, at a minimum:

  • name: The name of the extension. This must be unique across all extensions on PGXN. The names "PostgreSQL", "Postgres", "pgsql", and "psql" will be reserved.
  • version: The release version of the extension. This must be unique across all versions of the extension and must be numeric.
  • license: The license or licenses under which the extension is distributed, defined as one of a set of predefined strings, a URL, or a structure identifying the license name and URL.

Other keys may be required for added functionality in the command-line client or the search site.
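
A minimal sketch of checking the required keys might look like the following. The function name check_meta is hypothetical, and two details are assumptions: reserved names are compared case-insensitively, and "numeric" is read as dot-separated decimal integers such as 0.24 or 1.10.3:

```python
import json
import re

RESERVED_NAMES = {"postgresql", "postgres", "pgsql", "psql"}
REQUIRED_KEYS = ("name", "version", "license")


def check_meta(text: str) -> list:
    """Return a list of problems found in a META.json document (empty if none)."""
    meta = json.loads(text)
    problems = [f"missing required key: {key}" for key in REQUIRED_KEYS if key not in meta]
    name = str(meta.get("name", ""))
    # Assumption: reserved names are compared case-insensitively.
    if name.lower() in RESERVED_NAMES:
        problems.append(f"reserved name: {name}")
    # Assumption: "numeric" means dot-separated decimal integers.
    if "version" in meta:
        version = str(meta["version"])
        if not re.fullmatch(r"[0-9]+(\.[0-9]+)*", version):
            problems.append(f"version is not numeric: {version}")
    return problems
```

Uniqueness of the name and version across PGXN cannot be checked from the file alone; that would have to happen against the upload server's database.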

Other Contents

All other files are optional and may be put wherever makes sense for the build. The search site and the command-line client will, however, have a number of additional organizational expectations. Note that these are recommendations, but following them will make it easier to install an extension or to gain higher visibility on the search site.

  • Makefile: This is the standard PGXS build system for PostgreSQL extensions. See the contrib modules for examples.
  • t/ or test/ or tests/ or sql/: The test suite for the extension. The command-line client will always look for and run tests. If the files are in sql/ and there is also an expected/ directory, the tests are assumed to be pg_regress tests. Otherwise, they will be assumed to be pgTAP tests.
  • README or README.$extension_name: May contain anything you like, but we recommend that it contain a high-level introduction to the extension, in which the author tries to convince the reader how awesome the extension is, as succinctly as possible. May also include installation instructions, or they may be placed in a separate INSTALL file (which may be more useful when there are third-party dependencies to satisfy).
  • LICENSE: Under what terms are the extension and all the other stuff in the distribution distributed?
  • Changes or ChangeLog: Keep users abreast of the latest changes.
  • doc or docs or documentation: Documentation for the extension. May include any number of files. The search site will convert these to HTML for display on the site. The formats it will support must be indicated by file name extensions. Initially, we'll aim to support the formats and file-name suffixes supported by GitHub. The contents of the <title> tag in the resulting HTML files will be used to create links to the documentation files.

PGXN Directory

The entire directory of PGXN will be available for mirroring via rsync. This will allow anyone to create mirrors of PGXN for purposes of broader distribution or for private use. Mirror hosts can register as official mirrors, in which case their mirrors will be included in the PGXN database and listed on the PGXN Web site.

Search Site

The main site for PGXN will contain information about all the distributions on PGXN, including relevant metadata. It will also include information about registered PGXN users, including a list of extensions released by each user. Furthermore, any documentation included in distributions will be formatted as individual HTML files on the site. And most importantly, all of the content of the site will be indexed and searchable. This functionality will be modeled on search.cpan.org and OpenJSAN.

The first goal of the site is to allow visitors to search for extensions that might meet their particular needs, and be able to read the documentation before downloading. The search engine will highlight the names, abstracts, and keywords of the modules, with secondary importance given to the entire contents of the documentation of each module.

The site will be updated from the upload server hourly, and will include a feed of recent releases uploaded to PGXN. This will make it easy for people to follow new developments in the PostgreSQL extension community in their news readers, as well as to integrate release information into other sites (such as a box on the main PostgreSQL site).

The second goal of the site is to highlight PostgreSQL extensions as a community resource, making it easy to find extensions via Google searches and the like. As bloggers and publications use direct links to the documentation for particular extensions, search rankings will only increase, exposing extensions included in the PGXN to wider visibility. This model has worked extremely well for the Perl community via search.cpan.org, and can work as well to highlight the extensibility and availability of extensions for PostgreSQL.

PGXN Client

The PGXN client will be a command-line client modeled on the CPAN and cpanminus clients. Essentially, it will provide an interface to easily download, build, test, and install PGXN distributions. It will be written in Perl so as to maximize the benefits provided by the existing CPAN clients.

For the initial implementation, the PGXN client will assume that extensions can be built using PGXS with:

make USE_PGXS=1
make USE_PGXS=1 install
make USE_PGXS=1 installcheck

If there is no expected directory in a distribution, rather than running make installcheck, the PGXN client will assume that any tests are pgTAP tests and run them as such instead. This will lower the barrier to writing tests, as test writers won't have to maintain expected output files as pg_regress requires.
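
The test-detection rule can be sketched as a small function. The name detect_tests and its return values are hypothetical. The directory names come from the layout recommendations in this document, and the sketch follows the stricter wording there (both sql/ and expected/ must exist for pg_regress):

```python
from pathlib import Path

# Directories the layout recommendations name as possible test locations.
CANDIDATE_DIRS = ("t", "test", "tests", "sql")


def detect_tests(dist_dir):
    """Classify a distribution's test suite as 'pg_regress', 'pgtap', or None."""
    root = Path(dist_dir)
    # sql/ plus expected/ signals pg_regress, which is run via make installcheck.
    if (root / "sql").is_dir() and (root / "expected").is_dir():
        return "pg_regress"
    # Any other test directory is assumed to hold pgTAP tests.
    if any((root / name).is_dir() for name in CANDIDATE_DIRS):
        return "pgtap"
    return None
```

A client could call this on the unpacked distribution directory after make USE_PGXS=1 install to decide whether to run make USE_PGXS=1 installcheck or a pgTAP runner.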

The PGXN client will make no other assumptions about how to build and install extensions, leaving such to the PostgreSQL core. To the extent that PGXS-powered make works on a given platform, the client will support it.

The PGXN client will be configurable via a configuration file as well as at runtime. Configurations will include (among other settings):

  • What mirrors to use
  • What command to use to build (e.g., gmake)
  • What command to use to install (e.g., sudo make)
  • Where to cache data (PGXN index, downloads, builds, etc.)
  • Location of pg_config

Upon the first run, it will do its best to self-configure, so that users can immediately start downloading and building extensions with a minimum of hassle.
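
One way to picture the layering of defaults, configuration file, and runtime settings is the sketch below. Everything in it is an assumption made for illustration: the ClientConfig field names, the default values, and the use of JSON for the configuration file are not defined by this specification.

```python
import json
import shutil
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ClientConfig:
    """Hypothetical client settings; names and defaults are illustrative only."""
    mirrors: tuple = ()
    build_command: str = "make"
    install_command: str = "make"
    cache_dir: str = "~/.pgxn"
    pg_config: str = "pg_config"


def load_config(file_text="{}", **runtime):
    """Merge defaults, config-file settings, and runtime overrides, in that order."""
    known = {f.name for f in fields(ClientConfig)}
    settings = {**json.loads(file_text), **runtime}  # runtime values win
    unknown = set(settings) - known
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    if "mirrors" in settings:
        settings["mirrors"] = tuple(settings["mirrors"])
    if "pg_config" not in settings:
        # Self-configuration: use pg_config from PATH when one is found.
        found = shutil.which("pg_config")
        if found:
            settings["pg_config"] = found
    return ClientConfig(**settings)
```

For example, load_config('{"build_command": "gmake"}', install_command="sudo make") takes the build command from the file and the install command from the runtime override.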

Implications for PostgreSQL

The design of the PGXN is such as to allow extensions to be created and released using the existing infrastructure provided by PostgreSQL itself. It has no opinions about how extensions should be built or improved in PostgreSQL, being focused more on distribution, documentation, community exposure, and ease of installation. Whatever decisions are made in this regard, the PGXN infrastructure will be updated as necessary to make the most of it.

Building on PGXN

Once the initial implementation of PGXN is complete and deployed, we can start building other tools to enhance its value to the community and to users. These include tools in the PostgreSQL core to make it easier to create, test, and install extension distributions, as well as tools to assist in the evaluation of extensions.

Think of this section as ideas for phase 2 projects. Some examples:

  • New PGXS targets. To make extension creation and release management simpler, some new features for PGXS can be contributed to the PostgreSQL core. Such features might include new make targets to support extensions:
    • make distmeta: Generate a META.json file from the contents of variables in Makefile.
    • make test: Build a new database cluster and start PostgreSQL on an unreserved port -- including any DSOs created by make -- and then run the extension's test suite.
    • make dist: Create a distribution package ready for release on PGXN.
    • make distcheck: Like make dist, but instead of creating the package, it creates a distribution directory and then runs make test.
    One functional change might make it simpler to specify a schema into which to install an extension, like so:
    psql --set INSTALLSCHEMA=extensions -f extension.sql
  • PGXN Ratings. Like CPAN Ratings, users would be able to review and rate individual PGXN extensions.
  • PGXN Testers. Like CPAN Testers, volunteers can set up sandboxed test environments to regularly run tests and report results to a central database. This will allow ongoing testing to ensure that extensions work on various platforms and with various versions of PostgreSQL. Such results will help maintainers to keep their extensions working in supported environments and users to evaluate how well maintained extensions are.

FAQ

  • What's allowed to be released on PGXN? Open-source PostgreSQL extension release packages. The current contrib extensions serve as the model for the contents of such packages. Following the CPAN example, "no commercial software of any kind, not even share/guilt/donateware, will be allowed…any other policy would be open to nitpicking, or maybe even legal challenges."
  • WTF is an "extension"? An extension is a piece of software that adds functionality to PostgreSQL itself. Examples are data types (CITEXT, PERIOD), utilities (newsysviews, pgTAP), and procedural languages (PL/Ruby, PL/R), among others. See Extending SQL for details. An extension is *not* a piece of software designed to run on top of PostgreSQL (Bricolage, Drupal).
  • What's not allowed to be released on PGXN? Non-package files (that is, files that are not tarballs, bzip-balls, or zip archives), closed-source distributions, and distributions with no license.
  • Who can release on PGXN? Any registered user.
  • Who can register for PGXN? Anyone who applies. Such registrations will be approved by volunteers, though if the existing community registrations can be used, that would be preferred.
  • Will there be an approval process? Short answer: No, because PGXN needs to KISS. Longer answer: No. Again following the CPAN example, PGXN "will stay an open and free forum, where the authors decide what they upload. Any further selection belongs to different fora." This is because "the first goal of PGXN is to make it easy to submit code and redistribute it. Ease of use and quality control are not the central problems [it] tries to solve." Frankly, moderation of releases is a significant reason that other communities have failed to duplicate the success of CPAN.
  • How is this different from pgFoundry? pgFoundry is for project hosting and includes SCM, issue tracking, mailing lists, Web hosting, and mirrored download support for any project related to PostgreSQL. PGXN will be for extension distribution and mirroring, easy downloading and installation, documentation and metadata, and searching and reporting. The only thing it has in common with pgFoundry is uploading release packages. They otherwise serve very different purposes (project management vs. distribution and exposure).
  • Don't we need to wait for extensions in core? No. The current support for building extensions as exhibited by the contrib extensions works for most platforms. As core extension support improves, the PGXN client will be updated to take advantage of new features. Such improvements are expected to make PGXN more successful, but are not required to get it started now.
  • But don't you need the naming scheme to be worked out first? That might be ideal, but for now it will be adequate to simply require a unique name for a given extension. Once core supports formal extensions, its naming scheme will be adopted by PGXN.
  • Will it require changes to PGXS? No. As changes are made to PGXS, PGXN will take advantage of them, but none are required in order to bootstrap PGXN.
  • How will PGXN make it easy to distinguish the garbage from the viable extensions? The first step will be the search site, which will allow users to find extensions relevant to them and read their documentation. This will "often [be] enough to distinguish the good stuff from the crap," as Robert Haas says. As more extensions are released on PGXN with competing features and functionality, the addition of ratings features and dedicated testing will also make it easier to evaluate competing options.
  • What about Windows? The PGXN client will always follow the lead of the PostgreSQL core on installing extensions. If support for installing extensions on Windows improves such that a compiler is no longer required, the PGXN client will be modified as appropriate to take advantage of it. This applies not specifically to Windows, but to the ability of the core installer (or any future community-supported installer) to work on any platform.
  • What kind of security will it have? Each release package will have an accompanying SHA1 hash that the PGXN client will verify before installing an extension.
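
The hash check described in the last answer could be sketched as follows. The function name verify_sha1 is hypothetical, and how the client obtains the expected hash (for example, from the mirrored metadata) is left open here:

```python
import hashlib
import hmac


def verify_sha1(package: bytes, expected_hex: str) -> bool:
    """Return True if the package bytes match the published SHA1 hex digest."""
    actual = hashlib.sha1(package).hexdigest()
    # Normalize whitespace and case, then compare in constant time.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A client would refuse to build or install a downloaded package when this returns False. Note that a hash fetched from the same mirror as the package only detects corruption, not tampering by the mirror itself.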
