GSoC 2011

From PostgreSQL wiki

Revision as of 16:09, 14 March 2011

What is GSoC?

Google Summer of Code (GSoC) is a global program that offers student developers stipends to write code for various open source software projects. We have worked with several open source, free software, and technology-related groups to identify and fund several projects over a three-month period. Since its inception in 2005, the program has brought together over 4,500 students and more than 4,000 mentors and co-mentors from over 85 countries worldwide, all for the love of code. Through Google Summer of Code, accepted student applicants are paired with a mentor or mentors from the participating projects, thus gaining exposure to real-world software development scenarios and the opportunity for employment in areas related to their academic pursuits. In turn, the participating projects are able to more easily identify and bring in new developers. Best of all, more source code is created and released for the use and benefit of all.

PostgreSQL has an official summer of code page: http://www.postgresql.org/developer/summerofcode.html

We also have a Ukrainian translation: http://webhostinggeeks.com/science/project-postgresql-ua

Project Ideas

  • Add support for multi-line GUCs (configuration variables)
  • Add ability to control CSV log fields and order and allow CSV log format changes on SIGHUP
  • Eliminate 1GB memory allocation limit for individual queries (MaxAllocSize and friends)
  • Build SELinux demonstration for addressing PCI compliance
  • Bulk insert for GiST
  • Improve the auto-configuration script (Python or Perl)
  • Write Foreign Data Wrappers for several external data sources (ODBC, SQL Server, Oracle, MySQL, CouchDB, Redis, etc.)
  • Integrate SQL/MED and PL/Proxy
  • Refactor the NewSysViews project to integrate it with information_schema for inclusion in core PostgreSQL.
  • Create autotuning utilities for automated interaction with system information and/or logs
  • Help finish the JSON data type
  • PL/Erlang
  • Please add ideas here!

More ideas:

Core Source Code

  • TODO Items: A number of the items on our TODO list have been marked as good projects for beginners who are new to the PostgreSQL code. Items on this list have the advantage of already having general community agreement that the feature is desirable. These items should also have some general discussion available in the mailing list archives to help get you started. You can find these items on the TODO list; they are marked with an [E].
  • DDL Functions: Create a SQL-callable function capable of generating DDL scripts for objects within the database.
  • EXPLAIN Enhancements: Add information to EXPLAIN documenting I/O activity, discarded plans, costing detail, memory usage and more.
  • Text-Array Indexing: Add support for indexing text or multi-typed array data, capable of supporting indexed queries, similar to intarray.

Infrastructure

  • Enhance Buildfarm to test external packages, patches, or performance: The PostgreSQL project maintains a public buildfarm that continuously communicates with several machines that check out and build the PostgreSQL source on a regular basis. The idea behind this project is to extend the current buildfarm code to allow it to download external modules and report back on their build status, to download unapplied patches and test them, or to run performance tests.
  • Enhance "Performance-Farm" framework for continuous performance regression testing: Similar to the buildfarm, the idea behind this project is to create a public infrastructure that continuously communicates with several machines that check out and build the PostgreSQL source on a regular basis, running and reporting on agreed performance benchmarks.
  • Finish the AOX integration for archives (see http://archives.beccati.org/, which was done in PHP and has to be redone in Django to integrate into the architecture)

External Applications

  • pgAdmin III / phpPgAdmin Enhancements: PostgreSQL supports a number of popular GUI Tools that are not distributed with the core project. Projects like these often have their own TODO lists and compatibility issues with the core PostgreSQL that need development. We welcome ideas for these projects as well.
  • pgAdmin III: add graph support to the server status window, add more ways to view an EXPLAIN ANALYZE in the query tool (and handle XML explain)
  • Procedural Language Improvements: PostgreSQL provides support for more than a dozen different procedural languages; however, the level of support varies depending on the language implementation. Enhancing support of these procedural languages might include fixing build issues, adding SPI support, adding trigger support, adding support for IN/OUT parameters and more.
  • Replication and Clustering: PostgreSQL provides a wide range of replication solutions for varying types of replication needs. Many of these projects need assistance with building against different versions of PostgreSQL, installation and setup, administrative tools, and general bugfixing.
  • Teaching & Learning Tools: External tools to help in teaching the internals of PostgreSQL such as enhanced visual EXPLAIN, graphical models of the query engine, educational guides to the code, and interactive tools to demonstrate various query types.
  • PostGIS & GEOS: Add performance and feature enhancements to PostgreSQL's geographic data support.

Additional projects may be found by browsing the PostgreSQL Development Projects website.

Projects

The GSoC projects for 2011 are:

  • TBD :)

We are also working on a Twitter list for those involved this year. If you are missing, please send us your info:

Key Info

  • March 7th - 11th, Mentoring Application Due
  • March 29th - April 9th, Student Applications Due

Project Admins

  • Selena Deckelmann - Co-admin, Mentor Summit attendee.
  • Josh Berkus - Co-admin, Mentor Summit attendee.
  • Robert Treat - Past mentor 2x, co-admin, Mentor Summit attendee.

2011 Mentors

  • Dave Page - Former mentor - pgAdmin, Windows, Packaging, Infrastructure
  • Heikki Linnakangas - Postgres Committer
  • Magnus Hagander - Postgres Committer, Windows, pgAdmin
  • Guillaume Lelarge - pgAdmin
  • Jehan-Guillaume de Rorthais - phpPgAdmin
  • Joe Abbate - Python-related, catalog-related projects
  • David E. Wheeler - Perl-related, extensions, PGXN
  • Mark Wong - benchmarking, monitoring, performance
  • Tatsuo Ishii - Postgres Committer, pgpool-II
  • Stephen Frost - Postgres contributor
  • Devrim Gündüz - Administration related software (dashboard)

Additional Reviewers

  • Josh Berkus - auto-configuration, performance testing
  • Selena Deckelmann - configuration, testing
  • Andreas Scherbaum - performance, configuration, testing

Past Success

  • Florian - read-only on snapshots, no advancement of xid on select statements
  • Ivan - Full-Text Support in phpPgAdmin
  • Mickael Deloison - pgScript engine for pgAdmin
  • Luis Alberto Ochoa Paz - Graphical query builder for pgAdmin
  • Leonardo Sapiras - browsing data through FK in phpPgAdmin + some additional stuff

GOALS:

  • usable code
    • useful/novel ideas
    • research projects
  • longer term contributors

TODOs:

  • Kick-off Meeting for Community Members
  • Update GSOC page
  • Advertising?
  • Blog that we're participating and seeking students
  • Round of private emails to people who have participated in the past: Heikki, Simon, Mark
    • request interest, and then follow up in asking about possible topics for students
  • Mentor recruitment and then email to -hackers
    • do this much later when we have some proposals in?
  • Recruitment -- no organized group effort?
    • -announce, -general, -hackers
    • user group lists
    • phppgadmin/pgadmin
    • berkeley
    • Univ. of Maryland -- contact them?
  • Identify the commitfest that the code will be submitted to

Expectations

  • Stuff to keep students together:
    • Regular blogging from students
    • weekly group IRC checkin? -- two checkin times maybe?
  • Have students communicate on -hackers where appropriate (didn't really work?)
    • Or other relevant -devel lists
  • Mailing list
    • pgsql-students (?) vs. -hackers (?) maybe up to mentor?
    • mentors mailing list -admin mailing list, berkus said?
    • students mailing list via gsoc