A meeting of the most active PostgreSQL developers is being planned for Wednesday 16th May, 2012 near the University of Ottawa, prior to pgCon 2012. In order to keep the numbers manageable, this meeting is '''by invitation only'''. Unfortunately it is quite possible that we've overlooked important code developers during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).

Please note that this year the attendee numbers have been cut to try to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 9.2 release cycle. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies, unlike in previous years.

This is a PostgreSQL Community event. Room and refreshments/food sponsored by EnterpriseDB. Other companies sponsored attendance for their developers.

== Time & Location ==

The meeting will be from 8:30AM to 5PM, and will be in the "Red Experience" room at:

Novotel Ottawa
33 Nicholas Street
Ottawa
Ontario
K1N 9M7

Food and drink will be provided throughout the day, including breakfast from 8AM.

[http://maps.google.ca/maps?f=q&source=s_q&hl=en&geocode=&q=novotel+ottawa&aq=&sll=49.891235,-97.15369&sspn=36.237851,79.013672&ie=UTF8&hq=novotel+ottawa&hnear=&ll=45.421528,-75.683699&spn=0.036869,0.077162&z=14&iwloc=A&layer=c&cbll=45.425741,-75.689638&panoid=Z4FUGnkZkdHAOkIxyjjS9Q&cbp=12,25.83,,0,-0.6 View on Google Maps]

== Attendees ==

The following people have RSVPed to the meeting (in alphabetical order, by surname):

* Oleg Bartunov
* Josh Berkus (Secretary)
* Jeff Davis
* Andrew Dunstan
* Dimitri Fontaine
* Stephen Frost
* Peter Geoghegan
* Kevin Grittner
* Robert Haas
* Magnus Hagander
* Shigeru Hanada
* Hitoshi Harada
* KaiGai Kohei
* Tom Lane
* Noah Misch
* Bruce Momjian
* Dave Page (Chair)
* Simon Riggs
* Teodor Sigaev
* Greg Smith

== Proposed Agenda Items ==

Please list proposed agenda items here:

* Agree CommitFest schedule for 9.3 (Strawman from Simon)
** CF1 June 15, 2012 - 1 month
** CF2 Sep 15, 2012 - 1 month
** CF3 Nov 15, 2012 - 1 month
** CF4 Jan 15, 2013 - 2 months
* Queuing [Dimitri, Kevin]
** Description: efficient and transactional queuing is a very common need for application using databases, and could help implementing some internal features
** Goals: get an agreement that core is the right place where to solve that problem, and what parts of it we want in core exactly
* Materialized views [Kevin]
** Description: Declarative materialized views are a frequently requested feature, but means many things to many people. It's not likely that an initial implementation will address everything. We need a base set of functionality on which to build.
** Goals: Reach consensus on what a minimum feature set for commit would be.
* Partitioning and Segment Exclusion [Dimitri]
** Description: to solve partitioning, we need to agree on a global approach
** Goals: agreeing on SE as a basis for better partitioning, having a "GO" on working on SE
* MERGE: Challenges and priorities [Peter G]
** Description: Implementing the MERGE statement for 9.3. It is envisaged specifically as an atomic "upsert" operation.
** Goals: To get buy-in on various aspects of the feature's development, and, ideally, to secure reviewer resources or other support. Because of the complexity of the feature, early interest from reviewers is preferable.
* Row-level Access Control and SELinux [KaiGai]
** Security label on user tables
** Dynamic expandable enum data types
** Enforcement of triggers by extension
* Enhancement of FDW at v9.3 [KaiGai]
** Writable foreign tables
** Stuffs to be pushed down (Join, Aggregate, Sort, ...)
** Inheritance of foreign/regular tables
** Constraint (PK/FK) & Trigger support.
* Type registry [Andrew]
** Provide for known OIDs for non-builtin types, and possibly for their IO functions too
** Would make it possible to write code in core or in extension X that handles a type defined in extension Y.
* Ending CommitFests in a timely fashion, especially the last one. Avoiding a crush of massive feature patches at the end of the cycle. Handling big patches that aren't quite ready yet. Getting more people to help with patch review. [Robert]
* What Developers Want [Josh]
** Description: a top-5 list of features and obstacles to developer adoption of PostgreSQL (with slides)
** Goal: to set priorities for some features aimed at application users
* In-Place Upgrades & Checksums [Greg Smith, Simon]
** Description: Revisit in-place upgrades of the page format, now that pg_upgrade is available and multiple checksum implementations needing it have been proposed.
** Goal: Nail down some incremental milestones for 9.3 development to aim at.
* Autonomous Transactions [Simon]
** Overview of idea, relationship to stored procedures
** Feedback, buy-in and/or alternatives
* Parallel Query [Bruce Momjian]
** Hope to get buy-in for what parallel operations we are hoping to add in upcoming releases
* Report from Clustering Meeting [Josh] (10 min)
** Description: to summarize the discussions of the cluster-hackers meeting from the previous day
** Goal: inter-team synchronization. Possibly, decisions requested on specific in-core features.
* Double Write Buffers [Simon]
** Is anyone committing to do that for 9.3?

* Goals, priorities, and resources for 9.3 [All]
** For roadmap and planning purposes, set expectations and coordinate work schedules for 9.3. Confirm who is doing what, identify interested reviewers at start, and check for gaps.

== Agenda ==

{| border="1" cellpadding="4" cellspacing="0"
!Time
!Item
!Presenter
|- style="font-style:italic;background-color:lightgray;"
|08:00
|Breakfast
|

|- style="font-style:italic;background-color:lightgray;"
|08:30 - 08:45
|Welcome and introductions
|Dave Page

|-
|08:45 - 09:15
|Autonomous transactions
|Simon Riggs

|-
|09:15 - 09:40
|[[Queuing]]
|Dimitri Fontaine/Kevin Grittner

|-
|09:40 - 09:50
|Report from the Clustering Meeting
|Josh Berkus

|-
|09:50 - 10:10
|Type registry
|Andrew Dunstan

|-
|10:10 - 10:30
|Access control and SELinux
|KaiGai Kohei

|- style="font-style:italic;background-color:lightgray;"
|10:30 - 10:45
|Coffee break
|

|-
|10:45 - 11:15
|Enhancement of FDWs in 9.3
|KaiGai Kohei

|-
|11:15 - 11:30
|What developers want
|Josh Berkus

|-
|11:30 - 12:00
|Parallel Query
|Bruce Momjian

|-
|12:00 - 12:30
|MERGE: Challenges and priorities
|Peter Geoghegan

|- style="font-style:italic;background-color:lightgray;"
|12:30 - 13:30
|Lunch
|

|-
|13:30 - 14:00
|Materialised views
|Kevin Grittner

|-
|14:00 - 14:20
|In place upgrades and checksums
|Simon Riggs/Greg Smith

|-
|14:20 - 14:45
|Partitioning and segment exclusion
|Dimitri Fontaine

|-
|14:45 - 15:00
|Commitfest Schedule
|All

|- style="font-style:italic;background-color:lightgray;"
|15:00 - 15:15
|Tea break
|

|-
|15:15 - 15:40
|Commitfest management
|Robert Haas

|-
|15:40 - 16:45
|Goals, priorities, and resources for 9.3
|All

|- style="font-style:italic;background-color:lightgray;"
|16:45 - 17:00
|Any other business/group photo
|Dave Page

|- style="font-style:italic;background-color:lightgray;"
|17:00
|Finish
|
|}

==Minutes==

== 2012 Developer Meeting Minutes ==

Started with introductions.

=== Autonomous Transactions ===

Simon brought this to get some feedback on the idea. Autonomous transactions (ATX) are a transaction inside a transaction ... a new top-level transaction. In Oracle, it's not just one new transaction, it's a whole new context which can submit multiple new transactions. There is no connection between parent and child transactions, which can result in new types of deadlocks.

Each new transaction context would allocate a new pg_exec from a pg_proc call. Implementation is straightforwards, just have to handle locking. Allows us to implement stored procedures in an interesting way. If we treat a stored procedure as an autonomous transaction, then this solves some problems. We can put COMMIT< ROLLBACK, other things in stored procedures.

Tom suggested that ATX don't need to conflict with parent transaction locks. Noah pointed out some issues with that. We'd need to have a switch for Stored Procedures in order to indicate they are autonomous, like using CREATE STORED PROCEDURE. We'd be using an additional client slot for each ATX, which could be a problem. Oracle's limit on ATX is 70 per connection, which seems like a lot. Maybe we should try to hold them all to a single session like it was a subtransaction. Not sure if we can do this, Simons will need to take a look at is.

ATX also need to eventually be able to run utility commands, like VACUUM and CREATE INDEX CONCURRENTLY.

=== Queueing ===

Ultimately the materialized views will need some kind of queueing. Once we have queueing in core, it could be generally useful. CLUSTER CONCURRENTLY would need it, or application queues will need queueing structure. We might want to have it exposed at the SQL level. You put things in the queue, and at commit, others can see it. LISTEN/NOTIFY is sort of a queue, but is only one item and vanishes if you're not listening.

Like a table, but access semantics are different. Would need logged/unlogged queues. Some discussion about how queues are different from tables. Haas wondered about whether what we need for interal queues are the same as what users need for user-visible queues.

Queue-tables also need different performance characteristics. We don't need queues so much as we need deferred action. We also need background processes which wake up and check the queue. Queues could be built on top of tables. Discussion about uses, designs for queues ensued.

We need a really clear design spec for how queues would work. There are specific performance improvements we want for queueing, but they're likely to be just improvements on table performance. The idea is to have a generalized API instead of reinventing a bunch of times.

Next steps is to collect use cases. [[Queueing|Kevin & Dimitri will collect use cases on a wiki page]], to design an API. Performance optimization needs to look at access pattern. Simon pointed out that this works similar to fact tables where you want to move stuff forward constantly. Users might not use queues as pure FIFO.

Unlinking segments works for deleting from the beginning of a table but indexes could be a problem. Block numbers could be a practical problem, we might need wraparound, or reset-to-zero.

=== Report on Clustering Meeting ===

See [[PgCon2012CanadaClusterSummit|minutes]].

=== Type Registry ===

WIP idea. Hstores aren't build in, so they get an arbitrary OID, which causes issues with writing generic code. Looking up they type name is expensive. It would be nice to have a registry for types where people writing extensions are allocated an OID. Andrew gave example of hacking Postgres to support upgrading from the optional JSON type in 9.1 to the built-in type in 9.2.

We need to expose the pg_upgrade stuff as well, set_binary_upgrade. Should we use something other than and OID? We need the OIDs for upgrade and for drivers. Driver identicalness isn't the same as pg_upgradability, so we might want two different switches for that. Maybe we should have a new OID if you change the storage of a type?

What's the criterion for allocating an OID? We'll need some kind of judgement. We'll also need to block off the OID reserved space into sections. People generally found this to be a good idea. Andrew will create a wiki page and follow-up. We could just do this for contrib, but that's not really a good idea.

We could have CREATE TYPE ... WITH OID = ###, for base types only. The folks who want it for ENUM etc. are just replication/clustering authors. There was discussion of other approaches to handling these problems. Users will create types with OIDs which conflict.

=== Access Control and SE-Linux ===

Several components: to add security around user tables. Second, to add additional conditions around user queries. Third, a condition around new tuples which are inserted. Fourth, we should have ENUMs to represent user-defined security labels. Did some performance testing on the last part, having labels as OIDs was much faster and closer to non-SE performance.

There's concurrency issues around seeing new labels -- we'd have a huge issue with inserting the labels into the system table. Creating a new label could be a downtime event; we can have a utility command, and we can require users to create a new label first manually. But what happens if the new label isn't there? Should error just like a constraint.

Is there a way to query SE-linux to get all of the security labels? That's hard, because it's four fields. The last field is an issue for prediction. There's a lot of value in having row-level security be completely type-agnosic; we just have a string and we don't care what's in it.

An SE Label consists of: user, row field, type field, and (something inaudible). That last part is a kind of bitmap. Do we actually need that part, though? What's multi-category security, will we support that? How many different labels would you have on a specific table?

The idea of row-level security is to force quals on people. Currently it's not transparent. The discussion on labels needs to continue elsewhere.

Also we need to address FK and PK implementation for security labels.

=== What Developers want ===

PostgreSQL is becoming the default for many web applications like Ruby and Django. But there are plenty of users complaints. They don't show up on the PostgreSQL mailing lists. The developer complaints are on stackoverflow, forums for virtual hosting companies, and application specific lists like ORM/framework layers.

Two categories of developer comments: blockers that cause to use another tool, and enhancers that would expand the market into new areas. Many of these are available features, but they seem to hard to use.

==== Blockers ====

1. Installation onto developer laptops (Windows / OS X)
* Re-installs problematic in Windows
* Reinstall of Redis is the competitor here, it is a closer to a true one-click installer.
* People use Redis because it's "easy to install", while PostgreSQL ran into one of multiple problems (reported on lists like pgsql-general)
* postgres.app is aiming at simplifying things for Mac developers, is in beta
* Kevin: also seen issues with Rails + Rake, lots of questions on Stackoverflow.
2. Complexity of configuring PostgreSQL, i.e. postgresql.conf
* Shared memory issues on the Mac
** Could use POSIX shared memory instead Sys V
* Need a configuration generator and hints for settings that are set incorrectly
**Example: need to increase size of the transaction log with pg_xlog having X GB of space. Math to determine settings like checkpoint_segements given a GB target is complicated.
3. Better analysis and troubleshooting
* Expose everything via SQL, i.e. autovacuum ; no parsing logs.
* EXPLAIN needs to be easier to understand, suggest what needs to be done when planner mistakes are made.
* Freeze a stable query plan needed for some apps.
4. Easier to understand replication
* External projects that try to help are often less maintained/robust/documented than core
* Same thing is true for pooling projects
5. Better pg_upgrade
* More trustworthy
* Handle version upgrades across large clusters
* Deliver on <5 minutes promise. Can take a long time for statistics ANALYZE. Needs to save/restore that instead.
6. MERGE UPSERT

==== Enabling features to broaden userbase ====

1. Finish JSON support
* Most popular new feature on news sites LWN etc. since 9.0 replication
* Some people want simple document storage like NoSQL, but with PostgreSQl reliability
* Needs indexing performance improvements
* More extract from JSON features
* Schemaless PostgreSQL is possible with JSON or hstore, but it's not obvious that's true.
2. Better extensions
* Packaging for popular extensions on popular
* Extensions should follow replication; move .so to standby? Lots of resistance to that idea.
* Better visibility of extensions, and extension aggregators like PGXN.
3. Client language queries
* Straight from, say, Python to a parse tree
* SQL Server/.Net does move in this direction for C#
* Competition here is the non-relational databases
4. Built-in sharding
* PL/Proxy: must find it, minimal docs, questions around support situation
* Target user base here doesn't like SQL or functions much either
* Base on writable FDW?
* Borrow ideas from notable sharded PostgreSQL deployments?

==== Enhancements of FDW in 9.3 ====

What do we need for FDW in 9.3? Want discussion of what to implement. Hanada is working on pgsql_fdw. Wants this in the core distribution, to replace dblink. Currently FDWs are read-only so users still need dblink. There is a list of features Hanada wants to implement.

One issue is naming. Currently we already have postgresql_fdw in core, which is used by dblink. Proposed pgsql_fdw, but that doesn't fit our naming conventions. We should maybe rename the dblink one to dblink_fdw. There is also an issue around options where it should consult libpq on what options are supported. Since the function name conflicts are internal, this would only mess with pgupgrade.

Features include:
* writeable FDWs
* aggregation pushdown
* table sorting pushdown
* table inheritance with FDW
* constraint support on foreign tables

Writeable FDW is the most interesting feature. One issue is transaction control, suggestion is that it's the responsibility of the FD module to control transactions, not PostgreSQL. Two ways to do it: one is that every write to a FT is an autocommit transaction. The other option is that the FT commits when you commit your local transaction. SQL Server automatically does two-phase commit. But it might be better for a first version not to have any transaction control.

We will implement with no remote transaction control for the version for 9.3. Plus distributed transactions have lots of interesting failure conditions.

KaiGai plans to get pgsql_fdw into the first CF so that we can play with it.

=== Parallel Query ===

Everyone run screaming from the room. First, understand that not everyone is I/O bound. There are cases where the system is primarily memory or CPU-constrained. If you have a handful of very complex queries which are primarily memory-bound, but we're not always I/O constrained, we need to look at ways to parallelize memory/CPU-constrained systems. We need to start looking incrementally with how we can do some things in parallel.

Already-completed parallel pg_dump is an example of this. We need more cases where we can surgically parallelize stuff. Josh brought up the issue of PostGIS queries which need CPU parallelism. Greg brought up 48-core server with 256GB of RAM for a 100GB server. If we can get 4 CPUs, we get better memory bandwidth. We're sometimes memory-bound because of non-sharable memory bandwidth. Bruce told story of Informix 6's parallelism disaster.

We need a task list of individual tasks we could parallelize instead of parallelizing everything. We do need a general "helper process" infrastructure so that we can hand work off to them. Simon is working on the parallel worker tasks now.

Bruce and Greg discussed Greenplum's history. The way we generate query plans makes this hard, since it's kind of a "pull" basis: "gimme a tuple". If our query plan was a task list it would be easier. MPP systems have plans where they look at which steps can be parallelized and what they cost.

The hard stuff is in the optimizer. Creating a cost model is really difficult. Peter brought up the Intel threading building blocks as a generalized parallelism case with a graph dependency. It has this thing called "task stealing". The classic parallelism case is video rendering, but our tasks are not like that. We need one-off cases for each task.

It's like the Windows port in terms of scope and complexity. This is different from the Windows port, in that we can do it piecemeal, but we need to decide to go down the road of additional complexity. Dimitri suggested exposing the executor as a virtual machine. A lot of stuff is different. Josh suggested starting with parallel index build as the easiest single task with solid benefit. Bruce points out the even simpler case is to build several indexes in parallel over the same scan.

Additional items that can be parallelized:

* Redo
* Vacuum
* Logical dump
* Sorting
* Scans

=== MERGE ===

Peter hasn't done as much with this as he expected so far, but plans to get something done for 9.3. What's the best way to solve this problem? Josh spoke about the need for atomic UPSERT, Peter agrees that that's a good version 1 goal.

There's a fair amount of speculation on how to implement this feature. A lot of people want to use predicate locking, but we need an accessible API and some more features for predicate locking to make it work. We could also have a new kind of lock associated with an index tuple. The UPSERT case requires solving the hard problem, general MERGE beyond that is detail work. One thing we need to do is finish deprecating user-definable RULEs.

Greg worked with a GSOC project for MERGE, but concurrency completely didn't work. We still have to solve the concurrency issues. Robert remembers that there were intrinsically complex issues without even a possible perfect solution. We need to look at the thread where we looked at the problems; the definition of sensible behavior is in question (thread: http://archives.postgresql.org/message-id/AANLkTineR-rDFWENeddLg=GrkT+epMHk2j9X0YqpiTY8@mail.gmail.com ). We need to define the spec first. We can look at what other databases do.

We can allow weird things to happen -- corner cases -- with MERGE or UPSERT. We can tell people to use SSI to avoid those weird issues. The SQL standard's MERGE doesn't really give us UPSERT, we should use different syntax. We want INSERT ... ON DUPLICATE KEY UPDATE, not REPLACE INTO. We should ask MySQL folks about the history of this.

Job #1 is building the simple case, UPSERT. We can do SQL-standard MERGE later. Greg wants reviewers to commit for this. This is really a Heikki thing. The Executor part needs expert review (Tom?).

=== Materialized Views ===

What's the minimum committable patch, and what direction should we take it in? Kevin has time to work on it, but it's been hard to schedule that time.

* syntax for create/alter
* new relkind in pg_class
* pg_dump and restore support
* being able to index them
* statement to regenerate contents of matview (concurrently?)

Will have an option to create a matview without filling it with data. pg_dump would use this. Would deal with the various ways of updating matviews, like incremental, later. If you wanted incremental updates on a matview which is too complex it would error. Further down, doing incremental updates via queueing mechanism.

Also, there's the optimizer -- substituting matviews for base tables automatically. That would be much later. Josh mentioned that someone had already written code for that. KaiGai asked about SE-Postgres and matviews, and discussed it with Kevin. Josh also asked about eventually doing on-request refresh.

Simon wants us to call it something different from Materialized Views, becuase we won't have the optimizer stuff which Oracle does. Kevin is calling it declarative materialized views. And it's not clear that we want to handle query rewrite the same way Oracle does. We can have synchronous update of matviews, but more useful is queueing updates of the views to that they are "eventually consistent". Kevin talked about cranky judges.

Phase I is just do do the object type and manual refresh. Incremental update will be later. There's a couple other things you can do if you can guarantee that the matview data would produce the same result. There was discussion around what to call the feature given that we'll be implementing matviews in several releases.

Dimitri suggested that we could use matviews as a working concept for correlation stats. Simon discussed issues of setting acceptable staleness at data request time, both for matviews and for replication.

=== In place upgrades & Checksums ===

Where had the page format discussion gone wrong in the past? There's 4 issues:

* adding more bytes in the header
* having multiple page views
* time required to upgrade

The whole discussion talked about 32-bit checksums. But with 16-bit checksums, we could borrow pg_tli, and add a checksummed bit. Greg said we bump the page format, Robert said no. Greg wants us to "get practice" in having new page formats. We need to flag whether or not the page is checksummed. Will we ever need 32-bit checksums? If we implement 16-bit, we'll find out.

Simon analyzed the error rate with 16-bit checksums, and felt that it was enough for an 8K page, but not a 32K page. Not clear on why it makes a difference on what the size of the page is. Plus we're not expecting an error on just one page.

What are we planning to include in the checksum? What are we going to checksum? Jeff has been looking at issues where whole disk blocks are getting swapped. Suggested including the relfilenode etc. in the checksum in order to make sure that the page is where it's supposed to be. Would it prevent us from moving data around? Changing tablespaces, etc. might be an issue. Is table OID better or worse than relfilenode? Discussion of what pg_upgrade does. OID seems better.

Need to have some way to track what's checksummed and what's not in a table. Each page will have a checksum bit. Add command VACUUM CHECKSUM ON. And we don't really have to implement an "old page reader".

Hint bits are the biggest implementation issue. Simon's approach was to full-page-write all pages with hint bits once per checkpoint cycle, but there's still some stuff to be worked out there. There's an issue with hint bits being set while the page is being written by another process. Discussed the performance impact of this.

For first version, we need to look at whether it's reliable. That is more important that the performance. Bulk loading has a major performance issue. Setting hint bits on the first select of a major table generates a whole bunch of WAL traffic.

=== Partitioning and Segment Exclusion ===

Current partitioning is "just good enough" to deter building something better. Dimitri has been thinking about what do to instead. Three problems:

# when do you create the new partitions
# constraint exclusion has all kinds of issues
# index and constraints -- no primary keys etc.

We've had several proposals. Declarative partitioning syntax. But as long as we have separate tables, we only solve problem 2. We've had 5 years of partial patches for that problem.

So how about another idea: the problem is having a table with a huge data set, and addressing only part of that table. We already have table segments -- we could have segments which are determined by ordering. The idea is to have an index which, given the partitioning key, would tell us where the tuples are located -- in which segment.

At what level in the system should a partition exist? Simon pushed for above-table level. Now we're looking at below table level. So the system defines partitions, not the user. We can look at a large table of 100 segements as having 100 partitions. If we store metadata about each partition, we can look at that to decide which segments to scan. Josh pointed out that this doesn't solve all or even most of the issues which partitioning is intended to solve. This solution is really a heavily compressed index or a performance optimization for scanning large, time-based tables. It's a sort of lossy index.

Don't get hung up on 1GB segments, we might change that in the future. Or we could change that for this. Jeff Davis pictured something different for constraint exclusion with something simpler. Discussion about index scans, which may not be as efficient as it could be. Index-Only scan needs some optimizations.

=== CommitFest Schedule ===

Simon proposed a schedule, which includes the last commitfest being 2 months. Robert would like it to be shorter, not longer. Robert pointed out that the final CFs have been getting longer, not shorter since 9.0. Two issues related to commitfests:

* works better when lots of people volunteer to review
* last commitfest doesn't end.

We would all benefit if we ended the CF earlier. Robert thought we should make CF4 shorter, non longer. Josh suggested that we could relese every 6 months. Big problem is people writing patchs still during CF4. We wait until everyone is exhausted and then decide what to bounce. We should make decisions at the beginning of the commitfest.

Suggested separating review and commitfest. We should triage at the beginning of the commitfest. Robert brought up Dimitri's patches as an example. Robert wants completion over priority, Simon says the opposite. The problem with a consensus process is that there's no consensus. We could have a release manager. It's the big patches which are the real problem, since people really want them and there's lots of stuff in them.

The problem with prioritization is that we're promoting a big feature over what's not quite there vs. several other patches which are ready. It's not fair to our contributors. But we could triage at the beginning because we're arbitrarily bumping stuff anyway. It's better to do it early than late. You can identify which patches are big or small, and which ones have a certain degree of readiness. Even if you're not correct, it'll help people allocate their time.

For voting on priorities, we could vote and rate which ones are going to be easy or hard and how important they are for us. Dimitri outlined a system of point allocation and voting. Or we could list the committer on a patch at the beginning of the commitfest. That makes sense for the big patches, but not the small ones. So we should identify them at the beginning of the commitfest.

Everyone is going to argue for their own stuff, though. People have different priorities. We also can't tell committers what to do, we can only ask. We'd like to get committer signoff early in the process. We might also want to sign off reviewers.

Triage also needs to flag patches where we don't agree on the spec.

We need to get better about giving feedback on the design for the patch. The problem with posting a design spec is that there's no formal review process for design spec. After CF3, a week of triage. If we haven't seen the big patch by the triage, it doesn't get into CF4 for big patches.

Simon pointed out that it's hard to make rules for big patches because each one is different.

So, changes to the process:
* Planning week after the 3rd commitfest
* "design spec" flagged submissions to the CF
* write docs about the CF process
* one patch, one review requirement

=== CommitFest Management ===

CF1: June 15 - July 15

CF2: Sept 15 - Oct 15

CF3: Nov 15 - Dec 15
Planning Week - Dec 8-15

CF4.1: Jan 15 - Feb 15
Final Triage: Feb 1-7

=== Goals, Priorities, and Resources for 9.3 ===

Dave: Installers

Andrew: Aggregation for JSON, Projecting data from JSON, Pretty-printing SQL, PL/perl binary format, binary output for psql, windows builds for extensions.

Peter: UPSERT, trying to replace Flex, pg_stat_statements for query plans.

Simon: Bi-Directional Replication

Hanada: pgsql_fdw, other FDWs.

Hitoshi: plv8, JSON support, some windowing function improvements.

Kevin: Declarative materialized views, SSI performance.

Jeff: statistics for ranges, range keys, range FKs, and range joins.

Robert: performance, performance, performance. Reducing latency events. Write performance improvements. Can we optimize vacuum some more, reviewing patches.

Josh: documentation, advocacy, maybe autoconfiguration. Release notes.

Magnus: configuration directories. Monitoring. Simplifying replication.

Dimitri: now working on "event triggers". Next step for extensions. Segment exclusion. Queueing in core design spec.

Tom: backfilling weak spots in the planner.

Alvaro: finalize FK locks. Allowing ALTER TABLE to reorder columns.

Bruce: design spec for some parallel operations.

Oleg & Teodor: improve SP-GiST. Indexing similarity. Also want to work on spatial join. JSON indexing if they can get sponsorship.

Noah: global temp tables, local XID space for temp tables, more ALTER TABLE improvements.

Greg: reviving dead projects: config directory, eliminate recovery.conf, adding instrumentation for timing events inside the database.

KaiGai: SE row-level access control.

Stephen Frost: list optimization work. SSL under Windows, supporting engines.

=== Other Business ===

Josh will write as-we-go release notes for alphas or whatever.

We could have a mini-developer meeting in Prague. There was discussion about whether we should move the developer meeting around every year. This is the "main" developer meeting, but we could have another one somewhere else. We could have it at FOSDEM, in February.

Josh brought up the idea of having an unconference day for Postgres contributors. Robert suggested interest group meetings as a refinement of that.

[[Category:PostgreSQL Events]]
[[Category:PostgreSQL 9.3]]

PgCon 2012 Developer Meeting

2012-05-09T17:10:48Z

Sternocera:

PgCon 2012 Developer Meeting

2012-05-06T21:55:50Z

Sternocera: Updating MERGE agenda item as directed

PgCon2012CanadaInCoreReplicationMeeting

2012-04-27T14:32:21Z

Sternocera: /* Attendees (alphabetical) */

= PostgreSQL In-Core Replication meeting, pgCon 2012 =

== Time and Place ==

Wednesday, May 16th, 6pm to 10pm

Ottawa somewhere, room TBA

== Agenda ==

Draft agenda follows. Please let me know of any contributions/changes to the agenda you have:

# Discussion of Multi-Master Theory (Simon)
# Demonstration of prototypes (Andres)
# Performance comparisons
# My use case (Keaton)
# Social Media use case (Simon)

Broad and general discussion throughout. Notes and actions will be taken. Volunteers for tasks welcome.

The meeting will be from 6pm to 10pm, with various forms of food and possibly a drink or two, sponsored by 2ndQuadrant.

== Attendees (alphabetical) ==

* Keaton Adams
* Josh Berkus (prefer vegetarian)
* David Fetter
* Dimitri Fontaine
* Andres Freund
* Peter Geoghegan
* Jim Mlodgenski
* Jim Nasby (plus guest)
* Michael Paquier
* Simon Riggs
* Mark Sloan
* Greg Smith
* Koichi Suzuki
* Peter van Hardenberg
* David Wheeler
...

Meeting limit about 20-25 people

=== Joining the Meeting ===

If you will be able to attend, please email Simon ([mailto:simon@2ndQuadrant.com simon@2ndQuadrant.com]) with the following:

* Your Name
* What pizza topping you like

and please come armed with detailed information about your future replication requirements.

PgCon 2012 Developer Meeting

2012-04-25T22:59:41Z

Sternocera:

PgCon 2012 Developer Meeting

2012-04-20T22:06:44Z

Sternocera: /* Proposed Agenda Items */

2012-04-04T11:59:24Z

Sternocera: Created page with "Valgrind [http://http://valgrind.org Valgrind] is an instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect ma…"

[[Valgrind]]

[http://http://valgrind.org Valgrind] is an instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile programs in detail. In particular, Valgrind's Memcheck tool is useful for detecting these bugs. However, it is non-trivial to use with Postgres, and requires modification to Postgres source files to instrument memory allocation and memory context infrastructure with various Valgrind macros.

It is hoped that at some point in the future, Postgres will directly support Valgrind through the use of a configure option, which is possible due to the fact that the header file valgrind.h is under a BSD license, as opposed to the rest of Valgrind which is under the GPL 2. In the meantime, this wiki page is the place to obtain an unofficial patch that adds the necessary calls. It is not as comprehensive as it possibly could be, and there are probably other places where specific checks could be usefully injected.

shared_buffers effectively scrubs memory from Valgrind's perspective.

=== General testing procedure ===

For general tests, the recommended procedure is:
<source lang="bash">
# Build Postgres with the valgrind patch
$ cd ~/postgresql
$ patch -p1 < valgrind_postgres.patch
# Building at O1 would probably also be acceptable if O0 proves too slow, but avoid O2
$ ./configure --enable-debug CFLAGS=-"O0 -g"
$ make && make install
# Start Postmaster
$ valgrind --leak-check=no --gen-suppressions=all --suppressions=postgresql/valgrind.supp --time-stamp=yes --log-file=pg-valgrind/%p.log postgres 2>&1 | tee pg-valgrind/postmaster.log
# run tests
$ make installcheck-world
</source>

=== Co-ordination when running tests ===

postgresql.conf should include a timestamp and PID in log_line_prefix, as well as having a log_min_duration_statement of 0. Since the valgrind logs included timestamps and were split by PID, they can be used to correlate valgrind errors with particular test suite commands. Once the test cases yielding valgrind errors are tracked-down, you can rerun the valgrind-ed postmaster with "--track-origins=yes --read-var-info=yes" to get more-specific diagnostics. Valgrind 3.6.0 should be used to get good pinpointing of the error source. At time of writing, version 3.7.0 is the latest stable release.

The full installcheck-world run has been found to take something around six hours on a modern machine, but memory consumption is not greatly inflated. It is recommended that when running Valgrind that you disable CLOBBER_FREED_MEMORY and MEMORY_CONTEXT_CHECKING; they add additional valgrind hook traffic and are redundant with the testing valgrind performs. The patch actually switches the pg_config_manual.h defaults for those settings.

=== The patch itself ===

Per recommendations in the valgrind documentation, this patch just copies valgrind.h into the PostgreSQL tree. It is current as for the master branch, as of April 3 2012.

PgCon 2012 Developer Meeting

2012-02-29T01:59:29Z

Sternocera: /* Attendees */ - alphabetical order

Group commit

2012-01-21T01:51:36Z

Sternocera:

=== Description of feature ===

''Group commit'' is a feature planned for PostgreSQL 9.2 .

The feature is being developed by Simon Riggs and Peter Geoghegan. The latest -hackers thread on the feature is: http://archives.postgresql.org/pgsql-hackers/2012-01/msg00804.php .

Broadly speaking, a group commit feature enables PostgreSQL to commit a group of transactions in batch, amortizing the cost of flushing WAL. The proposed implementation this page describes is heavily based on the existing synchronous replication implementation. It supercedes the commit_siblings "group commit" implementation of prior versions. This earlier implementation was never really considered to be effective, and its use was weighed down by caveats, so in practice it was only used very infrequently. It is anticipated that the proposed implementation will be turned on by default, and it may not be possible to turn off.

=== Benchmark ===

Benchmarking of this feature has been performed by Greg Smith's pgbench tool (https://github.com/gregs1104/pgbench-tools) . Here are results for the initial benchmark:

http://wiki.postgresql.org/images/5/50/Group-commit-pgbench-tools.pdf

Revised results, with semaphore implementation:

http://wiki.postgresql.org/images/c/c6/Group-commit-semaphore-results.pdf

These results were obtained on an ext4 (Linux kernel 3.1) filesystem with LVM. The harddisk used was a WDC WD3200BEKT-08PVMT1 7200 RPM sata disk, with write caching enabled.

2012-01-17T13:18:26Z

Sternocera: /* Benchmark */

== Group commit ==

'Group commit'' is a feature planned for PostgreSQL 9.2 . The feature is being developed by Simon Riggs and Peter Geoghegan. The latest -hackers thread on the feature is: http://archives.postgresql.org/pgsql-hackers/2012-01/msg00804.php .

=== Description of feature ===

Broadly speaking, a group commit feature enables PostgreSQL to commit a group of transactions in batch, amortizing the cost of flushing WAL. The proposed implementation this page describes is heavily based on the existing synchronous replication implementation.

=== Benchmark ===

Benchmarking of this feature has been performed by Greg Smith's pgbench tool (https://github.com/gregs1104/pgbench-tools) . Here are results for the initial benchmark:

http://wiki.postgresql.org/images/5/50/Group-commit-pgbench-tools.pdf

These results were obtained on an ext4 (Linux kernel 3.1) filesystem with LVM. The harddisk used was a WDC WD3200BEKT-08PVMT1 7200 RPM sata disk, with write caching enabled.