PgCon 2015 Developer Unconference

An unconference-style, multi-track event for active PostgreSQL developers (three tracks are currently planned) will be held from the afternoon of Tuesday 16 June 2015 through Wednesday 17 June 2015 at the University of Ottawa, as part of PGCon 2015. The Unconference will focus on technical PostgreSQL development discussions, ranging from clustering and replication to the infrastructure that runs postgresql.org.

Note: See the schedule at the bottom of this page.

Topics

Developers are asked to propose topics which they wish to either present on or which they would like another individual to present on. All topics should be clearly related to PostgreSQL development. The topic should be added to the table below and any required attendees (presumably at least the presenter, and the requester if different) listed. Other attendees of the Unconference who are interested should list themselves as Optional. Note that non-technical topics related to PostgreSQL development will be addressed during the invite-only Developer meeting, being held in advance of the Unconference. Further, the Developer Unconference is for developers of PostgreSQL and user-oriented topics are not appropriate for this venue.

Slot assignment

Slots will be assigned based on the topic's interest among the attendees of the Unconference (the number of individuals who listed themselves as attendees). Final determination on any particular topic will be made by the Unconference organizers. Please only participate if you are confident of your attendance at the Unconference.

Venue

These meetings will be held at the University of Ottawa. The topics selected, the schedule and the specific room assignments will be published closer to the event and will be based on the information provided here. Please direct any questions to Dave Page (dpage@pgadmin.org).

Sponsorship

The Developer Unconference will be sponsored by Salesforce.com, and by NTT Open Source for the Clustering Track.

Attendees

While the Unconference is open to all attendees of PGCon, formal invitations will be sent to specific PostgreSQL developers, including the Core team, Major Contributors, Committers, and other developers who have been involved in the 9.4 release. These invitations are intended to encourage developers to attend the Unconference but we are unable to guarantee every invitee a speaking slot.

RSVPs

The following people have RSVPed to the meeting (in alphabetical order, by surname):

  1. Ingmar Alting
  2. Naoya Anzai (arrive Tuesday evening)
  3. Chris Autry
  4. Ashutosh Bapat
  5. Oleg Bartunov
  6. Josh Berkus
  7. Christopher Browne
  8. Joe Conway
  9. Jeff Davis
  10. Andrew Dunstan
  11. Yurie Enomoto
  12. Ozgun Erdogan (Wednesday)
  13. Ed Espino
  14. Andres Freund
  15. Stephen Frost
  16. Masao Fujii
  17. Etsuro Fujita
  18. Peter Geoghegan
  19. Kevin Grittner
  20. Robert Haas
  21. Ahsan Hadi
  22. Magnus Hagander
  23. Shigeru Hanada
  24. Álvaro Herrera
  25. Yasuo Honda
  26. Kyotaro Horiguchi
  27. Thierry Husson (Wednesday @ 11am)
  28. Ayumi Ishii
  29. Tatsuo Ishii
  30. Moshe Jacobson
  31. Stefan Kaltenbrunner
  32. Amit Kapila
  33. Mehmet Emin KARAKAŞ
  34. Motoyuki Kawaba (arrive Tuesday evening)
  35. Konstantin Knizhnik
  36. KaiGai Kohei (arrive Tuesday evening)
  37. Alexander Korotkov
  38. Ilya Kosmodemiansky
  39. Dilip Kumar
  40. Tom Lane
  41. Amit Langote
  42. Heikki Linnakangas
  43. Chris Malek (Wednesday @ 11am)
  44. Grant McAlister
  45. Mack McCauley
  46. Fabrízio de Royes Mello
  47. Noah Misch
  48. Bruce Momjian
  49. Yugo Nagata
  50. Satoshi Nagayasu
  51. Jim Nasby
  52. Dave Page
  53. Christophe Pettus
  54. Tyler Poland
  55. Paul Ramsey
  56. Kumar Rajeev Rastogi
  57. Simon Riggs
  58. Michael Robinson
  59. Tetsuo Sakata
  60. Masahiko Sawada
  61. Arul Shaji
  62. Dan Shuster
  63. Steve Singer (arrive Tuesday mid-afternoon)
  64. Marco Slot
  65. Greg Smith
  66. David Steele (arrive Tuesday evening)
  67. Jose Luis Tallon (arrive Tuesday evening)
  68. Yasin TATAR
  69. Euler Taveira
  70. Rod Taylor
  71. Fabio Telles
  72. Jan Urbański (Wednesday)
  73. Tomas Vondra
  74. Jan Wieck (arrive Tuesday evening)
  75. Chris Winters
  76. Nat Wyatt
  77. Jimmy Yih
  78. Rob Young

Topics

See the schedule at the bottom of this page.

Please add any topics you wish covered to the table.

For any topics you are requesting or presenting on, please add your name in the Required column.

For any topics you would like to attend, please add your name in the Interested column.

Topic Policy Taker of Notes Required Attendees Interested Attendees
Picture! Open Oleg Bartunov (Fuji 16-55) All! All!
pgAdmin4 Open Dave Page, Stephen Frost Magnus Hagander, Joe Conway, David Steele, Fabrízio de Royes Mello, Satoshi Nagayasu, Dave Cramer, Alexander Korotkov, Chris Malek, Arul Shaji
Advocacy Team Meeting Open Stephen Frost Magnus Hagander, Greg Smith, Jim Nasby, Josh Berkus, Joe Conway, Michael Robinson, Oleg Bartunov, Arul Shaji
Vertical Scalability w.r.t Writes Open Amit Kapila Amit Kapila Greg Smith, Hannu Valtonen, Ilya Kosmodemiansky, Tomas Vondra, Grant McAlister, Joe Conway, Peter Geoghegan, Kyotaro Horiguchi, Simon Riggs, Amit Langote, Andres Freund, Robert Haas, David Steele, Rod Taylor, Jim Nasby, Chris Winters, Nat Wyatt, Noah Misch, Masao Fujii, Mehmet Emin KARAKAŞ, Christophe Pettus, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Andrew Dunstan, Mack McCauley, Masahiko Sawada, Shigeru HANADA, Michael Robinson, Dave Cramer, Steve Singer, Alexander Korotkov, Oleg Bartunov, Konstantin Knizhnik, Marc Jeanneret, Chris Malek
Security Team Meeting Closed Heikki Linnakangas, Stephen Frost, Magnus Hagander Noah Misch, Álvaro Herrera, Andres Freund, Robert Haas, Tom Lane, Andrew Dunstan, Oleg Bartunov
Native Compilation + LLVM Open Kumar Rajeev Rastogi Jeff Davis, Ozgun Erdogan, Tomas Vondra, Peter Geoghegan, Robert Haas, Chris Browne, Josh Berkus, Ingmar Alting, Masao Fujii, Christophe Pettus, Jose Luis Tallon, Heikki Linnakangas, Alexander Korotkov, Oleg Bartunov, Konstantin Knizhnik
Horizontal Scalability / Sharding in PostgreSQL - ground covered so far and remaining to be covered. Open Ahsan Hadi, Ashutosh Bapat, Etsuro Fujita Hannu Valtonen, Jeff Davis, Amit Langote, Kyotaro Horiguchi, Tetsuo Sakata, Simon Riggs, Robert Haas, David Steele, Rod Taylor, Chris Browne, Jim Nasby, Josh Berkus, Chris Winters, Masao Fujii, Mehmet Emin KARAKAŞ, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Satoshi Nagayasu, Andrew Dunstan, Mack McCauley, Shigeru HANADA, Michael Robinson, Steve Singer, Alexander Korotkov, Oleg Bartunov, Konstantin Knizhnik, Andres Freund, Marc Jeanneret, Chris Malek, Marco Slot
PGCAC Board Meeting 2015 Closed Josh Berkus Josh Berkus, Chris Browne, Steve Singer, Dan Langille, Dave Page During Lunch Wed.
pgPool2 towards version 3.5 Open Tatsuo Ishii Ashutosh Bapat, Ahsan Hadi, Yurie Enomoto, Chris Malek
Partitioning Open Amit Langote Hannu Valtonen, Ashutosh Bapat, Jeff Davis, Kyotaro Horiguchi, KaiGai Kohei, Noah Misch, Tetsuo Sakata, Peter Geoghegan, Álvaro Herrera, Thierry Husson, Joe Conway, Naoya Anzai, Robert Haas, David Steele, Chris Browne, Jim Nasby, Josh Berkus, Masao Fujii, Mehmet Emin KARAKAŞ, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Andrew Dunstan, Jose Luis Tallon, Yurie Enomoto, Mack McCauley, Masahiko Sawada, Shigeru HANADA, Michael Robinson, Yasuo Honda, Dave Cramer, Steve Singer, Alexander Korotkov, Oleg Bartunov, Konstantin Knizhnik, Chris Malek, Ed Espino, Keith Fiske
Foreign Data Wrapper enhancements Open Shigeru Hanada, Etsuro Fujita KaiGai Kohei, Hannu Valtonen, Ashutosh Bapat, Jeff Davis, Amit Langote, Kyotaro Horiguchi, Noah Misch, Tetsuo Sakata, Naoya Anzai, Robert Haas, Jim Nasby, Josh Berkus, Chris Winters, Ingmar Alting, Mehmet Emin KARAKAŞ, Jose Luis Tallon, Oleg Bartunov, Konstantin Knizhnik, Chris Malek, Tyler Poland
Utilization of modern semiconductors - GPU, SSD, NVRAM, FPGA, PMEM... Open KaiGai Kohei Matthew Wilcox, Josh Berkus, Satoshi Nagayasu, Jose Luis Tallon, Naoya Anzai, Mack McCauley, Shigeru HANADA, Michael Robinson, Ingmar Alting
Native Columnar Storage Open Álvaro Herrera Ozgun Erdogan, Tomas Vondra, KaiGai Kohei, Amit Kapila, Josh Berkus, Naoya Anzai, Amit Langote, Robert Haas, David Steele, Rod Taylor, Chris Browne, Jim Nasby, Chris Winters, Nat Wyatt, Masao Fujii, Fabrízio de Royes Mello, Euler Taveira, Satoshi Nagayasu, Mack McCauley, Masahiko Sawada, Shigeru HANADA, Michael Robinson, Heikki Linnakangas, Alexander Korotkov, Oleg Bartunov, Konstantin Knizhnik, Andres Freund, Marc Jeanneret, Marco Slot, Ed Espino, Arul Shaji, Tyler Poland
Future of PostgreSQL shared-nothing cluster Open Konstantin Knizhnik, Alexander Korotkov, Oleg Bartunov Jeff Davis, Amit Langote, Kumar Rajeev Rastogi, Josh Berkus, Simon Riggs, Robert Haas, Jim Nasby, Masao Fujii, Christophe Pettus, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Yurie Enomoto, Masahiko Sawada, Shigeru HANADA, Yasuo Honda, Steve Singer, Marc Jeanneret
PostgreSQL and SMR Drives - the future of magnetic storage means very expensive random writes Open Jeff Davis Kumar Rajeev Rastogi, Noah Misch, Ilya Kosmodemiansky, Amit Kapila, Simon Riggs, Rod Taylor, Jim Nasby, Josh Berkus, Nat Wyatt, Christophe Pettus, Satoshi Nagayasu
Slony Development Open Steve Singer, Chris Browne, Jan Wieck Josh Berkus, Rod Taylor, Jim Nasby, Satoshi Nagayasu, Yurie Enomoto
Dockerizing Postgres Open Josh Berkus Simon Riggs, Nat Wyatt, Christophe Pettus, Fabrízio de Royes Mello, Jan Urbański, Michael Robinson, Jeff Davis, Rob Young, Greg Smith, Keith Fiske
Bi Directional Replication & Logical Decoding (BDR) Open Simon Riggs Andres Freund, Jim Nasby, Josh Berkus, Mehmet Emin KARAKAŞ, Christophe Pettus, Fabrízio de Royes Mello, Euler Taveira, Michael Robinson, Dave Cramer, Steve Singer, Jeff Davis, Arul Shaji
Autonomous Transactions Open Simon Riggs, Kumar Rajeev Rastogi David Steele, Jim Nasby, Josh Berkus, Nat Wyatt, Masao Fujii, Euler Taveira, Andrew Dunstan, Masahiko Sawada, Michael Robinson, Amit Kapila, Chris Malek, Arul Shaji, Fabrízio de Royes Mello
Audit Logging Open David Steele Josh Berkus, Nat Wyatt, Masao Fujii, Christophe Pettus, Fabio Telles, Satoshi Nagayasu, Yurie Enomoto, Mack McCauley, Masahiko Sawada, Michael Robinson, Oleg Bartunov, Tyler Poland
pg_shard v2.0 and Lessons Learned from NoSQL Databases Open Ozgun Erdogan, Marco Slot Josh Berkus, Jim Nasby, Chris Winters, Mehmet Emin KARAKAŞ, Fabrízio de Royes Mello, Satoshi Nagayasu, Shigeru HANADA, Michael Robinson, Oleg Bartunov, Chris Malek


Direction of json and jsonb Open Andrew Dunstan Josh Berkus, Christophe Pettus, Masahiko Sawada, Michael Robinson, Rod Taylor, Alexander Korotkov, Oleg Bartunov, Chris Malek, Jeff Davis
Native Sparse Set Type Open Andrew Dunstan Josh Berkus, Michael Robinson
Testing Framework Adequacy Open Andrew Dunstan Josh Berkus, Christophe Pettus, Mack McCauley, Michael Robinson, Steve Singer, Oleg Bartunov, Ed Espino

pgAdmin4

Meeting Notes

  • To be filled in

Attendees

  • To be filled in

Infrastructure Q&A

Meeting Notes

  • To be filled in

Attendees

  • To be filled in

WWW Team Meeting

Meeting Notes

  • To be filled in

Attendees

  • To be filled in

Advocacy Team Meeting

Meeting Notes

  • To be filled in

Attendees

  • To be filled in

Vertical Scalability w.r.t Writes

Purpose of this discussion:

  • Discuss the priority and importance of the various performance and scalability problems
  • Solutions and ideas for the most important problem(s)
  • Is pgbench sufficient to capture the various kinds of real-world workloads?

Some of the important performance problems I have in mind are:

  • Avoid/Reduce Vacuum Freeze
  • Bloat
    • Heap
    • Index
  • Instability in TPS due to checkpointer flush
  • Tuple size
    • Heap Tuple Header
    • Alignment in index can lead to bigger index size for simple datatypes
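As a rough illustration of the index alignment point above, the sketch below estimates the on-page size of a single-column index entry. It assumes an 8-byte index tuple header, a 4-byte line pointer and 8-byte maximum alignment, which is typical of 64-bit builds. The constants and function names are hypothetical, and the estimate ignores page headers and special space.

```python
# Assumed constants for a typical 64-bit build (not read from any server).
INDEX_TUPLE_HEADER = 8   # assumed size of the index tuple header, in bytes
LINE_POINTER = 4         # assumed size of the per-item line pointer, in bytes
MAXIMUM_ALIGNOF = 8      # assumed maximum alignment, in bytes


def maxalign(size: int, align: int = MAXIMUM_ALIGNOF) -> int:
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align


def index_entry_size(key_bytes: int) -> int:
    """Estimated bytes used on a page by one single-column index entry."""
    return maxalign(INDEX_TUPLE_HEADER + key_bytes) + LINE_POINTER


if __name__ == "__main__":
    for name, width in (("int2", 2), ("int4", 4), ("int8", 8)):
        print(name, index_entry_size(width))
```

Under these assumptions a 2-byte, a 4-byte and an 8-byte key all occupy the same space per entry, because the padding absorbs the difference.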

Scalability bottlenecks

  • Locks
    • ProcArrayLock
    • WALWriteLock
    • CLOGControlLock
    • Lock for Relation Extension
  • Writes, especially when data doesn't fit in shared buffers.
    • Write Performance
    • Double Buffering
    • In-memory table/tablespaces

Meeting Notes

  • To be filled in

Attendees

  • To be filled in

Security Team Meeting

Meeting Notes

  • This will be, ahem, secure, so nothing will be written here

Partitioning

A proposal to enhance partitioning support in PostgreSQL was posted to -hackers last year and resulted in discussion of some implementation ideas. Late in the discussion, a crude WIP patch was also posted with some experimental syntax, catalog changes, an idea for an internal representation, and a proof-of-concept INSERT tuple-routing function demonstrating the practicality of that representation. It would be nice to carry the discussion forward while implementing a patch to be proposed and reviewed early in the 9.6 development cycle. Points to discuss could be:

  • New features and the old inheritance-based implementation
  • Planner considerations for the new partitioned tables
  • Need for a new Append-like executor node for partitioned tables
  • DML/DDL restrictions on partitioned tables and partitions
  • Basically any considerations for partitioned tables and partitions that are explicitly defined as such at a layer above the storage layer
  • Other points that come up

Meeting Notes

Note: no meeting notes were actually taken during the session, so this is a summary of the kinds of things people managed to talk about, as best as I (Amit) can remember.

  • The session started with a summary of the issues users face with the current partitioning. That summary was not the focus of the session; what followed was. It is well understood that partitioning is not user-friendly and that we badly need a system in which the various aspects of partitioning are "managed" by the system. In other words, implement partitioning DDL, with whatever system catalogs are necessary.
  • People wanted to know whether the new partitioning is *in place of* or *in addition to* the existing options. The answer would be the latter.
  • Inheritance was brought up, and a discussion ensued about whether we should keep using it. Some people argued that inheritance is awfully slow for a non-trivial number of partitions (starting in the planner, where a large inheritance set consumes a great deal of memory even if a query needs to access only one partition out of hundreds). Others argued in favor of keeping the first patch simple and not trying to solve too many problems at once. It would be great if we solved the "managed partitions" aspect first by providing syntax, catalogs, etc.
  • It was brought up that the existing system allows a lot of flexibility in terms of schema divergence of partitions from their parent(s), or more to the point, *does not prevent* it. That has kept us from treating an inheritance tree used as a partitioned table as actually being one. That is, we cannot really implement dynamic partition pruning, pairwise joins, etc. "Managed partitions" with special-purpose DDL and catalogs should lead us towards more fruitful partitioning optimizations.
  • Somebody brought up how the current scheme allows multi-level partitioning (composite partitioning). The same could be adopted in the new scheme by allowing partitions themselves to have a partition key, thus allowing an arbitrary number of levels in the partitioning hierarchy. The new proposal even considers composite keys using single-level partitioning. Of course, it remains to be seen what the patch has to offer, but this is an important point to consider given that it would affect the syntax and catalog structure.
  • A slightly tangential issue was mentioned: how to support partitioning on a variety of types (the PostgreSQL type system is *extensible*!). We could solve it by incorporating operator class (per key column) information into the partition key catalog. There was not really time to discuss many implementation matters like this one.
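To make the tuple-routing and operator-class points above more concrete, here is a minimal sketch of routing a key to a range partition by binary search over sorted upper bounds. It is a toy model, not the posted patch: the class name, the exclusive-upper-bound convention and the `compare` callback (a hypothetical stand-in for a per-column operator class) are all assumptions made for illustration.

```python
from typing import Any, Callable, Sequence

# Hypothetical stand-in for an operator class: a three-way comparison that
# returns a negative number, zero, or a positive number.
Compare = Callable[[Any, Any], int]


def default_compare(a: Any, b: Any) -> int:
    return (a > b) - (a < b)


class RangePartitionedTable:
    """Toy model: partition i accepts keys below upper_bounds[i] that are
    not below upper_bounds[i - 1]; the first partition has no lower bound."""

    def __init__(self, upper_bounds: Sequence[Any], names: Sequence[str],
                 compare: Compare = default_compare) -> None:
        if len(upper_bounds) != len(names):
            raise ValueError("expected one exclusive upper bound per partition")
        self.upper_bounds = list(upper_bounds)  # must be sorted ascending
        self.names = list(names)
        self.compare = compare

    def route(self, key: Any) -> str:
        """Binary-search for the first partition whose upper bound exceeds key."""
        lo, hi = 0, len(self.upper_bounds)
        while lo < hi:
            mid = (lo + hi) // 2
            if self.compare(key, self.upper_bounds[mid]) < 0:
                hi = mid
            else:
                lo = mid + 1
        if lo == len(self.names):
            raise LookupError(f"no partition accepts key {key!r}")
        return self.names[lo]


if __name__ == "__main__":
    table = RangePartitionedTable([2015, 2016, 2017],
                                  ["p2014", "p2015", "p2016"])
    print(table.route(2015))
```

Because the lookup touches only the bounds, its cost grows with the logarithm of the number of partitions. Passing a different `compare` function is the toy equivalent of recording an operator class per key column.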

Attendees

  • To be filled in

Utilization of modern semiconductors

Recent advances in semiconductor devices make us reconsider the assumptions we rely on, and making use of their power is key to innovation. We'd like to discuss the future direction in the short and the middle/long term.

  • GPU, FPGA - have an advantage for simple but massive amounts of calculation. They allow the DBMS to act as a data processing platform that works close to the data.
  • SSD, NVRAM - likely game changers for the storage layer on both read and write workloads. The DBMS also has to pay attention to the characteristics of these devices.


Meeting Notes

  • Differences in APIs between PMEM and SSD/disk: unlike traditional devices, PMEM assumes a memory-mapped, cacheline-based interface. PostgreSQL's storage manager is a good candidate for investigation to support both types of device: the traditional VFS-based one and the memory-mapped one. The PMEM library will offer a clear interface to applications.
  • If all system RAM were persistent, we would not need a transaction log; however, the technology is still in transition. In the near future, we would use PMEM for a particular portion, such as the transaction log buffer.
  • Does a GPU help with data encryption/decryption? Yes, it may be possible; probably special code would have to implement the type input/output handlers. However, the calculation needs a certain level of complexity for GPU acceleration to make sense.
  • In the PG-Strom case, the GPU works well for workloads that reduce the number of rows: scan, join (with an equality condition) or aggregation.
  • A PL function is predefined, so it may work fine even if the HDL build and device rewrite take 30 minutes.
  • Xeon Phi needs different optimization from a GPU, because GPU logic usually relies on inter-thread synchronization, which is emulated in software on Xeon Phi. "Write once, run anywhere" is too challenging.
  • GPU and AVX (SIMD operations) work ideally on column-oriented data structures, so Alvaro's work is very desirable from the standpoint of HW acceleration.
  • Action items
    • investigate a storage manager API that fits both VFS and PMEM
    • columnar storage as better infrastructure for HW acceleration

Attendees

  • To be filled in

Future of PostgreSQL shared-nothing cluster

In 2015 the PostgreSQL Professional company started a project to migrate PostgreSQL-XL to the PostgreSQL 9.4 codebase and to increase its stability and usability. At this unconference session we'd like to discuss current progress and further development. More generally, we'd like to find ways to reduce the differences between PostgreSQL and its shared-nothing cluster fork so that the maintenance burden becomes manageable.

Meeting Notes

  • To be filled in

Attendees

  • To be filled in

PostgreSQL and SMR Drives

Meeting Notes

  • Indexes need to be addressed (discuss with indexing experts)
  • Freeze without overwrites: stalled waiting for page format change to include transaction epoch
  • Greg Smith: if we try to rely on caches or translation layers, there will be a "knee" in the performance graph
  • Page format change: Tom: heterogeneous page formats may be necessary
  • Heikki: Metapage in heap with page versions, info in pg_control
  • Simon: separating volatile data is like columnar storage
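The "transaction epoch" mentioned in the notes can be pictured as a wraparound counter kept alongside the 32-bit transaction ID. The sketch below only illustrates that arithmetic under this assumption. It is not a proposed page format, and the function names are hypothetical.

```python
XID_BITS = 32
XID_MASK = (1 << XID_BITS) - 1


def full_xid(epoch: int, xid: int) -> int:
    """Combine a wraparound epoch and a 32-bit xid into one 64-bit value."""
    if not 0 <= xid <= XID_MASK:
        raise ValueError("xid must fit in 32 bits")
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return (epoch << XID_BITS) | xid


def split_full_xid(value: int) -> tuple:
    """Return (epoch, xid) for a value produced by full_xid."""
    return value >> XID_BITS, value & XID_MASK


if __name__ == "__main__":
    old = full_xid(0, 4_000_000_000)
    new = full_xid(1, 3)
    # With the epoch included, a plain integer comparison orders the two
    # values correctly even though the 32-bit part has wrapped around.
    print(old < new)
```

With the epoch available, an old transaction ID stays comparable after the 32-bit counter wraps. That is presumably why the notes tie "freeze without overwrites" to a page format that includes it.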

Attendees

  • Greg Smith
  • Tom Lane
  • Josh Berkus
  • Simon Riggs
  • Stephen Frost
  • Heikki Linnakangas
  • Amit Kapila
  • Dave Page
  • Tomas Vondra
  • Peter Geoghegan
  • Joe Conway
  • Dave Cramer
  • Magnus Hagander
  • Stefan Kaltenbrunner

Native Columnar Storage

See Alvaro's email to Hackers.

Meeting Notes

  • To be filled in

Attendees

  • To be filled in

Audit Logging

Audit logging is an important part of an RDBMS for many users and applications. Discuss how best to incorporate audit logging into PostgreSQL and what must be included at a minimum to make the feature viable.

Meeting Notes

  • To be filled in

Attendees

  • To be filled in

Direction of json and jsonb

What are the future needs of the JSON types? Recent suggestions have included an indexable "exists" operator, the JSON pointer and JSON patch standards, recursive merge, intersection, and being able to assign to a subdocument (json#>path as an lvalue). What are people using these types for, and what are the major gaps in functionality?

Meeting Notes

  • Tom: need to work on selectivity estimators for the indexable operators
  • Oleg: update syntax wanted
    • UPDATE table SET jsoncol['a'][1]['b'] = ...jsonvalue...
    • similar syntax for fetching a value
    • potentially, most container types should be able to offer array-like fetch and set syntax
  • Oleg: jsquery-like query language
    • there will be a talk about that tomorrow
    • possible patch in 9.7 timeframe
  • Oleg: validation of JSON values against a schema
    • is there any accepted standard for JSON schemas?
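The subscript-style update discussed above amounts to assigning a value at a path inside a document. As a language-neutral sketch of those semantics, the hypothetical helper below works on an already-decoded JSON value and returns a modified copy. It is not PostgreSQL code, and the choice to require that intermediate steps already exist is an assumption, not something settled in the session.

```python
import copy
import json
from typing import Any, Sequence, Union

PathStep = Union[str, int]


def set_path(doc: Any, path: Sequence[PathStep], value: Any) -> Any:
    """Return a copy of doc with the element at path set to value.

    Intermediate steps must already exist. The final step may add a new key
    to an object, but must be an existing index for an array.
    """
    if not path:
        return copy.deepcopy(value)
    result = copy.deepcopy(doc)
    target = result
    for step in path[:-1]:
        target = target[step]  # raises KeyError/IndexError if missing
    target[path[-1]] = copy.deepcopy(value)
    return result


if __name__ == "__main__":
    doc = json.loads('{"a": [{"b": 1}, {"b": 2}]}')
    # Rough analogue of: SET jsoncol['a'][1]['b'] = 42
    print(json.dumps(set_path(doc, ["a", 1, "b"], 42)))
```

Returning a copy rather than mutating in place mirrors the fact that an UPDATE produces a new column value. Whether missing intermediate keys should raise an error or be created is one of the design questions such a syntax would have to settle.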

Attendees

  • Tom Lane

Native Sparse Set Type

Sets over small domains can be reasonably modeled by bitmaps, but sets over very large domains cannot. Is there a need for such sets? How would we implement them? Arrays? Balanced trees? Something else? What types of sets would we allow? Anything with B-tree operators, or something more restricted? What would the notation look like?
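To give the "arrays" option some shape, here is a minimal sketch of a sparse set stored as a sorted array of integers. It only illustrates one of the candidates listed above. The class and method names are hypothetical, and nothing here reflects an actual proposal. For comparison, a plain bitmap over a 64-bit domain would need 2^64 bits regardless of how few members the set has.

```python
import bisect
from typing import Iterable, Iterator, List


class SortedArraySet:
    """Sparse set of integers kept as a sorted list without duplicates."""

    def __init__(self, items: Iterable[int] = ()) -> None:
        self._items: List[int] = sorted(set(items))

    def __contains__(self, x: int) -> bool:
        i = bisect.bisect_left(self._items, x)
        return i < len(self._items) and self._items[i] == x

    def __len__(self) -> int:
        return len(self._items)

    def __iter__(self) -> Iterator[int]:
        return iter(self._items)

    def add(self, x: int) -> None:
        """Insert x at its sorted position unless it is already present."""
        i = bisect.bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:
            self._items.insert(i, x)

    def intersection(self, other: "SortedArraySet") -> "SortedArraySet":
        """Merge-walk both sorted arrays and keep the common members."""
        a, b = self._items, other._items
        out: List[int] = []
        i = j = 0
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                out.append(a[i])
                i += 1
                j += 1
            elif a[i] < b[j]:
                i += 1
            else:
                j += 1
        result = SortedArraySet()
        result._items = out
        return result


if __name__ == "__main__":
    s = SortedArraySet([10**12, 3, 3, 7])
    print(list(s.intersection(SortedArraySet([7, 10**12, 99]))))
```

Membership tests are logarithmic and set operations are linear merges, but each insert may shift the array. That cost is the usual argument for a balanced tree when the set is updated often.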

Meeting Notes

  • To be filled in

Attendees

  • To be filled in

Testing Framework Adequacy

The buildfarm is more than 10 years old, and the testing needs of Postgres and its software ecosystem have changed radically in that time. What do we now need in the way of testing? How do we test complex arrangements such as the various sorts of replication in an automated way? Do we need a new framework, or can the existing framework be adapted to our needs?

Meeting Notes

  • To be filled in

Attendees

  • To be filled in


Unconference schedule by Time and Room

Time DMS1110 DMS1120 DMS1160
Tue 13:30-14:30 PostgreSQL and SMR Drives - the future of magnetic storage means very expensive random writes pgPool2 towards version 3.5
Tue 14:45-15:45 Autonomous Transactions Native Compilation + LLVM pgAdmin4
Tue 16:00-17:30 Bi Directional Replication & Logical Decoding (BDR) Advocacy Team Meeting
Wed 10:00-11:15 Security Team Meeting Slony Development Utilization of modern semiconductors - GPU, SSD, NVRAM, FPGA, PMEM...
Wed 11:30-13:00 Direction of json and jsonb Dockerizing Postgres Vertical Scalability w.r.t Writes
Lunch time PGCAC Board Meeting 2015 LUNCH!!
Wed 14:00-15:00 Testing Framework Adequacy Open Horizontal Scalability / Sharding in PostgreSQL - ground covered so far and remaining to be covered.
Wed 15:15-16:15 pg_shard v2.0 and Lessons Learned from NoSQL Databases Future of PostgreSQL shared-nothing cluster Native Columnar Storage
Wed 16:30-17:30 Foreign Data Wrapper enhancements Audit Logging Partitioning
17:30 Picture!