FOSDEM/PGDay 2017 Developer Meeting

A meeting of the interested PostgreSQL developers is being planned for Thursday 2nd February, 2017 at the Brussels Marriott Hotel, prior to FOSDEM/PGDay 2017. In order to keep the numbers manageable, this meeting is by invitation only. Unfortunately it is quite possible that we've overlooked important individuals during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).

Please note that the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 9.6 and 10 release cycles. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies.

This is a PostgreSQL Community event.

Meeting Goals

  • Review the progress of the 10.0 schedule, and formulate plans to address any issues
  • Address any proposed timing, policy, or procedure issues
  • Address any proposed Wicked problems

Time & Location

The meeting will be:

  • 9:00AM to 5:00PM
  • Brussels Marriott Hotel

Coffee, tea and snacks will be served starting at 8:45am. Lunch will be provided.

RSVPs

The following people have RSVPed to the meeting (in alphabetical order, by surname) and will be attending:

  • Oleg Bartunov
  • Jeff Davis
  • Andrew Dunstan
  • Stephen Frost
  • Etsuro Fujita
  • Magnus Hagander
  • Petr Jelinek
  • Alexander Korotkov
  • Noah Misch
  • Bruce Momjian
  • Simon Riggs
  • Dave Page
  • Masahiko Sawada
  • Tomas Vondra

The following people have sent their apologies:

  • Joe Conway
  • Dimitri Fontaine
  • Peter Geoghegan
  • Kyotaro Horiguchi
  • Shigeru Hanada
  • Amit Kapila
  • Tom Lane
  • Thomas Munro
  • Michael Paquier
  • Dean Rasheed
  • Craig Ringer
  • David Rowley
  • Teodor Sigaev
  • Heikki Linnakangas

Agenda Items

Please add agenda items here!

  • Sharding update
  • Setting up the Release Management Team for Postgres 10.0 (Simon)
  • Supporting management roles (aka: removing superuser checks) (Dave)
  • Adding DBA management roles (was Superowners) (Simon)
  • SQL/JSON in SQL-2016 Standard and our roadmap (Oleg)
  • Is it worth having loads of meetings if not everybody attends? (Simon)
  • Tools and services from pginfra (Magnus -- if others are interested, I don't have any specific entries myself)

Agenda

Time           Item                                                            Presenter
09:00 - 09:10  Welcome and introductions                                       Dave
09:10 - 09:20  10.0 Release Review                                             All
09:20 - 09:45  Setting up the Release Management Team for Postgres 10.0        Simon
09:45 - 10:00  Is it worth having loads of meetings if not everybody attends?  Simon
10:00 - 10:30  Momjian Half Hour                                               Bruce
10:30 - 11:00  Coffee break                                                    All
11:00 - 11:30  SQL/JSON in SQL-2016 Standard and our roadmap                   Oleg
11:30 - 12:15  Supporting management roles (aka: removing superuser checks)    Dave/Simon
12:15 - 12:45  Tools and services from pginfra                                 Magnus
12:45 - 13:45  Lunch                                                           All
13:45 - 14:15  Performance Farm                                                Tomas
14:15 - 15:00  Open CommitFest Item Review                                     All
15:00 - 15:30  Tea break                                                       All
15:30 - 17:00  Open CommitFest Item Review                                     All
16:45 - 17:00  Any other business                                              Dave
17:00          Finish

Minutes

Welcome
--------

Magnus: Be it noted that the Quadranteers were on time.

Present: Oleg Bartunov, Andrew Dunstan, Stephen Frost, Etsuro Fujita, Magnus Hagander, Petr Jelinek, Alexander Korotkov, 
         Noah Misch, Bruce Momjian, Dave Page, Masahiko Sawada, Tomas Vondra

Apologies: Simon Riggs (travel issues prevented attendance)

10.0 Release Review
-------------------

Dave: The point of this item is to quickly review the current status of the release and note any potential issues.
Magnus: The real question is that we're looking for a September release. Do we still think we can do that?
Noah: I don't see anything that would stop that.

All agree we're currently on track.

Setting up the Release Management Team for Postgres 10.0
--------------------------------------------------------

Noah: Do we want an RMT again, and if so, do we want it to behave any differently from last year?
Stephen: From my perspective it seemed like a good thing and was helpful. How did it seem from the inside?
Noah: It was a lot of not very interesting work.
Bruce: What sort of work?
Noah: Keeping track of items and chasing people. Some of it could be automated perhaps.
Magnus: There's value in personal chasing rather than automated.
Noah: We also did the scary patch tournament!
Bruce: The big value is that we ensure everything gets done.
Petr: It certainly helps that there are committers on the team, and they can, if needed, just revert a patch.
Stephen: It revolves around the open items list.
Noah: Certainly.
Magnus: Everyone is free to add, RMT removes.
Noah/Petr: People added to the open items list because they realised that's what the RMT are following
Dave: Is there anything that could be automated to ease the process?
Noah: Maybe a dashboard of what needs to be chased today?
Dave: Do we have the info needed for that on the CF app?
Noah: Not really, as we don't track the open items there specifically
Stephen: I have to email the list when I add an open item anyway, so it would be cool if I could have a tag and the CF 
         app could pick that up.
Petr: What about tracking open items as part of the CF?
Magnus: The workflow for open items really isn't the same as it is for a CF. I'm worried that merging these functions 
        together will make both processes less optimal.
Noah: Another task is trying to figure out what commit caused an open item, which is not always easy.
Dave: We could link to commits when closing items on the CF app, much like Redmine does
Dave: It seems like we're all in agreement that we want an RMT again.
Noah: Yes, I hear only good things.
Magnus: Who was on the last team?
Noah: Me, Robert Haas and Alvaro.
Magnus: We should not use the same people again to avoid burnout.
Noah: I'd be happy to do it - it's kind of calming.
Dave: It's a good thing to have one person roll over to the next year to build institutional knowledge/experience.
Stephen: I'd like to have a non-committer on the team.
Bruce: Should we have someone outside the Americas?
<discussion on where the center of the world is; Britain of course>.
Noah: Timing isn't a major issue - we don't need every RMT member in close timezones.
Stephen: Members of the RMT need to be very vocal and outspoken. They need to be a trusted voice and willing to deliver
         bad news
Dave: So we're agreed we need an RMT again, and Noah is willing to do it again or step down as needed.
Andrew: So the RMT is active from the end of the last commitfest until the release?
Noah: Yeah.
Dave: So if Noah is willing, I'd propose that he takes the lead on forming this year's team.
Noah: Ok.
Bruce: Alexander would be good.
Noah: Are you interested?
Alexander: Yes

TODO: Noah to form RMT.

Is it worth having loads of meetings if not everybody attends?
--------------------------------------------------------------

Dave: <describes past developer meetings>
Dave: I expect Tokyo to be an exception - once every few years
Bruce: Will there be a Tokyo conference next year?
Etsuro: We'll have an Asia conference, but maybe China or elsewhere.
Noah: The important thing is we have a meeting with a critical mass of developers
Stephen: Yes, Ottawa is good for that.
Magnus: Ottawa is good for admin/procedural
Andrew: Should we make Brussels more open, an unconference style?
Magnus: I think that works well at Ottawa because of the large conference as well. We could have an entire open meeting 
        on patch triage for example though.
Bruce: Who was in Tokyo? (About half.) That shows that geographical distribution may not be an issue.
Magnus: Lists people who went to Tokyo. There was only one person who was in Tokyo who hasn't been in Ottawa or Brussels
Noah: I don't really want to travel that much - I'm only here because I was in Europe anyway.
Bruce: Was Tokyo useful?
Dave: I think it was useful to meet with our Japanese colleagues who we rarely see, but I don't think it was a forum for
      making decisions.
Magnus: If we didn't have the Tokyo meeting, maybe we would have had a full agenda today.
Stephen: I think it's useful to have some number of developers talking through designs etc. at multiple conferences.
Dave: So really what you're saying is that we should have technical un-conferences
Bruce: Yeah, or maybe half and half.
Stephen: When I was thinking of coming here, I wasn't thinking so much about the technical content, but perhaps I should.
Magnus: Having the CF review is a good thing, and it doesn't need to be closed.
Bruce: We could have 2 rooms, one for patch triage and one for unconference.
Andrew: For serious triage, you need Toms and Alvaros and so on.
Dave: So, keep Ottawa as it is, and make other dev meetings more unconference/triage events.
Stephen: Right - but we need to ensure senior devs attend.
Bruce: I don't want to preclude procedural discussions at other events though.
Dave: We can always take an unconference session if needed.
Magnus: Or start the unconference an hour later.

TODO: Dave to investigate options for Brussels/Asia next year.

Momjian Half Hour
-----------------

Dave: I'm failing as a moderator as this item now needs to be the Momjian 18 minutes.
Bruce: I'm not sure I even have 5 minutes.
Bruce: I want to recap on sharding that we discussed in Tokyo. Simon said it looked like we had a workable project - and
       with the various bits of work on partitioning and pushdown etc. it looks like we'll have something for 10.0, but
       not everything.
Petr: Declarative partitioning doesn't work with FDWs yet.
Bruce: Yes, that limits what can be done in 10.0
Noah: What are the projects ongoing that are part of this?
Bruce: I put a blog out after Tokyo that links to the wiki where I'm tracking the various parts of the project: 
       https://wiki.postgresql.org/wiki/Built-in_Sharding
Noah: Are there patches in the CF right now?
Bruce: Yes, unfortunately they're just sitting there.
Stephen: Which ones?
Bruce: Parallel foreign data push-down
Etsuro: Yes, that's been proposed but no one has reviewed it yet. There's also a transaction manager proposal that's 
        received no feedback.
Bruce: Whilst there are things waiting, overall I'm very happy with how fast things are going.
Bruce: On security...
Bruce: There's more to it than SSL certs etc - auditing, policy and more - and we don't do enough. We have a mindset in 
       the community that "if it can't be 100% bulletproof, we won't do it". 
Bruce: We need to do much more, and accept that it won't be perfect.
Dave: I don't think we're holding back on things like SSL cert docs for that reason - we can just improve them. On the 
      other hand, we also know that RLS isn't perfect and has some known covert channels - but we recognise there's 
      little we can do about that.
Masahiko: We've been doing work on pg_audit.
Stephen: We need to figure out if we can put it in core.
Tomas: It's a similar problem to pg_logical, in that it started out as an extension. We will soon have 3 forks of 
       pg_audit - which is not good.
Petr: We'll always want more features; we have to understand that the in-core solution might only be 90%.
Tomas: (to Stephen) You should talk to Abhijit if you're interested in making pg_audit in-core again.
Bruce: I think the two areas we're lacking are documentation and auditing.
Stephen: I agree that docs need improvement, but I don't think anyone is saying we shouldn't just do that. I think we 
         need more than auditing though - some kind of cell based RLS.

TODO: Bruce to improve docs on SSL certificate setups
TODO: Bruce to complete TODOs from last year's meeting.

SQL/JSON in SQL-2016 Standard and our roadmap
---------------------------------------------

Oleg: In December ISO released a new SQL/JSON standard. The reason I want to talk is that we need to decide what to do.
      Postgres was the first DB with native JSON, and we have many users. Before the standard, we designed what we 
      wanted, but now, should we move to the SQL/JSON standard? SQL/JSON is not compatible with existing JSON - it looks
      like JSONB. It specifies the data model, and originates from Oracle.
Magnus: MySQL were very proud that they had the first compliant database - because they had access to the pre-release
        Oracle documentation!
Andrew: What are the differences between standard and our implementation?
Alexander: There are 9 functions like json_path for example, which uses dot syntax instead of slash syntax.
Oleg: Our JSON is just a string - we preserve everything. Our JSONB is binary, which doesn't preserve anything. The new
      standard is un-ordered.
Alexander: There's also a naming convention issue. All our JSONB functions start with jsonb_.
Andrew: So it's really just a set of functions. There's no datatype as such.
Stephen: It would be really nice to have support for the standard in core. 
Petr: The question is, should we have a new datatype for json_path?
Andrew: Just use a string!
Stephen: Do we have anything already that occupies this space?
Oleg: No.
Alexander: If we take a string, we need a way to cache.
Stephen: If they take a string, then we can overload alternatives.
Magnus: JSONB sounds much better. The question is whether we can map the standard to it.
Oleg: No problem.
Andrew: Can you do it in a couple of weeks?
Oleg: The first problem is development. The second is that to review you need a copy of the standard, which we cannot
      share.
Andrew: I'll get a copy, and can review.
Oleg: We'll have something for Postgres 10.0. Teodor is very interested - three weeks should not be a problem.
Alexander: The standard is very fixed, but should we allow the user to use catalog functions or operators etc.?
Andrew: No, follow the standard for now.
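
The dot-syntax paths and the caching concern raised above can be illustrated with a minimal Python sketch. It is not the PostgresPro implementation and covers only a tiny subset of the standard's path language (member access and array subscripts); the helper names parse_path and evaluate are hypothetical. The path arrives as a plain string, as Andrew suggested, and the parsed form is cached so that repeated use of the same path does not re-parse it, which is the concern Alexander raised.

```python
import json
import re
from functools import lru_cache

# One path step: ".name" selects an object member, "[n]" selects an array element.
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


@lru_cache(maxsize=128)
def parse_path(path):
    """Parse a path such as '$.a.b[0]' into a tuple of steps (hypothetical helper)."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    steps = []
    pos = 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"cannot parse path at offset {pos}: {path!r}")
        name, index = match.groups()
        steps.append(name if name is not None else int(index))
        pos = match.end()
    return tuple(steps)


def evaluate(document, path):
    """Walk an already-parsed JSON value along the steps of the path."""
    value = document
    for step in parse_path(path):
        value = value[step]
    return value


doc = json.loads('{"a": {"b": [10, 20, 30]}}')
print(evaluate(doc, "$.a.b[1]"))  # prints 20
```

Evaluating the same path string a second time is served from the cache rather than parsed again, which can be checked with parse_path.cache_info().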

TODO: PostgresPro team to implement functions, Andrew to review.

Supporting management roles (aka: removing superuser checks)
------------------------------------------------------------

<notes taken by Noah>

Some users reject pgadmin & other tools due to superuser access.
dpage wants grantable permission to read log_directory.
PEM/pgadmin uses pg_ls_dir/pg_read_file to read logs
sfrost: could offer pg_read_log instead?
dpage: system also wants to read postgresql.conf
sfrost: risk if everything routes through postgres protocol
dpage: PEM agent runs on each server, but some users don't run it.
magnus: do you need postgresql.conf, or is pg_settings enough
  dpage: requires superuser access for file-path settings
  dpage: want verbatim file to catch unapplied changes
  frost/magnus: there's another feature for that
frost: don't want PG to expose this much
nmisch: once you have to run more than "create role", might as well create a
  whole database with everything you need
dpage: doesn't want to have to grant pg_catalog objects to some role
nmisch: I think pg_backup is definable, but hackers did not
dpage PEM needs: pg_tablespace, read lots of GUCs, pg_create_restore_point,
  pg_start_backup, pg_stat_activity, pg_ls_dir/pg_read_file OR higher-level replacement
nmisch: would they mind if you create your own database, vs. objects in their database?
dpage: some would mind less, some would still dislike
frost: allowing pg_ls_dir grant invites people to grant too much
magnus: people won't migrate from pg_ls_dir to another interface
frost: let's give people interfaces to the things they need
dpage: had ten years of monitoring running as superuser. need to improve that somehow.
frost: pg_read_file,pg_write_file are sufficiently dangerous
magnus/frost: read file basically gives you superuser via ssl private key,
  kerberos keytab, etc
frost: still helpful if you consider this as defense against bugs in monitoring
  code, not hostile monitoring code
frost: we could solve the read-all-GUCs case for 10.0.
magnus could add pg_list_logs, pg_ls_xlog
frost: have what-is-oldest-xlog function in backend today?  magnus: doubts it
dpage: monitoring wants to track volume of xlog files, not actual file names
frost: keep it correct for nondefault segment size
nmisch: treat xlog as stream of bytes; breaking into files is implementation
  detail
frost: still want to reduce number of superuser checks.  extension whitelist?
magnus: risk of providing half-way answer is that people never get to full-way
magnus: these individual new functions/roles are low-risk
frost: doesn't want the function reaching to syslog or something
dpage: will this ever lead to default roles?
nmisch: will never have one pg_monitoring role for PEM
magnus: could have N roles that together suffice for PEM.
frost: don't want 100 default roles.
dpage: SHOW has no grant
nmisch: could add a builtin role, similar to pg_signal_backend
frost: riggs proposing patch to give db owner more privileges
nmisch: had been debated in the past, some people strongly in favor of status
  quo.  petr: people coming from other databases don't like it
frost: wants it more like an owner of all objects. not like local superuser
  in particular, allow revoking rights from the owner
magnus, nmisch: won't allow untrusted function creation
frost: gets requests for readonly users (pg_dump role)
magnus: compare db_datareader in sql server
petr: risk of changing database owner rights due to effects on existing DBs
petr/dunstan: riggs wants dbowner to operate as object owner
vondra: what is riggs goal?  dunstan: arose from difficulty of bucardo in RDS
frost: break up owner rights, create default role for each.  but want it to be
  granted at the DB level.  considering db-level catalog.
magnus: cluster-level enough for many people
dpage: might write pg_ls_log_dir, still wants pg_ls_wal_dir
nmisch: I think the key thing to resolve is the list of default roles
  pgadmin/PEM would need, then acquire consensus that those roles would be
  sufficiently well-defined to put in core.

conclusions:
- welcome specific proposals for new lower-priv interfaces, grantability, default roles
- read-all and write-all proposals welcome
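
The point about tracking xlog volume as a stream of bytes, independent of file names and segment size, can be sketched as follows. This is an illustration only, not an interface agreed at the meeting. It relies on the textual LSN format, two hexadecimal numbers separated by a slash holding the high and low 32 bits of a byte position; the function names are hypothetical.

```python
def lsn_to_bytes(lsn):
    """Convert a textual LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the high 32 bits, the part after is the low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def wal_bytes_between(start_lsn, end_lsn):
    """Volume of WAL generated between two LSNs, in bytes (hypothetical helper)."""
    return lsn_to_bytes(end_lsn) - lsn_to_bytes(start_lsn)


# 0x3000000 - 0x1000000 bytes, regardless of how the WAL is split into files.
print(wal_bytes_between("0/1000000", "0/3000000"))  # prints 33554432
```

Because the calculation never looks at file names, it stays correct for a nondefault segment size.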

Tools and services from pginfra
-------------------------------

Magnus: What stuff do people want? E.g. tracking of open items etc. Let's think about it over lunch and talk afterwards.

<lunch>

Magnus: So are there any comments on services that need improvements or services we lack?
Petr: I asked about this last year, but the CF app doesn't email me if someone changes something on an item.
Magnus: Huh? If you create comments it will email.
Petr: I also don't hear when the status changes.
Magnus: I definitely get those. We should debug why you don't get them later.
Noah: The -announce list can look pretty polluted in the archives when people reply to items by CCing different lists.
Stephen: <pointing to Magnus> HAH!!
Magnus: We need to special case -announce. We don't allow CC's on -announce messages, but the archives work on msgid.
Dave: While we're talking about mailing lists, let's remind everyone that we're changing mailing list software which
      will have knock-on effects:
Magnus: No subject mangling, no footers, new namespace. Needed so we can support DKIM.
Noah: So what about archives for closed lists?
Magnus: A separate instance of the archives, with community auth based access.
Andrew: Will there be a web interface?
Magnus: Yes
Dave: It's vastly more user friendly than mj2, and highly simplified.
Stephen: Oh, and we will no longer break signed emails.
Noah: I've noticed that the mbox archives mis-handle git attachments. 
Magnus: Yeah, that's mj2.
Dave: Shall we talk about AWS?
Magnus: We have AWS credits that people can use for dev/test. If anyone wants to be guinea pig for the access processes
        please let us know (sysadmins@postgresql.org).
    
Performance Farm
----------------

Tomas: If you remember the meeting last year, we decided to build a performance farm, like the build farm. We have the
       client basically working and running pgbench and TPC-H/TPC-C. What we don't have now is the server side.
Andrew: I'll be working on it as well. It may be worth roping in Christophe.
Dave: We also have the skeleton website (Django) which ties into community auth etc.
Tomas: The initial version will be designed to run tests following changes. We won't allow execution of arbitrary code.
       We want to get more people involved - maybe we should set up a mailing list.
Stephen: How much data will we get? The buildfarm got kinda big.
Andrew: Results should be much smaller - there are no build logs etc.
Tomas: Output from sar is collected, and might be quite big.
Stephen: We need to benchmark the storage requirements so we know how best to host the server. We also need to define
         a retention policy.
Dave: We can use RRD-like storage.
Tomas: We can de-dupe consecutive duplicate results.
Stephen: We should think about partitioning.
Tomas: I'm OK with that, but it is likely not compatible with retention times. We need to look at this in more detail 
       once we know what storage is used by a working system.
Oleg: What about upgraded machines?
Tomas: The client collects a number of stats - we can track them.
Noah: I think the main thing is having lots of clients, so we'll see real regressions on multiple machines.
Dave: The biggest problem is that we're all so busy
Noah: Once the basics are done, that's the hard part, then we can move on to refining things over time.
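
The de-duplication of consecutive duplicate results mentioned above could look roughly like this minimal Python sketch. It is not the performance farm's actual code: the (timestamp, value) sample layout and the numbers are invented for illustration, and dedupe_consecutive is a hypothetical name.

```python
from itertools import groupby


def dedupe_consecutive(samples):
    """Collapse runs of identical consecutive values, keeping the first sample of each run."""
    # groupby only groups adjacent items, so a value that reappears later starts a new run.
    return [next(run) for _, run in groupby(samples, key=lambda sample: sample[1])]


# Illustrative (timestamp, value) pairs; not real measurements.
samples = [(1, 100), (2, 100), (3, 100), (4, 105), (5, 105), (6, 100)]
print(dedupe_consecutive(samples))  # prints [(1, 100), (4, 105), (6, 100)]
```

Only the first sample of each unchanged run is stored, so a value that returns later is still recorded as a new change.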

TODO: Dave/Tomas/Andrew: Schema design
TODO: Dave/Andrew: Machine registration/admin etc.
TODO: Dave: Mailing list

Any Other Business
------------------

None.