PGCon 2023 Developer Meeting
A meeting of interested PostgreSQL developers is being planned for Tuesday, 30 May 2023 at the University of Ottawa, prior to PGCon 2023. To keep the numbers manageable, this meeting is by invitation only. Any questions regarding invitations to this event should be directed to the team responsible for drawing up the invitation list:
- Andres Freund
- Stephen Frost
- Dave Page
An Unconference will be held on Friday for in-depth discussion of technical topics.
This is a PostgreSQL Community event.
Meeting Goals
- Define the schedule for the upcoming releases
- Address any proposed timing, policy, or procedure issues
- Receive updates from project sub-teams on their activities and discuss any resulting issues or concerns.
- Address any proposed Wicked problems
Time & Location
The meeting will (probably) be:
- 9:00 AM to 12:00 PM
- DMS 3105 - Desmarais Hall, 55 Laurier Avenue East
- University of Ottawa.
Lunch will be served during the meeting.
COVID-19
The University of Ottawa's COVID-19 guidance can be found at https://www.uottawa.ca/en/covid-19. Wearing masks at the Developer Meeting will be optional; however, we ask that people not attend if they have COVID symptoms or have tested positive.
RSVPs
The following people have RSVPed to the meeting (in alphabetical order, by surname). Note that we can accommodate a maximum of 30!
- Nathan Bossart
- Joe Conway
- Jeff Davis
- Mark Dilger
- Peter Eisentraut
- Andres Freund
- Stephen Frost
- Etsuro Fujita
- Peter Geoghegan
- Magnus Hagander
- Amit Kapila
- Jonathan Katz
- Alexander Kukushkin
- Tom Lane
- Heikki Linnakangas
- Noah Misch
- Thomas Munro
- Dave Page
- Michael Paquier
- Melanie Plageman
- Masahiko Sawada
- Peter Smith
- Tomas Vondra
The following people will not be in Ottawa, and do not plan to attend:
- Masao Fujii
- Daniel Gustafsson
- Álvaro Herrera
- Tatsuo Ishii
- Alexander Korotkov
- Amit Langote
- Dean Rasheed
- David Rowley
Photos
Please contact Dave Page if you'd like copies of the full resolution originals.
Agenda Items
Please add suggestions for agenda items here (with your name).
- 16.0 release and commitfest schedule (Dave)
- Renaming "master" branch to "main"? (Michael)
- A brief PG15 RMT postmortem and what can we improve? (Jonathan)
- What are the big challenges for our users? What are the big challenges for us to solve? (Jonathan)
- Cloud operators have data of value to this community, e.g. frequency of errors that should never happen. What sort of data sharing regime might provide a good balance between value to the community and comfort for cloud operators? (Noah)
- High level thoughts and feedback on moving toward ICU as the preferred collation provider (Jeff)
Agenda
Time | Item | Presenter |
---|---|---|
09:00 - 09:10 | Welcome and introductions | Dave Page |
09:10 - 09:20 | Release and commitfest schedules | Dave Page |
09:20 - 09:35 | Renaming "master" branch to "main" | Michael Paquier |
09:35 - 10:00 | A brief PG15 RMT postmortem and what can we improve? | Jonathan Katz |
10:00 - 10:30 | What are the big challenges for our users? What are the big challenges for us to solve? | Jonathan Katz |
10:30 - 11:00 | Coffee break | All |
11:00 - 11:20 | High level thoughts and feedback on moving toward ICU as the preferred collation provider (see StateOfICU) | Jeff Davis |
11:20 - 11:50 | Cloud operators have data of value to this community, e.g. frequency of errors that should never happen. What sort of data sharing regime might provide a good balance between value to the community and comfort for cloud operators? | Noah Misch |
11:50 - 12:00 | Any other business | Dave Page |
12:00 | Lunch | |
Note: This timetable is a rough guide only. Items will start as soon as the previous discussion is complete (breaks, however, will not move materially). Any remaining time before lunch may be used for Commitfest item triage or other activities.
Minutes
Stephen Frost recording
Welcome and Introductions
Dave Page - General opening Introductions
Release and commitfest schedules
Dave P- Any proposed dates?
Katz- We proposed a specific feature freeze date and no one objected
Dave P- April 8?
Katz- Yes
Peter E- Can always change it next year if needed anyway
Katz- Any objections?
Tom- Official rather than unofficial end?
Katz- Do we want to change anything?
Peter G- Anything wonky could be addressed, seems to be pretty well worked out
Katz- Can we try to shift more work to earlier? Time and human nature shows that we end up with things at the end but getting things in earlier
Andres- Better this year than last year, at least. Not sure if there's specific reasons but generally better.
Munro- Do you mean buildfarm breakage?
Andres- Buildfarm breakage but also people testing
Noah- Reduce log jam by reducing uncertainty about when it ends, maybe other approaches are there too
Katz- Having CI helps because it provides feedback more quickly
Dave P- Haven't come up with schedule for next cycle yet
Katz- Usual late Q3/Q4?, everyone is planning around that
Dave P- Pick a date?
Katz- Typically target one of the last two weeks in September
Peter E- Usually end up in October but would like it to be earlier
Noah- Which betas when to get there?
Peter E- We don't usually plan these really and haven't ever really set up a date
Joe- Seems like doing the same as we've done in the past, right?
Dave P- Topic is are we going to change what we've done?
Joe- July first CF for 17?
Andres- Impression was that lots of review was happening but was a different set of people vs. people working on July CF
Joe- When is RC?
Katz- Generally September
Tom- Don't have good sense of how much beta testing is happening
Katz- We are still seeing bugs after the beta, which is normal, it's software and that happens, but the more we get people to beta test the more we'll get figured out
Dave P- That's a discussion we've had for every meeting, used to do a lot more alpha and beta but didn't seem to get people to actually test
Katz- Users say they are happy to help with testing but then they don't
Peter G- Impression is that it's hardly the case that users test
Dave P- Agreed, people don't tend to test
Peter E- People expecting docker images and such but without that it ends up not happening
Alex- Very inconvenient for some people to test
Mark- EDB testing happens starting with betas because their product depends on it, but if they don't catch things then they release and having it go longer doesn't add anything really
Peter E- There are others who do that, like PostGIS, where they don't do user testing but they test that their extensions work
Dave P- pgAdmin doesn't test very early on
Katz- Extensions do get tested early on though
Andres- PostGIS actually keeps up pretty tightly
Peter E- Other extensions don't really keep up with that
Dave P- PostGIS watches for API breakage and such
Andres- Yes, in their best interest to keep up
Noah- What about pginfra?
Magnus- Most of our pginfra is relatively small and simple
Peter G- How would we know if more beta testing was happening?
Tom- We'd have more bug reports
Joe- Seems like we just need to set the dates out and then we change the plan if needed to, but with dates we actually have deadlines
Dave P- Good point, users can then plan around that
Katz- Typically the betas are around the same time each year, beta2 is around when we branch, beta3 is around August, then RC in September
Tom- Did we do a beta right before branch?
Katz- Usually, right in last week of June
Peter E- Depends on open issues but once those are resolved we can release, looking at it now, 6 items, 2 are basically the same, could be done quick, last year had a super long list that took a lot longer to get there.
Stephen- If we have a deadline then people can work towards it
Peter E- Have a deadline or your feature gets reverted
Heikki- Let's write down these deadlines so that people can work towards them
Katz- Gives predictability to users so they can plan around it. How many people really go to .0? Surprisingly a lot of people do
Joe- .0 is where the real testing starts.. let's just admit that and push to get that out
Andres- Folks will just move to .1...
Katz- There wasn't a lot of time between .0 and .1 and when there aren't a lot of bugs fixed between them then people feel comfortable
Stephen- Should be close to the next quarterly release
Katz- Think we agree, don't want too far before quarterly and don't want right after
Dave P- September 14 is a Thursday, seems like a good date?
Peter G- That seems fine
Andres- Is there a chance we'd want to reduce the window between stamping and release, particularly when there are security issues? Seems to be long to have 4 days between those and packaging is faster.
Tom- EDB didn't want to shorten it
Dave P- Isn't packaging, it's more about QA and checking to make sure everything is ok and to have time to address them
Peter G- Could adjust if needed
Dave P- People know the schedule and have confidence that they can do things in time to test, et al
Joe- Seems we could easily move it to Wednesday for the actual release process
Andres- Do we actually have cases where things break between stamping and release?
Dave P- Isn't always an issue with our stuff, sometimes it is with other things and with packaging
Magnus- Those could be tested previously rather than after stamping
Dave P- Could possibly do earlier but other libraries have new releases and such
Magnus- RPMs and Debian systems get built but RPMs aren't really tested, but the Debian snapshots are actually tested
Peter E- Can't start building the actual release packages until the actual release tarballs are published, need a bit of time in case things happen. Maybe could squeeze from Monday to Wednesday.
Dave P- Wouldn't want to squeeze any more than Monday to Wednesday
Heikki- If there is a hiccup with Debian or such then that is independent really
Dave P- We release through our own servers and such really
Peter E- We could go out and ask people about this and find out
Noah- If we declare it earlier that it'll be a shorter release then people can make it work
Joe- Folks don't really want to deploy things on a Friday, so it's better to bring it in earlier
Michael P- Timezones should be considered too and Thursday in US is actually Friday on the other side of the world and people don't want to deploy on Fridays
Dave P- Same for packagers really as many of them are in India
Mark- Are people waiting until the last minute?
Dave P- We don't really do snapshot builds because we depend on the tarballs
Mark- If someone committed something 2 months before that breaks a build, you won't know?
Dave P- We should know that but there's other libraries and things that are involved in the build which get updated as part of the installer and might break things and those aren't checked as often.
Noah- Think of packaging as pipeline that's constantly getting built to test and make sure that things work
Dave P- Requires people to spend time testing that things actually worked and having an automated system is an awful lot of work, particularly for GUI applications that have to work but are hard to test
Heikki- Seems like consensus is Wednesday?
Tom- We should ask the packagers list really as they are very involved
Heikki- Yes, we should propose it to them and see what feedback we get.
Dave P- Want to be careful with packages, we don't want to get into a position of forcing other folks to work 12-14 hour days because many people have this as a job and such.
Noah- We should ask packagers what the impact is.
Katz- Need to give ample time to even consider this, should we test with beta releases?
Peter E- Not a good test point because it doesn't require as much
Dave P- Beta is also only one version vs. the regular releases
Andres- Isn't the same pressure for the beta releases really either, no push on that
Katz- Seems like we could just propose it for the next regular set of releases
Dave P- We've blown our time and seems like we should move on
Katz- Propose to packagers for November release to release on Wednesday instead
Dave P- Sept 14 target for release?
Katz- For v17 Sept 19, 2024
Action items
- Propose to packagers moving to Wednesday for November release
- Propose to RMT September 14 2023 for v16 release
- Propose to have September 19 2024 for v17 release
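As a quick sanity check on the dates in the action items above, the following Python sketch confirms which weekday each proposed date falls on. The dictionary `proposed_releases` and the helper `weekday_name` are illustrative names only, not part of any PostgreSQL tooling; only the dates themselves come from the minutes.

```python
from datetime import date

# Weekday names indexed by date.weekday() (Monday == 0); spelled out here
# to avoid depending on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Dates taken from the action items; the dictionary itself is an
# illustrative structure, not an official schedule format.
proposed_releases = {
    "v16": date(2023, 9, 14),
    "v17": date(2024, 9, 19),
}


def weekday_name(d: date) -> str:
    """Return the English weekday name for a date."""
    return WEEKDAYS[d.weekday()]


for version, day in proposed_releases.items():
    print(f"{version}: {day.isoformat()} is a {weekday_name(day)}")
```

Both proposed dates fall on a Thursday, which matches Dave P's remark during the discussion.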
Renaming "master" branch to "main"
Michael- This was proposed back in 2020 and was done by GitHub. Git has changed itself to have an init default branch, also released in 2020. You can also set up a default branch. Andres committed a lot of changes to move to primary and to not use master/slave. How do people feel about the change of branch name?
Peter E- I have a master's degree from university, what to do with that? Not sure that the term is really that objectionable
Dave P- Have similar issue in pgAdmin with the "master" password, named for key that is used to unlock with other keys. Decided to leave as-is since there really wasn't another name.
Peter E- That's also the general term usage in cryptography.
Dave P- Wouldn't object to the change but not really excited about it.
Heikki- Don't need to decide if it's a bad word or not but Git and GitHub have changed and should we maybe consider the change for that reason?
Peter E- Git hasn't really changed it though
Michael- Right, really only GitHub has changed it
Peter E- Git from commandline vs. from GitHub
Peter G- Depends on the packaging maybe?
Michael- Git config setting that can be changed
Peter E- When this came up before, consensus was that we would consider changing when Git changes, but Git hasn't changed yet, so
Noah- Think we said we would change when Git changes and Git has not changed yet.
Peter E- One point was that if we don't change it then we'll have the same discussion every year and maybe we should change it just due to that. There's a bunch of extensions and tooling and such and this would create a lot of work for folks.
Heikki- Doing a new git repo defaults to main on my laptop but that's because of configuration, but if you don't have that configuration set then you get a hint that if you don't have the value set that the default may change in the future.
Dave P- What downside?
Mark- No idea how many bugs will come out because of this with pipelines and such
Andres- Is there a way to make it just work for users who have existing checked out repos using git sym refs or such?
Peter E- People are expecting main to be there these days rather than master and we don't necessarily have to go purge everything of the specific term
Noah- Key is to not surprise people too much, we don't want to be the last people to have that term
Peter E- But then what about everything else like pgpool
Noah- Feel that 'master' is less surprising than 'main' today but that's probably going to change moving forward.
Heikki- Some extensions have already changed like pgrouting
Tom- Maybe wait for git or Linux to change
Mark- Poll the room, does anyone feel strongly one way or the other
Tom- Kinda feel like it's not worth the effort currently
Michael- Not feeling like there is an urge to change given that git hasn't changed.
Dave P- No one really strongly against it either
Michael- Did it myself locally and was not hard to change. Also fine to just not do anything right now.
Katz- Same conclusion as last time- wait for upstream or Linux to change.
A brief PG15 RMT postmortem and what can we improve?
Katz- Postmortem from RMT from last year. Unique things in v15 release which is useful to call out. RMT is Release Management Team. Prior to RMT we had releases drag on. Idea was to have a team who is able to push towards a release and make sure the system is stable and address open items. Two goals- release on time, and have as stable a release as possible. Some releases have been pushed into October but generally it's been good and releases have been pretty stable but sometimes there are bugs that weren't caught until later minor releases. In general things have been good and stable but there's always opportunity to improve. SQL/JSON is an interesting case which was reverted late and it was a highlight feature. v15 ended up being a bit sparse in terms of major features. v16 has lots of awesome stuff on the other hand. SQL/JSON things which were challenging- highly anticipated feature even though it's a lot of syntactic sugar, but other databases have SQL/JSON and people want to move to PG. What made it challenging from an RMT perspective was that it wasn't ready and people were still working on it, but ultimately we had to revert it. Hard to tell someone to revert it but that's part of the job and it is what it is. With SQL/JSON what the RMT was trying to do was see if we could get it in. Want feedback though, we still had a pretty stable release and we got the release on time more-or-less. Personally, I allowed going past deadlines which were set and that was a mistake because it ended up getting reverted late. Currently v15 seems to be pretty stable while there were some bugs. Are there things from that release and that effort that the RMT could benefit from hearing?
Peter G- Think the RMT process has been effective and successful. All things considered the RMT does a good job. You mentioned the obvious big one for v15 but that was a pretty specific case.
Andres- Wasn't that narrow but need to make sure the RMT isn't looking at things from a marketing perspective.
Thinking about hey this is one of the major features.
Peter G- How much time did the RMT give for that feature?
Andres- Months, like 3 months, was a lot of time, really dragged on. Was also emotionally draining for people.
Melanie- Is it intentional that what the RMT does is opaque?
Peter G- Sort of.
Peter E- Isn't really intentional to have it that way.
Peter G- Think it was Noah that came up with the idea but was a good idea and is better to have a team who is informing someone that something has to be reverted rather than individuals going back and forth about it.
Andres- Doesn't mean we can't document the process and the review.
Peter G- There is a wiki page
Andres- But how would you know to look for a RMT wiki page as a new contributor?
Heikki- New contributors don't really need to worry about the RMT as it is more for committers to deal with typically.
Melanie- Are all RMT communications open? Are there online meeting notes and such?
Peter G- No
Katz- RMT meetings we do try to document what we talk about and there's a Google doc and things on hackers are public but notes are private but not typically anything earth shattering in those meeting notes.
Peter G- Was on team but if there are off-the-record comments then those would be off-the-record either way and don't want people to take things personally.
Melanie- Is there a list and a date for when things need to get reverted?
Peter G- The thing that is the real no-no is when someone isn't very communicative, but as long as there is an active discussion and an ongoing effort then things don't need to really have hard deadlines.
Noah- First year or two, a couple of emails were sent to hackers to explain how the RMT works, could do that annually.
Peter G- There is an email that was sent that sets up the RMT
Noah- Model should follow what the supreme courts should be like and that they can have some private notes if they need to
Mark- Is there an active issue? Is there hostility to RMT?
If feature isn't ready that isn't really the RMT's fault.
Andres- Would have been good to have been faster when it came to reverting what wasn't ready for v15. Dragging on wasn't good.
Peter E- There wasn't urgency maybe because it seemed like it could be reverted at any time.
Peter G- There were some tendrils where it reached out to that wouldn't have been good to keep, would have been good to pull it out sooner.
Katz- Sometimes it's important to just jump on a call and discuss it.
Peter G- Point is, can you really expect to do it differently next time? Not sure that there is really a lesson. Maybe be more aggressive next time?
Andres- Maybe lesson is to not allow the marketability of the feature to drive us to keep it longer than we should have.
Katz- We don't want to risk our reputation for reliability
Peter G- Objection is that it took a long time but ultimately it was reverted and that was the right decision and it was reached in time.
Katz- One of my take-aways, maybe we should be targeting like beta2 to say if it's not ready by then, then it needs to go, so that everyone can kind of move on and can focus on stabilizing the release.
Noah- More time you give people the more time people will take. Point is that things need to keep moving and there needs to be regular progress and if there's not then there has to be a really good reason. You also brought up if the RMT should do more testing?
Katz- The goal is to make sure the release is reliable but it should be a project goal to have it be reliable.
Noah- Right, would be too much for the RMT to take on.
Action items
- DO BETTER!
- When in doubt, use beta2 as the deadline.
What are the big challenges for our users? What are the big challenges for us to solve?
Katz- We have several days we will all be together, going to dive deep on a bunch of different things. What's nice about getting this group together is that we don't get to do this too often and focus and talk directly instead of being on laptops writing emails. Good to talk about higher-level challenges and not just technical ones. What are the biggest challenges for our users? From the users' perspective, three buckets to try and put things in- availability, performance, emerging workloads. Good answers for HA. Biggest pain points- software upgrades, trying to get to zero downtime. Big users polled- what's the biggest pain point- getting to zero downtime was #1 for all of them. Another big one was non-blocking schema changes. If there are exclusive locks being taken then it's practically the same as downtime.
Peter G- Just knowing ahead of time what is going to happen
Katz- Another big theme is observability and being able to have predictability. Second is performance- trying to always get better performance, the direct I/O work, it's going to take some time but there is a lot of excitement there. Also is vertical scalability where we can continue to grow up too. Saw systems with 24TB of RAM, but how much of that can PG really use? Also parallel operations. Finally emerging workloads- new topic that is really an old topic and old thing which has been around a long time, but vectors are a big thing these days and very hot and that's a good thing. Big arrays can push the limits of a page due to page size and how big the vector ends up being and people want to index these things and that can be a problem due to page size. What else is everyone hearing? Or project or other technical challenges? What are people hearing?
Heikki- Connection pooling, max connections, can't change it after you start.. You need to set up pgbouncer but then that has challenges and limitations too.
Katz- Yes, very good point.
Andres- Getting HA is too hard and have to use a provider to get it.
Trying to write down all the things needed you quickly find our tools really aren't good. Would be good to improve on that to make it realistic for users to have HA without a lot of effort.
Peter E- Biggest challenge is that our development velocity isn't growing. Isn't shrinking either, but isn't growing. Has been pretty constant for like 10 years. Can use those resources to work on X or Y but if you want to work on something like major version upgrades which will take time every year forever and if you can't grow the pool then that takes away a certain set of people.
Jeff- For a lot of projects, the velocity reduces, so it isn't bad that we have been able to maintain.
Andres- Agree with the concern, but not sure that the commits vs. lines changed is as much of a concern, but also the quality of code going in now is much higher vs. how it used to be, and if you look at the number of corruption bugs, that is much less now than it used to be.
Peter E- Quality of code is going up but there's still a limit with the number of people and if we aren't growing the number of people then it's hard to grow new features really.
Peter G- Not that long ago there were far fewer (1?) people working full time on PG
Katz- This is generally a positive thing but we need to figure out a way to grow the pool.
Andres- There is a danger when you have people full-time because new people won't be full time and we forget how weird things are in PG and makes it hard for new people to get into the community. Because we do it full time, we kind of forget that people have to get started somehow.
Heikki- Definitely feel that, just hired someone new to work on PG, is kind of exciting to watch someone new learn about the cool things in PG but it's hard too because they have to figure out how to search the mailing list and such.
Nathan- People do find the process very intimidating and the culture of the mailing list and as you describe it you end up thinking about how strange it is.
Peter E- Maybe we over-document it and then things get out of date, and you can't delete wiki pages very easily(?). Now you can just git clone, hack, git format-patch and send it. Seems like maybe the wiki page has too much and should be made simpler.
Stephen- Mentioned GSoC and introducing new people
Heikki- Wasn't too bad to get new person
Melanie- Getting people engaged and keeping them motivated can be really hard and people can get very frustrated with the process. Ultimate is to have people who are not full-time engaged and working on PG rather than just having full-time people. There are so many pieces that people have to understand. We aren't really helping new people and we aren't going to get new people if we don't put effort into this. I need to figure out a minimal repro for a bug, I need to figure out how to benchmark a given change. How to make those steps forward, even if they get detailed enough feedback ... I did a workshop recently on how to get feedback on a thread and how to take it and how to move forward with changes. Can't just get detailed feedback and then not be able to say "I don't know what to do next" but some people aren't really sure about how to do it. This group isn't as good at mentorship as we should be and we need to change our attitudes around that to try and be better.
Katz- How do we actually do that, able to do it internally to a degree, but how do we do this in the community, but we need to put the time into it.
Melanie- How I'm still here- worked at Pivotal and did pair-programming, spent hours working on things together, similar at Microsoft with Andres working together over time to get things figured out and understand but isn't necessarily always realistic.
Amit- Is it possible to get people involved with smaller patches rather than getting new people in?
Melanie- A lot of PG people suggest that and ask people to review code, but if you're in that place where it's hard for you to understand what's going on, so reviewing a patch can take a very very long time and it ends up being a big time commitment and they're not sure what to do next.
Katz- Much harder to provide reviews really
Nathan- Reviewing in a way to move the patch forward can be really hard. Would spend days slogging through a given patch and that's really what you have to do to keep it going.
Melanie- Not clear that having done that review that you get to keep working on PG, isn't really something you can necessarily go to your boss and show what you did.
Andres- Sometimes you get involved and it's really hard to figure out what is going on with people going in different directions sometimes and everyone sounds the same to someone new.
Peter G- When I was getting going didn't feel that people, but was a different time too.
Katz- How do we make it feel like it is more of a collaboration. We want new people to come and to contribute to PG.
Melanie- If we had 10 more people who were super productive and be contributing and moving PG forward. We need new people who are doing small things too but we really need 10 more people who are really contributing a lot. If we could just all invest the time to find those people then we could move things forward very well.
Peter G- Not sure how relevant it is that committers are actually good
Mark- Sometimes people ask about getting beat up on list by Tom, but Tom was the one who told me what I was doing was wrong, and it ends up being a personality thing. Not everyone understands that feedback about patches needs to be taken in a positive way.
Melanie- It's about taking the time to respond to people, doesn't have to be super nice but it's about investing.
Mark- Pair-programming does happen in the companies but it just doesn't happen on the list. Not sure how you could do it on the list.
Heikki- To Melanie's point, we could be better about making sure to tell people what to do next when they are given feedback.
Peter E- What was feedback that didn't help?
Melanie- Feedback was like "not sure if this is how it should be, could you check that?" which wasn't very clear and also needed to provide a slightly different repro. Also, how to *prove* something is correct? Not sure how to do it.
Peter E- Sometimes figuring out what the next steps are may actually be the job. If you have a thing which is "not sure if this is correct, can you do more to show it is correct?"
Melanie- People shy away from giving explicit instruction but that can actually be helpful to new people.
Jeff- On a public list, don't really want to tell people to do specific things, but maybe a different channel would be good to help with that, maybe do it off-list.
Heikki- Maybe phrase it differently
Melanie- Hash join memory patch- off-list debugging was done, but more things were done on list and included things like "hey, to do this we should use these functions" or similar. Seems like people don't want to actually do that though even though it could be really helpful to new people.
Mark- Looked at people's patches and it is missing the test code and sometimes add it but then it seems like hijacking credit from people and maybe that isn't good.
Heikki- Maybe that feels good to some people though
Melanie- Others may like that as it means other people are involved in helping with maintenance of it going forward
Peter G- If I don't know something then I don't really want to say it but sometimes want to say- hey have you thought about this differently, but that may not be helpful.
Melanie- With specific feedback then that could help still
Heikki- Another part of the problem is people submit patches and no one responds
Mark- Sometimes people propose things that no one wants
Andres- That does happen sometimes and the thread dies pretty quickly after just a couple of messages.
If the patch seems reasonable but you don't want to commit it right now. Patches which are obviously wrong are easier to provide feedback on.
Peter G- Ambiguity kills patches and try to avoid that.
Melanie- It's about investing and you need to take the time to think about what the next steps are. The reason to take the time to do that is to grow the people even if the patch itself is not as interesting to you.
Katz- Mentorship isn't going to happen on the list directly. Need to do it over chat or Zoom or similar. Would be good to have some kind of organic process and it'll take time and it'll be more overhead.
Heikki- As part of CF process, would be nice if there was a step where people are told what the next step is for everything. Maybe too much for the CFM?
Peter E- Some CFMs do do that.
Magnus- Doing that for 100+ patches takes an awful lot of time.
Katz- Developer work is more valued but the project work is also absolutely critical and if we need more people doing that then we should say that.
Magnus- People are going to be more open to spending time mentoring people in the company vs. others who might get hired by competitors or other people.
Melanie- Would be good to have a goal to respond to all patches, maybe as CFM or others
Tom- No one wants to be speaking from the community, at least not without a long discussion
Noah- Reviewers do do that though, speak from the community in some small way
Jeff- A lot of orgs, projects are set up in a way that something will eventually get committed. If we are speaking as a committer, we can say this is what I would do next, but even as a committer, only some patches actually get accepted and committed and giving feedback may not mean that the patch will end up being committed.
Joe- Would it make sense to have the CF have an option where someone can ask for a mentor?
Peter E- Would be adding another job on top of what we already have going on, barely have time to review things.
Melanie- If you've not met that person in real life, it might be difficult to do.
Joe- Don't want to impose yourself on someone, but if someone is asking for it and it could be off-list or non-public then it could work.
Katz- Maybe have some idea about having office hours or something.
Joe- Office hours, people don't end up showing up really.
Katz- Maybe we could figure out an async way to do it. Being able to just quickly ask someone about something can be really helpful.
Melanie- One thing with that, everyone who wants their patch to get in might ask for a mentor though. Maybe there could be some stage where you've been involved long enough and you want to be around, maybe you could have someone who could talk to you or be there to ask questions of.
Noah- Maybe a box to check to just say that someone is open to off-list messages or such.
Heikki- Maybe at the end of the CF, the CFM could just email people and ask them if they know what the next steps are with their patch.
Magnus- Maybe a "need help" option in the CF?
Action items
High level thoughts and feedback on moving toward ICU as the preferred collation provider
Jeff- Added a new wiki page which is linked, wanted to give general status info. Basically the situation is that we have problems with glibc. Benefits of ICU- platform independence and not depending on the OS for ordering. With ICU, ordering would still be out of our hands, but in the hands of the Unicode org rather than the OS, which would be better. Independence also gives a bit more control over version changes. Really hard to control version of glibc, bit easier to control the version of ICU. Performance benefit also is good, can't use abbrev. keys with some versions of glibc.
Peter G- Blanket assumption is all versions and that is a good assumption.
Jeff- More confidence in ICU for abbrev keys.
Peter G- One might ask why you make that distinction and one reason is just historical and glibc doesn't have the same priorities as ICU does. The glibc folks didn't seem too concerned about the issues that they caused us, so why would it make sense to depend on glibc for such a critical thing? ICU is used by other projects who have similar requirements as PG and they seem to be looking at the problem in the same way that we do, which is probably the most important thing. glibc doesn't seem to have the same sense of concern around it as we do.
Stephen- Everyone here is generally agreed on ICU being better, I think?
Jeff- One person not in the meeting today has expressed some skepticism about it.
Joe- ICU vs. glibc for collation is one topic, but another is how well is ICU integrated into PG, not enough people are really using it, it seems.
Peter G- Definitely gotten easier in v15 to use ICU instead.
Joe- Is it integrated enough to make ICU the default is the real question.
Mark- Do we know that adopting ICU won't lead to corrupt indexes?
Jeff- Changing versions of ICU can change sort ordering and therefore there can be corruption if you change the version of ICU.
Peter G- One of the main goals of ICU is to have stability and to ensure that we don't end up with corrupted indexes. Of course ICU can have bugs too, but they take it very seriously if there are such cases.
Joe- If you do an OS upgrade and have the ICU version change then that can cause problems, but you can install the matching version of ICU if needed, unlike with glibc which can't really be changed due to everything else depending on it.
Jeff- Need to be able to load multiple versions to address these things.
Dave P- Have used ICU in EPAS for a long time but it's an older version because it isn't easy to move people to a new version currently.
Joe- Yes, need to work on that to be able to move from one to another.
Dave P- Assumes that there are multiple versions of ICU available for a given OS.
Peter G- Yes, that needs to be addressed too and is certainly a part of the problem.
Noah- Not sure we see eye-to-eye with the ICU project in terms of having multiple versions around concurrently.
Peter G- Older versions of ICU definitely seem to be around and have continued to work for a very long time. Fundamentally you either have stability or don't, separate question about upgrade path, but a lot of people don't worry about that.
Noah- Not just a packaging issue necessarily really.
Peter G- Can't really segregate the data from the code.
Dave P- No clean separation between code and lookup tables.
Joe- If you took new code and pointed it at the older tables you'd still end up with changes, doesn't really work. "copy-exact" came from the Intel / microchip world and the idea is that everything is copied exactly, and that's what you need to have to not break indexes.
Peter G- Fundamentally you have stability or you don't and don't really see that changing.
Joe- In some performance testing, wrote a collation torture test in Python. One of the things noticed was that in old versions of glibc it'll sort in 2 minutes or so on RHEL 7. With glibc 2.21 they pulled a cache out claiming it would help things and not hurt things, but that 2m sort went to 60m, while with ICU it's still 2m. Every modern version of glibc has this issue.
Heikki- Asking the RMT, should we revert this change?
Katz- From the RMT perspective, thinking of using beta2 as the cut-off, last week in June.
Peter G- Making it the default signals intent, packagers aren't required to do things with that if they don't want to.
Jeff- initdb has a default also, which a packager doesn't really control.
Peter G- Would be good to have the option for people to opt out of having it.
Katz- Packagers have some discretion with what the default is. ICU just focuses on collation, unlike glibc. Platform independence with ICU and consistency. Biggest area of concern is with upgrades, because if you go from glibc with one major version to ICU in another major version, you have to rebuild all your indexes.
Jeff- We maintain the provider across the major version upgrade.
Andres- pg_dump should really maintain collation/provider across.
Peter E- Historically we don't maintain it through for providers or locale and such.
Dave P- This is going much deeper than we have time for.
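Illustration (not part of the discussion): Jeff's point that a change in collation version can change sort ordering, and so corrupt indexes, can be sketched in a few lines of Python. This is a toy model only. The two "collation versions" below are hypothetical comparison rules invented for the example, not real ICU or glibc behaviour, and the sorted list stands in for a btree index.

```python
import bisect

# Two hypothetical "collation versions" that disagree about punctuation.
def collate_v1(s: str) -> str:
    # Hypothetical old rule: hyphens are ignored when comparing.
    return s.replace("-", "")

def collate_v2(s: str) -> str:
    # Hypothetical new rule: hyphens are significant (plain code point order).
    return s

def build_index(values, collate):
    # Stand-in for a btree: values kept in the order the collation defines.
    return sorted(values, key=collate)

def index_contains(index, value, collate):
    # Binary search, which is only valid if the index is sorted under `collate`.
    keys = [collate(v) for v in index]
    pos = bisect.bisect_left(keys, collate(value))
    return pos < len(index) and index[pos] == value

rows = ["a-c", "ab", "ad"]
index = build_index(rows, collate_v1)          # built under the old version

found_before = index_contains(index, "a-c", collate_v1)   # same version: found
found_after = index_contains(index, "a-c", collate_v2)    # new version: missed

reindexed = build_index(rows, collate_v2)      # the equivalent of a REINDEX
found_reindexed = index_contains(reindexed, "a-c", collate_v2)
```

The row is still in the index after the "upgrade", but the search looks in the wrong place and misses it. Rebuilding under the new rules fixes it, which is why a provider or version change means rebuilding the affected indexes.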
Action items
Cloud operators have data of value to this community, e.g. frequency of errors that should never happen. What sort of data sharing regime might provide a good balance between value to the community and comfort for cloud operators?
Noah- Get relatively little feedback from users. Clouds could possibly provide feedback. If the community thinks an error message never happens, but cloud operators do see it with some regularity. What would the community like from the cloud operators?
Peter G- Have been doing this on some limited basis. Back-patched a case with VACUUM where we throw an error which we don't really need to. Similar one in the last week. What's notable- wasn't really based on some kind of measure, just based on knowing that this happens in the field, at all, was the important part.
Andres- Think there is a lot of value in specific numbers, and if it changes over time (such as with crashes). A lot of errors that we throw that should never happen. Building a list of errors that shouldn't ever happen isn't easy to do. Would be nice to have a list of things that really *really* shouldn't happen.
Katz- To the main question- talk to a lot of users and also $dayjob is a big user of PG itself. If had known that a given change had a big impact on upgrades, might have done something a bit differently with it. $dayjob is generally open to sharing numbers and information with the community to help PG succeed and move forward, but need to know what data would be helpful to share.
Nathan- Most of my contributions are from customers who are seeing things. Already happening from a development perspective. Fleet-wide data can help with things that maybe should be pulled out/removed since no one is using them.
Katz- Other thing is number one is "reduce my downtime" and can share that kind of info. People trying to move more and more workloads to PG.
Noah- No shocks in terms of what is seen from cloud-run PG.
Peter G- Also haven't really gotten much in the way of shocks.
Melanie- Survey of how big are indexes and other things, and there was a paper/study that was published with lots of info. Also looked at how many people are using PL vs. other features.
Peter G- Sometimes people have bias when it comes to what they think is being used and it may not match.
Katz- Can see lots of edge cases that other people don't see.
Mark- If the cloud operators only operate the database in a certain way then only going to see certain bugs. Engineers seem to think corruption isn't often happening, but support deals with corruption all the time and people don't remember what different versions of PG they upgraded through, etc.
Andres- Would be good to get stats for corruption, like what we have for checksum failures. Lots of errors where an error is thrown but we continue on. Increment some counter on corruption being detected.
Melanie- Do any of the logging tools provide that?
Stephen- Think pgbadger does? Could, certainly.
Andres- Lot of things where we just don't see it because user can't reproduce it reliably. If you just know that you have this issue happening maybe could do something about it.
Peter E- Could have stats for all error codes maybe.
Joe- Yeah, kind of what I was getting at.
Andres- Many many are just error code internal though, which isn't helpful.
Peter E- Need to do better about that rather than just using internal.
Peter E- Maybe make some error messages shareable (no details included?)
Andres- Have just a way to have the format string maybe.
Stephen- Maybe log line of code?
Tom- Maybe the function name? Line might not be useful enough and changes.
Peter G- Just knowing that something is happening at all would be something and better.
Andres- Anonymized stats could be useful, where could it go?
Peter G- Some things might be sensitive commercially even as problems.
Andres- Can't share some things but could share percentages and things.
Heikki- Would be interested to see if cases not expected to happen are actually happening.
Katz- Would it help to have customers come to the mailing list?
Heikki- Happy to share things too.
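Illustration (not part of the discussion): the ideas above of counting errors per error code and sharing only percentages could look roughly like the Python sketch below. It assumes the server log prefix includes the SQLSTATE (the `%e` escape of `log_line_prefix`). The exact line layout, the sample log lines, and the function names are made up for the example. The three SQLSTATE codes listed are PostgreSQL's internal_error, data_corrupted and index_corrupted classes.

```python
import re
from collections import Counter

# Assumed (hypothetical) line layout: "<timestamp> [<pid>] <SQLSTATE> <SEVERITY>:  <message>"
LINE_RE = re.compile(r"\] ([0-9A-Z]{5}) (ERROR|FATAL|PANIC):")

# Codes for things that "really shouldn't happen".
SHOULD_NOT_HAPPEN = {
    "XX000": "internal_error",
    "XX001": "data_corrupted",
    "XX002": "index_corrupted",
}

def tally_sqlstates(lines):
    """Count error-level log lines per SQLSTATE, ignoring the message text."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def shareable_summary(counts):
    """Return only percentages for the should-not-happen codes, no details."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        code: round(100.0 * counts[code] / total, 1)
        for code in SHOULD_NOT_HAPPEN
        if counts[code] > 0
    }

# Made-up sample log lines for the sketch.
sample_log = [
    '2023-05-30 09:00:01 UTC [101] 23505 ERROR:  duplicate key value violates unique constraint "t_pkey"',
    '2023-05-30 09:00:02 UTC [102] 23505 ERROR:  duplicate key value violates unique constraint "t_pkey"',
    '2023-05-30 09:00:03 UTC [103] XX001 ERROR:  invalid page in block 7 of relation base/5/16384',
    '2023-05-30 09:00:04 UTC [104] XX000 ERROR:  cache lookup failed for relation 16384',
    '2023-05-30 09:00:05 UTC [105] 00000 LOG:  checkpoint starting: time',
]

counts = tally_sqlstates(sample_log)
summary = shareable_summary(counts)
```

Here `summary` carries only a code and a share of all errors, which matches the "percentages, no details" idea. As Andres notes, a bare XX000 count says little without at least the format string or function name.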
Action items
Any other business
Dave P- Challenge coins
Joe- Packagers list include cloud providers?
Tom- Would that be useful for them?
Joe- Would be very helpful to know when things are coming.
Peter E- That's generally public though.
Joe- Don't believe we have a way to say that when we have a release tarball to tie it back to a specific commit, which would be really good to have.
Tom- The git commit hash does go into the tarball.
Joe- Can't test that really because of things that get generated being in the tarball. Would be nice if process for building tarball gave a way to tie it back.
Michael- Debian reproducible builds might be something to look into.
Thomas- snapshot too old feature exists, doesn't work, no one is owning it, have talked about it a couple times.
Tom- Probably just need to kill it.
Peter E- Maybe leave the setting but ignore it.
Andres- Pretty invasive so not good to do in back-branches ... maybe just hard-code it to off. Wouldn't want to get rid of it in v16, but maybe v17.
Thomas- I can write a patch for v17.
LUNCH!