FOSDEM/PGDay 2024 Developer Meeting

From PostgreSQL wiki

FOSDEM 2024 Developer Meeting schedule by Time and Room

Time Studio 4
Thur 9:00-9:10 Welcome and Introductions
Thur 9:10-9:30 Property Graph Queries (Vik)
Thur 9:30-10:00 Binary format output per session (Dave C)
Thur 10:00 Coffee
Thur 10:30-11:00 Meson and packaging (Devrim)
Thur 11:00-11:30 Built-in collation provider for "C" and "C.UTF-8" (Jeff)
Thur 11:30-12:00 CREATE SUBSCRIPTION ... SERVER (Jeff)
Thur 12:00-12:30 The Path to un-reverting the MAINTAIN privilege (Nathan)

Thur 12:30-13:30 Lunch
Thur 13:30-13:40 Moving Forward with Pending Patches (Unknown: "I have encountered a situation where some of my patches have not been reviewed by others, preventing them from moving forward. I believe this is a common challenge faced by other developers as well. It would be great if we could engage in a discussion about this and potentially brainstorm ideas to improve it.").

Note from Daniel: "This was proposed by a hacker who was unable to attend".

Thur 13:40-13:50 Recognizing New Contributors (Unknown: "Many active developers, including myself, desire to be listed as a Contributor, but just do not know how. This lack of clarity can be confusing. I'm wondering if it's possible to have a discussion on how to effectively recognize and acknowledge new contributors.").

Note from Daniel: "This was proposed by that same person, and since he isn't able to attend I don't think it's fair to identify him (I did promise I'd bring this up anonymously to put focus on the issue and not the person). This is a person who I can vouch for being very active, prolific enough to show up on Robert's recent "Who contributed to.." blog post."

Thur 13:50-14:00 Thomas' other thing.
Thur 14:00-14:30 Page Features / Reserve space on page
Thur 14:30-15:00 TDE / IVs / More
Thur 15:00 Tea

Thur 15:30-16:45 v17 Patch Triage
Thur 16:45-17:00 Any other business & close.

Suggested topics

  • Patch Triage
  • Moving Forward with Pending Patches
  • Recognizing New Contributors
  • What's new in SQL:2023
  • Binary format output per session
  • Meson and packaging
  • Developer Meeting during PGConf Europe
  • Page Features / Reserve space on page
  • TDE


Property Graph Queries

Vik's presentation on SQL/PGQ (Property Graph Queries), a new feature added to the SQL standard in SQL:2023.

Goes over the differences between the relational model and the graph model, talks about GQL (Graph Query Language), then covers how to translate GQL queries into SQL/PGQ.

Discussion about implementation in the PG parser, how the definition of the property graphs works, what a PG implementation may look like, etc.

Binary format output per session

Dave C - A few years ago, wrote up what all interfaces want: to get all data back in binary. Today everything comes back as text by default. JDBC and other drivers have to use the extended query protocol and send a Describe message (an extra round-trip) to get data back in binary, so in the Java world nearly everything comes back as text to avoid the extra round-trip. pgvector would benefit a lot from returning data as binary instead of text, since it is mostly floats. Put together a naive patch that uses a GUC to pick, at the start of the session, which OIDs to return as binary. Other interfaces have also looked at it; JDBC and Npgsql seem to like it. A GUC isn't necessarily the best way to do this, but it would be good to do something.
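The pgvector size argument can be illustrated outside the protocol entirely. This is only a sketch: the array size and text formatting below are made up for illustration, not taken from any driver.

```python
import struct

# A pgvector-style value: many float4 components per row.
floats = [0.123456789, -1.5, 3.0e-5, 42.0] * 256  # 1024 dimensions

# Text transfer: each float is rendered as a decimal string that the
# client must re-parse, with size depending on the formatting width.
text_form = ",".join(repr(f) for f in floats).encode("ascii")

# Binary transfer: each float4 is a fixed 4-byte value in network byte
# order, which is PostgreSQL's binary send format for float4.
binary_form = struct.pack(f">{len(floats)}f", *floats)

print(len(text_form), len(binary_form))  # binary is a fraction of the size
```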

Peter E - Wrote email in Oct about this, quick summary: using a GUC is complicated because it ends up being best-effort. Many patches for this exist and need to consider connection poolers too. Discussion of making it a new protocol setting but we are not really ready to extend the protocol in the way we want to.

Dave C - You don't see us being able to extend the protocol?

Peter E - Not sure. How do we identify the types: names? OIDs? What if the OID ends up being different in different systems due to extensions? Haven't considered that part of it. With independent installs of PostGIS on separate systems you'll get different OIDs; with names you have to deal with schemas too, etc. One idea: a fixed UUIDv4 per type that would be global? Want to have this be some kind of session property, though; don't want to ship this request with every query. There is not a lot of session state in the protocol, not much precedent for it. Column encryption has a similar issue. Have to hook session state in with the DISCARD command and other things. Maybe it could work similar to client_encoding, where the client communicates to the server which encoding it supports. client_encoding isn't terribly robust, there are concerns with using that approach, and there are a lot of edge cases.

Dave C - That's a challenge but the benefit is universal and quite large.

Peter E - Could just make this a connection option.

Dave C - That would be ok too.

Matthias - Is there any type that doesn't have a binary format?

Peter E - Most do but some extensions may not.

Matthias - When we flush the datum it's binary

Peter E - No, it's the send/recv functions that we're talking about, not the internal.

Dave C - Current default is we send everything as text.

Nathan - If everything including extensions seems to have binary support, then maybe that isn't a huge issue.

Peter E - Maybe just have a flag that says everything comes back as binary?

Dave C - Would work for me.

Alvaro - What about pgbouncer?

Peter E - Would still have to track it but would have only one flag to track.

Magnus - Maybe have to have separate pools, one for binary one for text.

Peter E - Need to think about it, but it would probably be simpler with just one flag to deal with. Maybe survey how many types have binary support, how many have text, and how many have both. Does the JDBC driver use simple queries sometimes?

Dave C - Yes, sometimes it does use simple queries.

Alvaro - Why use simple query?

Dave C - Extended query can do odd things sometimes. Mainly use it for isvalid (checking if the connection is valid).

Peter E - Maybe change the client to always use extended query and then send every query saying to have the query be returned in binary.

Dave C - Wasn't aware that it was possible to do everything as binary, very curious how that works.
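For reference, the protocol already allows what Peter E describes: the extended-query Bind message ends with a list of result-format codes, and a single code of 1 applies binary format to all result columns. A minimal sketch of the message framing (no parameters, default portal and statement):

```python
import struct

def bind_message(portal: str = "", statement: str = "", all_binary: bool = True) -> bytes:
    """Build a minimal Bind ('B') message with no parameter values.

    The trailing result-format-code list may contain one code that
    applies to every result column: 0 = text, 1 = binary.
    """
    body = portal.encode() + b"\x00"            # destination portal name
    body += statement.encode() + b"\x00"        # prepared statement name
    body += struct.pack(">h", 0)                # no parameter format codes
    body += struct.pack(">h", 0)                # no parameter values
    body += struct.pack(">hh", 1, 1 if all_binary else 0)  # one code for all columns
    return b"B" + struct.pack(">i", len(body) + 4) + body

msg = bind_message()
```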

Matthias - How does the driver know about the new type? Register it?

Peter E - Trying to get away from having to register it.

Dave C - Shouldn't matter because you wouldn't have a query asking for something that the driver doesn't have a decoder for. Very few types that are actually new, most use floats, et al.

Matthias - At least now you only use the binary types if you know you can handle it.

Dave C - The person who wrote the application is going to know that they'll be able to handle the type that they are querying for.

Magnus - Might have to make it optional in the driver if you're using an unknown extension.

Dave C - Yeah, if something new was introduced then the app would break and have to be fixed. Or it would use text.

Alvaro - New developer adds new table with query and the app breaks or gets much slower?

Dave C - Wouldn't get slower, developer would have to either add the decoder or actually select that it's text and then deal with it being slower. With Java/JDBC, application could register a new decoder.
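A sketch of the decoder-registry idea mentioned for JDBC. The OIDs are the real stable built-in ones (int4 = 23, text = 25, float8 = 701) and the binary layouts match PostgreSQL's send formats, but the registry shape and the fall-back-to-text behavior are assumptions for illustration, not any particular driver's design.

```python
import struct

# Built-in type OIDs never change across installations; only types
# created by extensions get cluster-specific OIDs.
INT4_OID, TEXT_OID, FLOAT8_OID = 23, 25, 701

# Binary formats: int4 is a 4-byte big-endian integer, float8 an
# 8-byte big-endian IEEE double, text the raw string bytes.
DECODERS = {
    INT4_OID: lambda b: struct.unpack(">i", b)[0],
    FLOAT8_OID: lambda b: struct.unpack(">d", b)[0],
    TEXT_OID: lambda b: b.decode("utf-8"),
}

def decode_field(type_oid: int, data: bytes):
    """Decode one binary result field; fall back to text for unknown OIDs."""
    decoder = DECODERS.get(type_oid)
    if decoder is None:
        # An application could register its own decoder here for an
        # extension type, or request that column as text instead.
        return data.decode("utf-8")
    return decoder(data)
```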

Munro - What about OIDs being reused? No validation machinery.

Peter E - Nothing in the protocol handles that today. Maybe drivers already basically hard-code things.

Matthias - OIDs under FirstNormalObjectId won't ever change; it's really only an issue when adding/dropping extensions.

Discussion about having some kind of permanent identifier for types. (Peter E, Munro, Magnus, Matthias)

Munro - Maybe some kind of number assigning authority..

Matthias - Redo manager has a wiki page for extensions to use.

Magnus - Works because there's very few users who need those.

Peter E - All of this would require some protocol changes. Would still need to wait for the older things out there that don't support protocol changes to die off.

Dave C - How old are those things?

Joe - Maybe client that ships with RHEL 7 or something?

Matthias - Any plans or ideas regarding updating our row send format? It's 4 bytes per field, which is a lot of overhead; maybe we could reduce that somehow? No specific proposal for specific changes, but am curious if there's interest in adding a new row message type with less per-message overhead for smaller data. Everything today works by handling up to a gigabyte per field, but in some cases we don't need anywhere near that much.

Stephen - Like varlena?

Magnus - Yeah.

Matthias - Similar to this; maybe also be able to do column compression somehow.
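The 4-bytes-per-field overhead comes from the DataRow ('D') message layout, which the following sketch reproduces (NULL columns, marked with a length of -1, are omitted for brevity):

```python
import struct

def data_row(fields: list[bytes]) -> bytes:
    """Encode a DataRow ('D') message as the server sends it today:
    Int16 field count, then an Int32 length word before each value."""
    body = struct.pack(">h", len(fields))
    for value in fields:
        body += struct.pack(">i", len(value)) + value
    return b"D" + struct.pack(">i", len(body) + 4) + body

# Ten int4 columns: 40 bytes of data carry 40 bytes of per-field
# length words, plus 7 bytes of message header and field count.
row = data_row([struct.pack(">i", n) for n in range(10)])
overhead = len(row) - 10 * 4
```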

Dave C - Have thought about larger protocol changes

Peter E - From Robert - Flat out not viable to bump the protocol version (email from 2024-01-16).

Dave C - v16 is the first version that really handles protocol changes?

Peter E - Yeah but even then not sure that it really works. Certainly nothing before v16 in libpq would handle it. Other drivers would have to deal with it too.

Dave C - Maybe can just say to use the latest version for JDBC, at least. This would mean we're locked into this for quite a while.

Peter E - Much of this was back-patched in the middle of the prod releases and maybe if there's additional changes needed we could argue to back-patch those changes.

Magnus - libpq needs to not fail with this is the main thing.

Dave C - Have some research to do.

Meson and packaging

Devrim - Quick summary: started building packages 20-ish years ago, just the PG RPMs. Now we have haproxy, consul, and lots of other things: 200-300 packages to update now. This is not just about meson but about most of these things. Want to create a connection between the packagers and the developers. Discussed ICU things with Jeff Davis in the past and with Munro regarding IBM things on the lists; this is great. When there is a new feature like LLVM in PG11, having the hackers tell the packagers is very helpful; if the packagers don't know, they may not include support for those new features. Packagers are reachable via the wiki, the packagers mailing list, etc. Regarding meson: not following every email, but trying to follow those regarding meson. Wondering if PG17 will be built with meson? Packagers should also be included in discussions regarding versions of libraries like zlib. We have a unified spec file for PG. First question: is meson good enough to be used for the PG17 packages? Second: what can we do to improve the connection between hackers, users, and packagers? Not just about PG; this also includes PostGIS, the various libraries that it uses, and other extensions too.

Peter E - meson is not yet 100% on par with the make build system for non-Windows platforms. As of today, it is not sensible to use for official packages. Possible that these issues will be fixed in time for PG17, but then we would be switching at the last minute. Also need to check that all of the extensions work properly with a meson build. Maybe for v18 we switch early in the cycle.

Berg - Tried meson but noticed that it doesn't support llvm which is a giant feature and stopped because of that.

Alvaro - Need to make sure that once a meson-built PG is installed and you want to install an extension, the extension can still be built. pgxs makes that easy today; does the meson build have equivalent support, and does it work?

Peter E - Maybe switch early in v18 cycle and try to fix it up.

Berg - As long as it is supposed to work then it's about fixing bugs.

Peter E - Not worthwhile since llvm support just doesn't work at all right now.

Alvaro - Can we insist on disallowing cross-compilation?

Munro - Packagers mailing list is currently restricted but maybe we need one that isn't restricted for packaging discussions?

Peter E - People can send to packagers ... and maybe CC other lists

Magnus - Gets weird CC'ing between lists that are private and not private

Devrim - Not just about Berg and Devrim; there are lots of other packagers out there beyond those using the community packages. More people need to be aware of this. Experience so far is that not everyone is aware: we release, and then other people come and ask Devrim about the new things in the spec file. Don't expect me to follow everything though.

Peter E - Existing packaging lists have very little traffic currently. Packagers generally pull info from upstream through release notes and things, not the case that all of the hackers out there reach out to the packagers to tell them about new things.

Devrim - If zlib is added as a specific library required then maybe the patch author should check out if the library is actually available on all of the different platforms or not.

Berg - Maybe postgres can do better regarding having hackers communicating with our packagers. Should packagers generally be expected to add new features? Should some things not be enabled by default? What about checksums?

Peter E - Probably should just turn on all build-time options, but run-time options should be left as the PG defaults.

Bruce - Release notes are written for a broad audience. Not designed to show absolutely every build change or every config change.

Munro - Perhaps there needs to be a new document.

Bruce - Maybe. Sometimes things are not added, because something that most users aren't going to see doesn't need to be included and we want to have the document be reasonable in terms of size.

Peter E - We do have a section of incompatible changes, maybe we should have that be better organized.

Bruce - Probably would be best as a separate document.

Matthias - release notes for incompatible changes is much more user-focused. Maybe put something like this at the end of the page.

Bruce - How many people read the release notes?

Stephen - Lots.

Bruce - Only like dozens of people need this specific information, while the release notes are for a much broader audience, so it would probably be better as a separate document.

Jeff - Is the biggest issue the dependencies?

Peter E - There are subsections not interesting to most people...

Bruce - Maybe we should remove those sections too then.

Matthias - Minor release notes have things that are very detailed.

Bruce - We include more detailed info because we don't expect them to re-test between minor versions, so we want to list things that people might run into when they do a minor release update.

Matthias - Maybe we add a separate page for search-level compatibility issues, developer-level issues, extension authors. Extensions being impacted by changes to internal structures.

Peter E - Would never finish the release notes to include all those.

Stephen - Extension authors should be following hackers or committers if they're using internal structures.

Devrim - We build the beta packages and so it would be good to have the info about new switches that are added into configure before then.

Magnus - Do we have a process for adding things to buildfarm animals in terms of switches?

Matthias - Hackers who add the feature tend to add that option to their buildfarm animals.

Munro - Or the hacker adding things has to reach out to buildfarm animal maintainers and keep on them to fix their animals and add support.

Matthias - When is there going to be a good guideline on how to build extensions with meson?

Peter E - In the fullness of time.

Matthias - Have that for make but don't see that for meson.

Peter E - No need to actually do that really.

Berg - Quite a few extensions are pretty small, some are using cmake, and larger extensions need more research anyway; many extensions are too small to really benefit from meson.

Peter E - What would be interesting would be to get these extensions working with meson to get Windows support for them as many could work on Windows but don't just because of the build issues that meson might fix.

Magnus - Would be good to have something like pgxs that's built on meson but pgxs is pretty fundamentally built on make files.

Matthias - Tried to get make files working on mac, then tried to use meson for an extension, didn't get it to work, copied one of the files from the PG project and was able to make it work but wasn't ideal.

Magnus - Would be nice to have a tool to take a pgxs makefile and convert it to meson, if possible. For simple makefiles it might be possible to do the conversion; more complex makefiles need additional research. Would be useful to have that tool though.

Built-in collation provider for "C" and "C.UTF-8"

Jeff - Talk about a built-in collation provider for "C" and "C.UTF-8". Setting aside collation, consider just the other Unicode functions: initcap, regular expressions, and other parts of the backend that sometimes use ASCII, use the database default collation provider, and use those for various things, adjusting characters to try to treat international characters better. All these other behaviors are still quite important, and because we do everything through libc or ICU, we can't document or test any of it the way we do for collations. This is difficult because you can't provide simple examples for functions. The other behavior is pretty easy to build in, though. Collation carries a lot of complications, but these other Unicode operations are generally just lookup tables, with a few exceptions. Unless you're trying to localize the upper/lower functions, it's basically always the same; there were very few cases where upper/lower did something different per locale on a given system (e.g. Turkish). Essentially saying that we could build in this basic/default Unicode behavior in a maintainable way by importing these lookup tables, and we would also get a performance benefit by doing so. All of this sets aside collation, which does carry a lot of other complexity, but we could at least solve these non-collation behavior issues ourselves. We would still provide the ability to use libc or ICU for localized collation and other Unicode behaviors, but this would allow us to have our own built-in provider. This would be a better version of what the C.UTF-8 locale offers, essentially, as it would be built in and we could document it and test it. Have discussed it on the list and gotten some support for it from a usability standpoint and a developer standpoint.
Have also had some concerns raised, but think those have mainly been responded to. Not a solution to the collation problem, but to all of the other issues around it. For collation, this would essentially be the C collation. For a database it's often difficult to choose a locale anyway; often it isn't really possible to pick a single collation for a whole database.
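As a concrete illustration of the lookup-table point (an analogy, not PG code): Python's str.upper()/str.lower() apply exactly Unicode's default, untailored case mappings, which is the kind of behavior a built-in provider would ship; Turkish casing is the classic exception that still needs locale tailoring.

```python
# Default Unicode case mapping: a plain table lookup, no locale involved.
assert "istanbul".upper() == "ISTANBUL"

# The classic tailored exception: under a Turkish locale, upper('i')
# is the dotted capital İ (U+0130). That localized behavior would
# remain the job of libc or ICU, not the built-in provider.
assert "i".upper() != "\u0130"

# Default mappings still handle non-ASCII without any locale at all:
assert "straße".upper() == "STRASSE"   # ß uppercases to SS by default
```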

Peter E - What is the difference between this and the C.UTF-8 locale?

Jeff - The C.UTF-8 has changed the collation before, for one thing.

Peter E - Example?

Munro - Independent libraries may have created C.UTF-8 locales and some project (Debian?) created one first and then glibc added support for it later but it was a bit different and then Debian dropped their patches for it and that changed things.

Peter E - More of a bootstrap problem?

Munro - Yeah, may be able to just ignore that issue as just being a historical problem.

Jeff - Anyone who is using C.UTF-8 today, this built-in provider would just be flat-out better always. Wouldn't have the risk of such a historical thing happening, at least, and we would be matching the version of unicode that this locale uses with the version of unicode that PG uses for normalization. Risk of these changes would be low. As new versions of unicode come out, this would be updated. This would allow us to avoid those changes happening at the OS level under us and instead we could document it as part of PG releases. We could then also document these changes as part of PG documentation. Unicode has quite strong guarantees around this behavior but won't say that it won't ever break especially around undefined unicode codepoints.

Munro - Think it's a really cool idea but for another reason- kill Windows locale support. It's completely unmaintained and it's unloved and would be great to just get rid of it and instead offer a built-in consistent option. If you don't want that, then you should just use ICU.

Jeff - Other aspects - this would also be available on all platforms, such as Mac which doesn't have C.UTF-8. Available beyond just glibc users. Collation is still an important topic and you would probably still want to use ICU for collation as it's just preferable for natural language and it's also a lot better than libc. You could use COLLATE clauses to get that natural ordering that you want. Another benefit to that is that applying the COLLATE clause to the query itself avoids issues with indexes. The sort step would end up being the final thing. Quite often that could end up being a better performing plan, even if using ICU which is faster than libc, but it's still going to be slower than using C locale and if that is what matters for a particular operation that could be much better performing.
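A toy illustration of the split Jeff describes: code-point order (what C-collation indexes and sorts give you) versus a natural-language-ish order applied only where the query asks for it, standing in for a COLLATE clause. str.casefold is only a crude stand-in for a real collation's sort key.

```python
words = ["apple", "Banana", "cherry", "Éclair"]

# Code-point (C collation) order: uppercase ASCII sorts before
# lowercase, and accented letters sort after all of ASCII.
c_order = sorted(words)

# An approximation of natural-language ordering, applied per query
# the way a COLLATE clause would be, while indexes stay in C order.
natural_order = sorted(words, key=str.casefold)
```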

Peter E - Hard disagree on that, can see the technical appeal of adding it just to have it but not sure that anyone would really actually use it. Expecting people to have to explicitly say COLLATE to get natural language sorting isn't going to go over well because we've been trying to get to a point where they don't have to explicitly ask for it.

Jeff - This is more for people who are using C.UTF-8 now really. Not trying to say we're really changing any defaults at all. For people who have a database default collation of C.UTF-8 today this would make more sense.

Peter E - Who is actually using that?

Jeff - Don't have specific numbers but think it's a pretty normal configuration as it performs better.

Matthias - If I don't care about natural ordering but just care about seeing similar things together.

Peter E - Maybe as a DBA but probably not the case as an application developer.

Matthias - I don't really care about real collation in my apps generally.

Berg - If you're running a server for the world then perhaps it's fine.

Peter E - Maybe we do this for v1 but in v2 add in full support.

Jeff - Don't want to rule out that possibility. Reasonable thing to consider, but the issue is then maintenance: would need enough people comfortable with that part of the code base to maintain it. On the root collation: Unicode has all sorts of defaults for everything. As an example, the French language in the France region has the same collation as the English language in the US; not a linguist, but the collation order is the same in both cases and is just the 'root' collation. Unicode has these defaults and they're meant as a guide; they're careful about calling things a default, but they do have some kind of a default. The root collation order would be a great natural-language sort order for the project to provide by default, but ICU offers that; libc does not. No way to get the root collation order from libc.

Peter E - No obvious way to ask for it but you could get it from libc by asking for like French.

Munro - A good number of different things are just symlinks between each other.

Jeff - Some people might just not include those locales. If there was to be a 'default default' then that might be a reasonable thing to have like ICU or we could consider what that would look like to build into PG overall. Happy to contribute and work on that and think it would be useful to go down that road, but don't want to promise that. Proposal for v17 is not that and don't want to promise that we would get there.

Munro - The obvious alternative would be to just say use ICU more and perhaps make it the default too. We could provide a different provider that uses ICU code for things but then we would be using ICU's version.

Jeff - One big benefit would be putting the unicode versions in lock-step which we wouldn't be able to do if we're using external dependencies. Big thing I like about this proposal is being able to document all of this, including things like being able to document how to do regexps with different locales. The idea that this would be documentable is a huge benefit but we can't get that with any dependency.

Munro - Sounds great when you talk about just basic C type but when we start talking about taking this further then maybe we should just buy into ICU more.

Matthias - Recently ICU had a release where they changed behavior but didn't change the identifier.

Joe - Every other database has the option to have a built-in collation and locale support. PG is really the only one that doesn't.

Munro - All those other ones suck though and they're poorly understood.

Jeff - Concern raised about having a built-in root collation because that might blur the line between using ICU and using built-in provider. Agree with that and after thinking about it we are not likely to want the tailoring and localization as ICU is going to be better at that and would want to push people towards ICU. Trying to own all the issues with ICU seems like a lot of work.

Berg - What you might do is have an internal technical collation for when the database does sorting on its own, when it isn't asked for an ordering (like for GROUP BY); maybe always use the C locale for that?

Matthias - Using the GROUP BY with an ORDER BY might be much slower then

Stephen - But explicit idea is to only do this when not being asked for an ordering and no ORDER BY included.

Jeff - Similar case for indexes, if it's only for internal usage and not for other cases, but there was concern about teaching the planner how to do this and choose the right option. You'd have to decorate paths with knowledge of where we can do this and where we can't do this and we would add complexity and possibly bugs into the planner around this. This might be able to be overcome though and could provide some serious performance benefits. Another way to think about this is that we do something similar for hashing operations- we only use the specific functions if the collation is non-deterministic but for deterministic collations we just use generic hashing.
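The hashing analogy at the end can be sketched as follows; casefold() stands in here for a collation-aware normal form, and none of this is PostgreSQL's actual implementation.

```python
# Deterministic collation: strings compare equal only when their
# bytes are identical, so a generic byte hash is safe.
def hash_deterministic(s: str) -> int:
    return hash(s.encode("utf-8"))

# Nondeterministic collation (e.g. case-insensitive): values the
# collation treats as equal must hash the same, so the hash must go
# through the collation's normal form first.
def hash_case_insensitive(s: str) -> int:
    return hash(s.casefold().encode("utf-8"))
```

Equal-under-collation values then land in the same hash bucket even though their bytes differ.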

Matthias - Original proposal was to make primary key text indexes use this

Jeff - Wasn't exactly part of the proposal but the thread took off a bit. Without bringing up the thread specifically and such, there was a realization that we are accepting a lot of potential downsides with the risks associated with using a non-C collation for indexes and in other cases because when a collation changes then the index breaks. Also imposing a performance cost associated with building the index and also the cost of the index becoming corrupted due to changes. What I was trying to get across in the thread was to think about this cost/benefit question between these risks and costs which are imposed on all of these text indexes which may result in only very small benefits.

Joe - ICU is in some cases 10x faster than glibc, and there's also the issue that many, many people are using glibc.

Bruce - Yeah, you're bringing in the cost in terms of performance and the corruption risk for pretty limited benefit.

Jeff - Mostly for primary keys it's just an equality lookup and so this ends up being very impactful. There was a lot in the thread and we could also discuss later.

Munro - Attempting to predict what the user is going to do with the data. Maybe have a distinction between text and 'human text' or such.

Jeff - Not quite type or what the user is going to do with it..

Munro - User could say 'collate C'..

Jeff - Then you have a lot of extra typing to say COLLATE C for a lot of keys and then FKs, etc. All somewhat related at least. Very messy problem. ICU built by default in v16 which is a big step as libc is just really not good. More ICU would be great. One of the problems is that ICU doesn't support the C locale. Today we can't do away with libc support because a lot of people use C.UTF-8. With this proposal, people could use ICU or use the built-in provider and get rid of libc support.

Vondra - One of the main problems I had with the proposal is that it seemed like "if the user does not specify a locale, then we just change it to whatever is faster", which seems a bit strange. The reasonable expectation is that if the database is created with a specific locale, then everything will use that as the default. A lot of users specify it at the database level and expect everything in that database to use that locale. Would be ok with changing the implementation detail as long as it keeps the same result. What would be really surprising for users is changing to something faster that changes the ordering.

Jeff - Database-level collation must be honored and so we would not be changing that.

Vondra - So what I thought the problem was turns out to be a misunderstanding.

Jeff - Agreed that the discussion needs to be framed better to make it clear that this wasn't intended as impacting users in that way.

Matthias - Regarding new built-in locale - not sure we should use the C or C.UTF-8 as the name.

Peter E - Maybe use binary or something instead of C

Matthias - It'll exist everywhere and naming it as C or C.UTF-8 is not very user friendly.

Jeff - This capability isn't really accounted for in the SQL standard. Would be interesting to think about another way to specify the behavior of upper/lower. Could imagine one day using just the Unicode data files, with no dependency, for upper/lower, but also having support for things like the Greek difference. Regarding renaming: pretty much fine with whatever name we pick. The SQL standard specifies only one example, the German capital S ... If we do choose names then we should try to leave room for possible variants.

Peter E - I've now signed up as a reviewer...

CREATE SUBSCRIPTION ... SERVER


Jeff - This is already being discussed on the list and it is changing a bit in direction thanks to that discussion. No real dispute around this patch. In the spirit of this meeting, decided to throw this topic out there in case someone wanted to discuss it. Not very controversial. Some of the original patch got some feedback about going in a bit of a different direction with only minor core changes and that's actually the direction that it's going in.

Peter E - Proposed something similar before and therefore generally happy with it.

The Path to un-reverting the MAINTAIN privilege

Nathan - MAINTAIN privilege was originally slated for v16 but got reverted. Idea is to have it enable you to run VACUUM, CLUSTER, LOCK (maybe others), and then a predefined role would exist that could allow that across all objects inside a particular database. Reverted for v16 due to search_path trick concerns; functional indexes could possibly end up causing issues. Patch still applies cleanly. Big discussion is what to do about the search_path issue. A couple of options: reset the search_path to something 'safe' for all of the MAINTAIN sub-commands. This is already done for certain cases (like autovacuum). Downside is that function behavior might change in some cases; in practice, the autovacuum approach doesn't seem to have caused issues and maybe would be ok. Maybe only do this if you are not also the table owner, though. Another option: recommend that people set search_path within functions so that the function is always run with that search_path. That had performance issues, but work was done to deal with that performance problem. Last option is to just not do anything, but maybe document this. Restricting search_path for maintenance commands seems simple and sufficient.

Jeff - Setting search_path while executing maintenance commands is fine, but maybe have more explicit support and say how vacuum/autovacuum is doing this. Still seems pretty weird: the reason autovacuum does this is a bit of a hack to deal with the security issue. What was done then to solve the security problem was sensible, but it isn't really sensible overall and is kind of a hack. Our security model around this has evolved some and tried to deal with so many problems; having trouble getting a good idea of what to do next. Alternative: step back and ask what we should do about functions generally, what the search path should be, and how we should run them. Maybe we have a SEARCH clause for a function which would define the origin of the search path: from the user invoking the function, the system default, or the owner. That proposal didn't get a lot of traction though.

Magnus - Issue is with expression indexes?

Jeff - That's the most acute issue.

Peter E - Functions in expression indexes which depend on the search_path ...? Seems like a terrible thing. Maybe just prohibit it.

Jeff - Costly case is attaching search_path option to functions. MAINTAIN reverted because there might be a function that depends on the search path which would allow the table owner to gain access to the MAINTAIN user's account.

Peter E - We could make the function fail if it tries to use the search_path. Issue was that it's expensive to set the search_path though?

Jeff - Conclusion to make this safe was to set a search_path for their function but that was slow in v16. That was made a lot faster in v17 though and so that then becomes a more viable possible solution.

Magnus - If no explicit search_path set on the function and the function used in an index then just set that to pg_catalog? Users could set their own search_path on the function if they want to.
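The explicit per-function setting Magnus refers to already exists; a minimal sketch, with hypothetical function and table names:

```sql
-- A function used in an expression index, pinned to a fixed
-- search_path so its behavior no longer depends on the caller's
-- setting.  IMMUTABLE is required for use in an index.
CREATE FUNCTION norm_text(t text) RETURNS text
    IMMUTABLE
    SET search_path = pg_catalog
    LANGUAGE sql
    AS $$ SELECT lower(t) $$;

CREATE INDEX ON orders (norm_text(description));
```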

Munro - If there is not an explicit search_path set then maybe set it at CREATE FUNCTION time?

Peter E - That would have been good if we had done it originally, but it's too late now perhaps.

Stephen - The current situation already ends up with things breaking at run-time in some cases when the search_path changes under a function, so maybe we can just make this change.

Munro - Maybe we could have it as a policy

Magnus - Or have it be the default to be turned on (save search_path as part of CREATE FUNCTION), but then allow the option to turn that off perhaps as a GUC so it can be dealt with globally.

Berg - Issue is that saving the entire search_path may not work because it could include other schemas which don't have an object there today but that object might be added later.

Joe - The shadowing issue also exists due to coercion, not just an explicit function with the same signature in two schemas. Say a column in a table is varchar and you use the lower() function: the existing catalog function is lower(text), but if someone creates lower(varchar), then lower(varchar) will get used instead, and that's still an issue.
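Joe's coercion scenario can be sketched roughly like this (hypothetical names):

```sql
-- The column is varchar, so an unqualified lower() call resolves
-- by matching argument types across the whole search_path:
CREATE TABLE t (col varchar);
CREATE INDEX ON t (lower(col));   -- resolves to pg_catalog.lower(text)

-- Later, a lower(varchar) appears in a schema on the search_path
-- (e.g. public).  It matches varchar exactly, with no coercion,
-- so function resolution prefers it over pg_catalog.lower(text)
-- for subsequent calls on that column:
CREATE FUNCTION public.lower(v varchar) RETURNS text
    IMMUTABLE LANGUAGE sql
    AS $$ SELECT 'not what you expected' $$;
```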

Stephen - End thought is at runtime when an expression index is used in some way, if the function in the index does not explicitly set a search_path then forcibly set it to pg_catalog.

Berg - At CREATE INDEX time too

Magnus - Yes

Nathan - Seems about where it landed.

Berg - Might break things in practice, but should be very clear

Magnus - If it breaks things then it really needed to be fixed anyway.

Jeff - Right now, if you do an INSERT and there's an index expression, the function will execute with the caller's search_path which is a problem.

Moving Forward with Pending Patches

(Unknown: "I have encountered a situation where some of my patches have not been reviewed by others, preventing them from moving forward. I believe this is a common challenge faced by other developers as well. It would be great if we could engage in a discussion about this and potentially brainstorm ideas to improve it.").

Note from Daniel: "This was proposed by a hacker who was unable to attend".

Alvaro - We shouldn't move patches out of the CF just because they are being ignored.

Dave P - If there's no activity when assigned to author then boot them

Alvaro - Sure that's fine, but we shouldn't boot out patches just because they haven't been responded to. Can be very de-motivating which is not helpful.

Vik - Rule of 'review patch of equal size' when submitting patches should perhaps be enforced somehow. Maybe add something to the CF app to do it?

Dave P - How would you practically enforce that?

Peter E - Hard to keep up with even just updating the app with all of the state changes. Adding on analysis of who has reviewed what would make it even worse to keep it updated. Would take more time away from actually doing patch review.

Vik - Some folks say they applied the patch and all tests ran, and count that as a review, which it is, but it still needs more review.

Dave C - Would like to review but not sure how to. Maybe something at PGConf.Dev will help with this?

Matthias - There will be a workshop at PGConf.Dev about how to make patches more committable and hopefully also about reviewing patches.

Dave C - A lot of unspoken rules ...

Matthias - On the list with hundreds of messages a day ...

Dave P - Trying to force people to do patch review will result in minimal reviews, which wouldn't end up being helpful.

[many] Maybe not helpful to ask people to just download and apply the patch and run it these days because cfbot is already doing that. Would be helpful to have people check if there are tests for the new feature or the change, or if there is documentation ...

Vondra - If the patch author does not explain what the patch does, then few people are able to do a review; a junior person would struggle with the patch. The way to deal with the reviewer bandwidth problem is to help reviewers do reviews and get them proficient at it, and this is part of the point of the workshop being done in Vancouver at PGConf.Dev. Want to explain the overall process, how it works, and what tools are available for it. It's only about an hour of talking, so it can't go too deep into what you can do in reviews, but it will give an overview of reviewing and what works and what doesn't. The only long-term solution is to increase the number of people who can do meaningful review.

Dave P - Need to also set expectations and need to make sure people don't think that if they do a review then they'd definitely get a review themselves.

Jeff - Can someone do something to get a patch closer to the point where a committer would be more likely to pick it up and feel comfortable moving forward with it? Minor cleanup isn't likely to help very much, but highlighting an unresolved question on the mailing list would be helpful.

Vik - There are meaningful partial reviews. Have reviewed patches on the SQL level, but don't look at the code, so signing off may not be ideal because the code also needs to be reviewed.

Jeff - Committer looking at a SQL standard procedure would definitely be helped by a SQL person who has reviewed it and then checked the code.

Joe - If the goal is to get more people to do reviews, would it make sense to have a way to recognize reviewers in some official way?

Berg - If the CF app had a way to take notes on a patch?

Magnus - There is an annotate feature which will generate an email.

Matthias - It isn't used very much.

Berg - What is missing is a free text field?

Stephen - Use the wiki?

Dave P - Issue might be using mailing lists instead of other tools ...

Berg - Perhaps a one-line summary of what state the patch is actually in

Vondra - Perhaps, for the summary, a specific email could be highlighted that indicates what the current state is

Dave C - Who writes the summary though? A non-experienced reviewer likely wouldn't be able to write the summary.

Vondra - The patch author should be the one providing the summary, or maybe someone experienced who really reviews the patch. Then provide a status of the patch. If things are left out then someone would need to point that out, of course.

Recognizing New Contributors

(Unknown: "Many active developers, including myself, desire to be listed as a Contributor, but just do not know how. This lack of clarity can be confusing. I'm wondering if it's possible to have a discussion on how to effectively recognize and acknowledge new contributors.").

Note from Daniel: "This was proposed by that same person, and since he isn't able to attend I don't think it's fair to identify him (I did promise I'd bring this up anonymously to put focus on the issue and not the person). This is a person who I can vouch for being very active, prolific enough to show up on Roberts recent "Who contributed to.." blogpost."

Peter E - Probably should explain on the contributors page

Vondra - Perhaps have a dedicated page for it that talks about the contributors committee, etc. Do not think that having rigid rules would make sense, informal is probably better, but we should explain in general to allow people to have some visibility.

Dave P - There is a page on the wiki but it's pretty light-weight.

Nathan - The committers page may also have some relevant information.

Dave P - Contributors committee should propose a patch to pgweb to make it a proper policy.

Vondra - Not too prescriptive but at least outline the process and explain for new people to have an understanding of how it works.

Dave P - Sponsorship has a policy around this and might be a good model to use.

Library dependency situation

Munro - Didn't know enough about it at first when stepping up to work on it, but then realized it was almost impossible that all of the LLVM versions would work, because it wasn't even possible to get them all, there are so many different versions; tried to come up with a way to get rid of old versions so we don't have to continue supporting them. Wanted a system for deciding which versions to support. Daniel mentioned that this is a general problem and not specific to LLVM: we don't have a way to decide which versions of various libraries we actually support. Having a machine in the buildfarm may be relevant, but that is kind of the tail wagging the dog; maybe we shouldn't just support what the buildfarm supports, the buildfarm should check what matters. Big Linux distributions have opinions on what should be supported, but everything else doesn't really have an organization behind it pushing for support of specific things. Rules proposed for LLVM: take all the Linux distros in the buildfarm as the distributions that are interesting, then check the published EOL dates for those operating systems. If the OS isn't supported anymore, then we don't care about supporting it in newer versions of PG. Following through on that process with LLVM, and looking at possibly doing the same for other libraries and dependencies. Just wanted to rant generally, raise the question, and hopefully get to a point of having a policy so we can drop things with little discussion instead of debating each one.

Dave P - Similar issues with pgAdmin, and it has come down a lot to what version of Python is being used; have had to get very strict on it. Lots of difference between LLVM and other things, though. RedHat will lock to a specific version of OpenSSL, for example, and back-port fixes instead. They do not consider LLVM to be the same and will happily bump up the version of LLVM, which has been a problem.

Peter E - Is LLVM in the base system?

Devrim - Yes, it is.

Dave P - The RPMs will be working towards fixing to a specific LLVM version.

Devrim - Yes, work is ongoing for that.

Munro - LLVM doesn't really have old versions, they just cut a new version every 6 months or so.

Dave P - Sometimes upstream doesn't support older things in terms of libraries. Also with pl/python, the ecosystem will drop support for something like a crypto library, and you won't be able to install the latest version because you're using Python 3.6 or something.

Peter E - Tried to put a chart together on the wiki of what requirements PG has vs. which OSes support what. Even figuring out what PG supports isn't easy. Going to continue trying to put together a wiki page covering which versions of PG support which versions of which dependencies.

Munro - Tried to build a database of this by scraping from the buildfarm and the OS vendor pages.

Peter E - Not always guaranteed that will catch things, try to look into the RPM download directories

Dave P - buildfarm animals sometimes pick up things like homebrew and end up with things different from what's on the OS.

Munro - Seems like those can probably just be ignored generally, or at least they don't seem to be as much of a concern.

Dave P - Some of those things will just pull in whatever the latest supported version is from upstream. One issue with that, though, is that sometimes they do lag behind because a recipe hasn't been updated for a new release.

Munro - Things like homebrew are likely to have more-or-less current or new things.

Dave C - buildfarm maybe isn't representative of what people are running

Munro - People who don't have buildfarm animals for things they care about should probably add a new buildfarm animal.

Dave P - Good point was made to not let the buildfarm drive what we support

Munro - Yes, was able to remove a lot of code when we kicked some things out

Peter E - Need to make sure that when we talk about support we mean the 'normal' support period and not the super extended support

Berg - Need to distinguish support for new major versions vs. back-branch versions of PG.

Munro - We do not want to break support for back-branches for someone using older versions, but that should all generally be fine, because those libraries won't be getting broken on those older systems either, and we only provide support for five years for our major versions.

Jeff - Trying to figure out why we would need to support building new versions of PG on really old systems, don't think we really need that.

Dave P - Right, that is basically what we are doing for pgAdmin, which has a list of OS's with what versions of pgAdmin are supported or expected to work on those OS versions.

Page Features / Reserve space on page

TDE / IVs / More

v17 Patch Triage

FOSDEM/PGDay 2024 Developer Meeting Patch Review

Any other business & close