FOSDEM/PGDay 2024 Developer Meeting Patch Review

From PostgreSQL wiki
Jump to navigationJump to search

FOSDEM 2024 Developer Meeting Patch Review

Clients

functions to compute size of schemas/AMs (and maybe \dn++ and \dA++)

Stephen - Seems like a pretty useful thing

Peter E - Seems like there are a lot of details that need to be considered

Add non-blocking version of PQcancel

Alvaro - Working on it

Miscellaneous

Function to log backtrace of postgres processes

Peter E - Signal handling implications, raises a concern

Alvaro - Maybe you won't get the backtrace that you want in the case of a signal handler

Vondra - Process gets a signal and then logs a backtrace during CFI

Peter E - Not sure what this is useful for? In a production system you probably wouldn't end up getting much useful. In development, can just use a debugger.

Jeff - Would have found it useful in some cases.

Alvaro - Sometimes we have to ask users to get a backtrace and that is dangerous. This would be much safer.

Vondra - If the CFI is very far away then it may not be very useful really.

Peter E - Doesn't look like many committers have really commented on it. If someone wants to pick it up if someone is interested?

(crickets)

Bruce - Seems like it just may not be useful. Even if nothing is wrong with it we may not want it.

Munro - Does seem like people are interested enough in it and perhaps it is useful.

Monitoring & Control

pg_stat_statements and "IN" conditions

Vondra - Desire is to normalize the values in the IN condition. Currently different number of IN parameters end up as different queries.

Matthias - Concern is that it's a change in behavior, particularly with enum values.

Alvaro - Just need a way to turn it off, which looks like there is an option for

Peter E - Looks like it's pretty straight-forward

Nathan - Earlier versions were trying to group by the the number of options which seemed very odd

Vondra - Seems very arbitrary

Alvaro - Maybe but could be useful

Vondra - Maybe do the simple thing

Nathan - Agreed on doing the simple thing and then maybe it can get in

Add foreign-server health checks infrastructure

Stephen - Seems like a useful thing

Alvaro - Agreed

Munro - This is based on the option where the server will check if the socket has gone away, so that the query could be killed early if the client has gone away. This does the same thing for checking foreign server connections. The way it was done for client connections is more portable than the way it's being done in this patch. This patch should probably adjust to how the client check is done so that it's more portable.

Performance

More scalable multixacts buffers and locking

Alvaro- Will take this

Parallelize correlated subqueries that execute within each worker

Peter E - Seems currently active.

Vondra - Think this makes sense. It's possible to run these in each worker independently. Instead of disabling parallelism entirely.

Munro - Bit like a nested loop join

Vondra - Author is pretty good and if they need help then they will ask

In-place persistence change of a relation (fast ALTER TABLE ... SET LOGGED with wal_level=minimal)

Nathan - Some concerns around the 'undo log' addition

Vondra - Was really just undo for one operation

Nathan - Doesn't seem likely to be feasible for v17 though

Vondra - A lot of other corner-cases with minimal WAL level

AcquireExecutorLocks() and run-time pruning

Peter E - Basically waiting for review from Tom Lane

Speed up releasing of locks

Peter E - Folks involved can probably figure it out on their own

nbtree performance improvements through specialization on key shape

Matthias - What is remaining is a big patch to move functions around and then several patches which implement the specialization which are not very large in size. Apart from the first patch (240kb) which is code moving around, the patches are relatively small (23kb). Fairly straight-forward in terms of what it does. Only argument against it is that it adds binary size which might be a concern. Seems worth the effort because the performance is improved. Most interesting optimization have been split out into other patches which are more complex. Moves several functions into their own C files. Would like someone to review it in detail. Think it could maybe be done for v17 assuming there are not fundamental issues.

ALTER TABLE and CLUSTER fail to use a BulkInsertState for toast tables

Drouvot - Will look at it again.

Procedural Languages

session variables, LET command

Skip

Replication & Recovery

Switching XLog source from archive to streaming when primary available

Nathan - Will look at it.

pg_rewind WAL deletion pitfall

Stephen - Last comment seems to indicate it isn't really being tested by CFbot? Maybe respond commenting on if it actually is or isn't.

Server Features

ALTER TABLE SET ACCESS METHOD on partitioned tables

Stephen - More in-depth commit message would be helpful - what does it do?

Peter E - Probably can't specify access method on partitioned table today. Maybe this would set it so that it would then be inheirited, which seems like it's probably a reasonable thing.

Vik - When applied to a partitioned table, new partitions will use this

Peter E - Will take

Patch to implement missing join selectivity estimation for range types

Jeff - Might make it if given some help

Vondra - Concern on that patch is that the selectivity on range types will be better than on range conditions which seems bizarre. Not a good reason to reject it though.

logical decoding and replication of sequences, take 2

Vondra - Lots of discussion about this at pgconf.eu. People are asking if this should be built in to logical replication. You would never use sequences for primary/primary, but for things like failover it makes sense to have this. Alternative solution would be to have a separate tool to handle this. Consensus was that the easier to use the better and easier is to have it be built-in. Desirable to have it be built-in. Performance impact was a concern and this makes logical decoding do more work. In some corner cases like super heavy usage of sequences then this could have a 10x increase in decoding time. Tried to solve by adding a special flag that disables it and avoids the overhead of it. If you need to decode and apply sequences then you can set a flag during subscription creation. When you know you are creating a subscription for upgrade, for example, then you would enable the option and then it would handle it. Doing more work requires time. Question of correctness also raised because sequences are a bit annoying because they are not transactional. When you get things from a sequence, it's immediately visible to other sessions. There's some complexity to deal with that and in thinking about it. During the decoding you have one timeline for the original database and then you have a timeline of the decoding which may happen later and then you have the final timeline in which the changes are applied. This means it isn't the same order as the original and when sequences are not transactional this is causing issues. Some questions about if the way that the sequences are decoded is correct or not. There are maybe some bugs but if there are the question is if they can be fixed or if they are fundamental.

Jeff - Is there consensus around what correct is in this context?

Vondra - In principle, the same thing applies to logical replication in general- it should be the same as the commit order on the source database. For example, you shouldn't see a sequence going backwards. You shouldn't see an older version of the sequence.

Jeff - Would you potentially see spurious updates that didn't make it to the primary? Would that be considered incorrect?

Vondra - Don't see how you could get that and if it didn't make it into the transaction log then it didn't happen.

Jeff - Perhaps we can discuss later

Vondra - Happy to do that. Patch implementation has changed a few times. Eventually went back to the original simple decoding. Not sure how to improve that. Understand the concerns that have been raised on the list around the complexity of the interactions of these different timelines. Changes can happen in different orderings. In principle it might be wrong but that's very hard to prove. Not intent to just commit something to commit it and have to deal with bugs, want to discuss and try to solve them ahead of time.

Add the ability to limit the amount of memory that can be allocated to backends.

Stephen - This continues to be worked on and there is recent activity on it.

Multi-version ICU

Munro - Not going to be in v17

SQL Commands

Add SPLIT PARTITION/MERGE PARTITIONS commands

Peter E - Looks like quite big patches.

Nathan - Maybe need to suggest breaking into smaller patches, if possible.

Peter E - Perhaps, though looks like may be a fairly complete patch

Matthias - Was only for certain kinds of partitions? Don't think it covered RANGE partitions? Maybe because it is complex and hash partition splits are arguably easier

Peter E - Maybe because what people want to do this for are hash partitions

Vik - Looked at the changes and it didn't look that complicated on a very quick look

MERGE ... RETURNING

Jeff - Wanted to bring this up. There are some specifics about which of the keyword lists these keywords are included in. Would be nice to have someone look at them. Would like a sanity check as to which keyword list these keywords are being added to.

Vik - Haven't seen these keywords specifically. The only place this could be used in a RETURNING clause and so there is no risk of conflict with that...?

Jeff - Sort of ... There are some questions about the details here. Being held up on linguistic issues. There is a current patch which has answers for those details. Not 100% confident that those are correct. Feels better implementation wise.

System Administration

warn if GUC set to an invalid shared library

Berg - Added as reviewer

Testing

abidiff tests

Peter E - Lots of interest in having some way to assess if the backend ABI was changed as sometimes that happens by accident. abidiff is a Linux tool which checks if the ABI has changed. Various levels that can be done with it. Wondering if there is interest in this?

Stephen - Seems like it would be good to have

Berg - Packagers would probably be helped by this too

Peter E - Ideally this is something which would be caught before a release goes out with an unintentional change. Lots of options and you can write an exclude file to exclude things which are decided is acceptable.

Berg - Debian is tracking functions exported by libraries

Peter E - This also checks libraries. First patch just adds simple targets to generate XML files. Second patch adds a test suite to add it to CI. Some question about if this is stable enough across different platforms.