SerializableToDo

From PostgreSQL wiki

A list of smaller ideas for improvements to SERIALIZABLE:

  • Does TABLESAMPLE really need to predicate-lock whole relations? Discussion
  • Does index-only-scan really need to predicate-lock whole pages? Discussion Discussion
  • Can exclusion constraint violations be made to check for SSI failures, like we did for unique constraint violations? Discussion
  • Ugh. The unique constraint check doesn't even work correctly for tables with multiple btree indexes. Bug report

TL;DR for someone interested in hacking on SERIALIZABLE and looking for a small project with real-world applications: it would be really nice if SSI failures took priority over unique and exclusion constraint violations, because that is a practical requirement for automatic transaction retry. For example, imagine the client puts the logic to run a transaction into a function (lambda, code block, whatever) that will be re-run automatically inside a new transaction by something on the client side (e.g. Python/Java/Ruby/whatever transaction support libraries) that recognises retryable error conditions (SSI failures, deadlocks, ...). Unique constraint violations, by contrast, should not be retried, so when one is reported in place of an SSI failure the retry logic cannot tell that the transaction was worth retrying. The current arrangement makes SSI retry less automatable and thus less usable. The first step would be to get some "isolation tests" going that initially show the current behaviour (e.g. new files under src/test/isolation/specs; a basic example was posted in the multi-btree thread, and we also need one for exclusion constraints), and then figure out how to get the desired behaviour working (perhaps with some of the ideas from the threads above).
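To make the retry requirement concrete, here is a minimal sketch of the kind of client-side retry wrapper described above. The SQLSTATE values are PostgreSQL's documented codes (40001 serialization_failure, 40P01 deadlock_detected, 23505 unique_violation, 23P01 exclusion_violation); everything else is an assumption made for illustration: `DatabaseError`, `FakeTx` and `run_with_retry` are hypothetical names, not part of any real driver, and a real library would plug in its own exception type and transaction object.

```python
import time

# SQLSTATEs that a generic retry loop can safely treat as "run it again".
RETRYABLE_SQLSTATES = {
    "40001",  # serialization_failure (an SSI failure)
    "40P01",  # deadlock_detected
}
# 23505 (unique_violation) and 23P01 (exclusion_violation) are deliberately
# absent: a generic loop has no way to know whether they hide an SSI failure.


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries a SQLSTATE."""

    def __init__(self, sqlstate, message=""):
        super().__init__(message or sqlstate)
        self.sqlstate = sqlstate


class FakeTx:
    """In-memory stand-in for a real transaction, used only for the demo."""

    def commit(self):
        pass

    def rollback(self):
        pass


def run_with_retry(begin, body, max_attempts=5, backoff=0.0):
    """Run body(tx) in a fresh transaction, retrying on retryable errors.

    begin is a hypothetical factory returning an object with commit() and
    rollback(); body holds the application's transaction logic.
    """
    for attempt in range(1, max_attempts + 1):
        tx = begin()
        try:
            result = body(tx)
            tx.commit()  # a serialization failure can also surface here
            return result
        except DatabaseError as exc:
            tx.rollback()
            if exc.sqlstate not in RETRYABLE_SQLSTATES or attempt == max_attempts:
                raise  # not retryable, or out of attempts
            time.sleep(backoff * attempt)


# Demo: the body fails twice with an SSI failure, then succeeds.
attempts = []


def transfer(tx):
    attempts.append(1)
    if len(attempts) < 3:
        raise DatabaseError("40001")
    return "done"


print(run_with_retry(FakeTx, transfer))  # prints "done" on the third attempt
```

If the server reports 23505 where an SSI failure was the underlying cause, this loop re-raises immediately, which is exactly the loss of automation the paragraph above complains about.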

Bigger ideas to investigate:

  • SSI didn't turn out to scale very well past a few cores, due to contention on a couple of LWLocks. Redesign locking protocol, find a better way to clean up finished transactions that isn't so contended? This "garbage collection" is currently serialized and performs terribly.
  • Can we use hardware transactional memory to implement SSI efficiently? Early stage experimental patch (this is wild pie-in-the-sky stuff)
  • Can we allow SERIALIZABLE READ ONLY DEFERRABLE transactions on streaming replica nodes? Early stage experimental patch Thomas's blog on the topic
  • Can we allow scan order to be magically changed (conceptually like SKIP LOCKED, but you wouldn't actually say that) to allow queue-like access patterns to minimise conflicts, to fix the famous worst case for optimistic concurrency control schemes like SSI? Thomas's blog on the topic
  • Can we export SSI conflict graphs over postgres_fdw? Suppose we have 2PC-based multi-server transactions, as proposed; then there could be a system to merge all of the conflict graphs from remote nodes into the lead server's transaction set. Transactions could be rejected 'locally' in cases where individual nodes can prove they are not serializable, and global dangerous cycles could be detected after merging the graphs. This requires a way to label SERIALIZABLEXACTs (probably with the xid on the lead server). Then we could have federated SERIALIZABLE transactions (but only when everyone goes through the same head node).
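As a rough illustration of the graph-merging idea in the last item, the sketch below treats each node's conflict graph as a set of rw-conflict edges labelled with lead-server xids, merges them, and looks for "pivot" transactions that have both an incoming and an outgoing rw-conflict. This is a toy model under stated assumptions: the edge representation and the function names `merge_graphs` and `find_pivots` are hypothetical, and the check is more conservative than a real implementation, which would also take commit ordering into account.

```python
def merge_graphs(node_graphs):
    """Union the rw-conflict edges reported by every node.

    node_graphs maps a node name to a set of (reader_xid, writer_xid) edges,
    where the xids are assumed to be the lead server's labels.
    """
    merged = set()
    for edges in node_graphs.values():
        merged |= edges
    return merged


def find_pivots(edges):
    """Return xids with both an incoming and an outgoing rw-conflict."""
    has_in = {writer for (_, writer) in edges}   # someone read what it wrote over
    has_out = {reader for (reader, _) in edges}  # it read what someone wrote over
    return has_in & has_out


# Two nodes, each seeing only half of the dangerous structure 101 -> 102 -> 103.
node_a = {(101, 102)}
node_b = {(102, 103)}

print(find_pivots(node_a))  # set(): node A alone cannot reject anything
print(find_pivots(node_b))  # set(): neither can node B
print(find_pivots(merge_graphs({"A": node_a, "B": node_b})))  # {102}
```

In this model a node that finds a pivot in its own edge set could reject locally, while structures that span nodes only show up after the merge on the lead server.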