https://wiki.postgresql.org/api.php?action=feedcontributions&user=Drkp&feedformat=atomPostgreSQL wiki - User contributions [en]2024-03-29T13:23:09ZUser contributionsMediaWiki 1.35.13https://wiki.postgresql.org/index.php?title=Serializable&diff=18881Serializable2013-01-23T00:10:20Z<p>Drkp: add links to VLDB 2012 paper and slides</p>
<hr />
<div>Information about the SSI implementation for the SERIALIZABLE transaction isolation level in PostgreSQL, new in release 9.1.<br />
<br />
== Overview ==<br />
<br />
With true serializable transactions, if you can show that your transaction will do the right thing if there are no concurrent transactions, it will do the right thing in any mix of serializable transactions or be rolled back with a serialization failure.<br />
<br />
This document is oriented toward the techniques used to implement the feature in PostgreSQL. For information oriented toward application programmers and database administrators, see the [[SSI]] Wiki page. Another description of SSI and how PostgreSQL's implementation differs from previous work is available in our paper [http://drkp.net/papers/ssi-vldb12.pdf Serializable Snapshot Isolation in PostgreSQL] from VLDB 2012[[#VLDB2012|<sup><nowiki>[5]</nowiki></sup>]].<br />
<br />
=== Serializable and Snapshot Transaction Isolation Levels ===<br />
<br />
Serializable transaction isolation is attractive for shops with active development by many programmers against a complex schema because it guarantees data integrity with very little staff time -- if a transaction can be shown to always do the right thing when it is run alone (before or after any other transaction), it will always do the right thing in any mix of concurrent serializable transactions. Where conflicts with other transactions would result in an inconsistent state within the database or an inconsistent view of the data, a serializable transaction will block or roll back to prevent the anomaly. The SQL standard provides a specific SQLSTATE for errors generated when a transaction rolls back for this reason, so that transactions can be retried automatically.<br />
<br />
Before version 9.1, PostgreSQL did not support a full serializable isolation level. A request for serializable transaction isolation actually provided snapshot isolation. This has well known anomalies which can allow data corruption or inconsistent views of the data during concurrent transactions; although these anomalies only occur when certain patterns of read-write dependencies exist within a set of concurrent transactions. Where these patterns exist, the anomalies can be prevented by introducing conflicts through explicitly programmed locks or otherwise unnecessary writes to the database. Snapshot isolation is popular because performance is better than serializable isolation and the integrity guarantees which it does provide allow anomalies to be avoided or managed with reasonable effort in many environments.<br />
<br />
<br />
[[Image:Serialization-Anomalies-in-Snapshot-Isolation.png|600px|center]]<br />
<br />
=== Serializable Isolation Implementation Strategies ===<br />
<br />
Techniques for implementing full serializable isolation have been published and in use in many database products for decades. The primary technique which has been used is Strict Two-Phase Locking (S2PL), which operates by blocking writes against data which has been read by concurrent transactions and blocking any access (read or write) against data which has been written by concurrent transactions. A cycle in a graph of blocking indicates a deadlock, requiring a rollback. Blocking and deadlocks under S2PL in high contention workloads can be debilitating, crippling throughput and response time.<br />
<br />
A new technique for implementing full serializable isolation in an MVCC database appears in the literature beginning in 2008[[#CahillEtAl2008|<sup><nowiki>[1]</nowiki></sup>]][[#Cahill2009|<sup><nowiki>[2]</nowiki></sup>]]. This technique, known as Serializable Snapshot Isolation (SSI) has many of the advantages of snapshot isolation. In particular, reads don't block anything and writes don't block reads. Essentially, it runs snapshot isolation but monitors the read-write conflicts between transactions to identify dangerous structures in the transaction graph which indicate that a set of concurrent transactions might produce an anomaly, and rolls back transactions to ensure that no anomalies occur. It will produce some false positives (where a transaction is rolled back even though there would not have been an anomaly), but will never let an anomaly occur. In the two known prototype implementations, performance for many workloads (even with the need to restart transactions which are rolled back) is very close to snapshot isolation and generally far better than an S2PL implementation.<br />
<br />
=== Apparent Serial Order of Execution ===<br />
<br />
One way to understand when snapshot anomalies can occur, and to visualize the difference between the serializable implementations described above, is to consider that among transactions executing at the serializable transaction isolation level, the results are required to be consistent with ''some'' serial (one-at-a-time) execution of the transactions[[#SQL92|<sup><nowiki>[4]</nowiki></sup>]]. How is that order determined in each?<br />
<br />
In S2PL, each transaction locks any data it accesses. It holds the locks until committing, preventing other transactions from making conflicting accesses to the same data in the interim. Some transactions may have to be rolled back to prevent deadlock. But successful transactions can always be viewed as having occurred sequentially, in the order they committed.<br />
<br />
With snapshot isolation, reads never block writes, nor vice versa, so more concurrency is possible. The order in which transactions appear to have executed is determined by something more subtle than in S2PL: read/write dependencies. If a transaction reads data, it appears to execute after the transaction that wrote the data it is reading. Similarly, if it updates data, it appears to execute after the transaction that wrote the previous version. These dependencies, which we call "wr-dependencies" and "ww-dependencies", are consistent with the commit order, because the first transaction must have committed before the second starts. However, there can also be dependencies between two *concurrent* transactions, i.e. where one was running when the other acquired its snapshot. These "rw-conflicts" occur when one transaction attempts to read data which is not visible to it because the transaction which wrote it (or will later write it) is concurrent. The reading transaction appears to have executed first, regardless of the actual sequence of transaction starts or commits, because it sees a database state prior to that in which the other transaction leaves it.<br />
<br />
Anomalies occur when a cycle is created in the graph of dependencies: when a dependency or series of dependencies causes transaction A to appear to have executed before transaction B, but another series of dependencies causes B to appear before A. If that's the case, then the results can't be consistent with any serial execution of the transactions.<br />
<br />
=== SSI Algorithm ===<br />
<br />
Serializable transaction in PostgreSQL are implemented using<br />
Serializable Snapshot Isolation (SSI), based on the work of Cahill,<br />
et al. Fundamentally, this allows snapshot isolation to run as it<br />
has, while monitoring for conditions which could create a serialization<br />
anomaly. <br />
<br />
SSI is based on the observation[[#Cahill2009|<sup><nowiki>[2]</nowiki></sup>]] that each snapshot isolation<br />
anomaly corresponds to a cycle that contains a "dangerous structure"<br />
of two adjacent rw-conflict edges:<br />
<br />
::T<sub>in</sub> ----<sub>''rw''</sub>---> T<sub>pivot</sub> ----<sub>''rw''</sub>---> T<sub>out</sub><br />
<br />
SSI works by watching for this dangerous structure, and rolling back a transaction when needed to prevent any anomaly. This means it only needs to track rw-conflicts between concurrent transactions, not wr- and ww-dependencies. It also means there is a risk of false positives, because not every dangerous structure corresponds to an actual serialization failure.<br />
<br />
The PostgreSQL implementation uses two additional optimizations:<br />
<br />
# T<sub>out</sub> must commit before any other transaction in the cycle (see proof of Theorem 2.1 of [[#Cahill2009|<nowiki>[2]</nowiki>]]). We only roll back a transaction if T<sub>out</sub> commits before T<sub>pivot</sub> and T<sub>in</sub>.<br />
# if T<sub>in</sub> is read-only, there can only be an anomaly if T<sub>out</sub> committed before T<sub>in</sub> takes its snapshot. This optimization is an original one. Proof:<br />
#* Because there is a cycle, there must be some transaction T<sub>0</sub> that precedes T<sub>in</sub> in the serial order. (T<sub>0</sub> might be the same as T<sub>out</sub>).<br />
#* The dependency between T<sub>0</sub> and T<sub>in</sub> can't be a rw-conflict, because T<sub>in</sub> was read-only, so it must be a ww- or wr-dependency. Those can only occur if T<sub>0</sub> committed before T<sub>in</sub> started.<br />
#* Because T<sub>out</sub> must commit before any other transaction in the cycle, it must commit before T<sub>0</sub> commits -- and thus before T<sub>in</sub> starts.<br />
<br />
=== PostgreSQL Implementation ===<br />
<br />
Notable aspects of the PostgreSQL implementation of SSI include:<br />
<br />
* Since this technique is based on Snapshot Isolation (SI), those areas in PostgreSQL which don't use SI can't be brought under SSI. This includes system tables, temporary tables, sequences, hint bit rewrites, etc. SSI can not eliminate existing anomalies in these areas.<br />
* Any transaction which is run at a transaction isolation level other than SERIALIZABLE will not be affected by SSI. If you want to enforce business rules through SSI, all transactions should be run at the SERIALIZABLE transaction isolation level, and that should probably be set as the default.<br />
* If all transactions are run at the SERIALIZABLE transaction isolation level, business rules can be enforced in triggers or application code without ever having a need to acquire an explicit lock or to use SELECT FOR SHARE or SELECT FOR UPDATE.<br />
* Those who want to continue to use snapshot isolation without the additional protections of SSI (and the associated costs of enforcing those protections), can use the REPEATABLE READ transaction isolation level. This level retains its legacy behavior, which is identical to the old SERIALIZABLE implementation and fully consistent with the standard's requirements for the REPEATABLE READ transaction isolation level.<br />
* Performance under this SSI implementation will be significantly improved if transactions which don't modify permanent tables are declared to be READ ONLY before they begin reading data.<br />
* Performance under SSI will tend to degrade more rapidly with a large number of active database transactions than under less strict isolation levels. Limiting the number of active transactions through use of a connection pool or similar techniques may be necessary to maintain good performance.<br />
* Any transaction which must be rolled back to prevent serialization anomalies will fail with SQLSTATE 40001, which has a standard meaning of "serialization failure".<br />
* This SSI implementation makes an effort to choose the transaction to be cancelled such that an immediate retry of the transaction can not fail due to conflicts with exactly the same transactions. Pursuant to this goal, no transaction is cancelled until one of the other transactions in the set of conflicts which could generate an anomaly has successfully committed. This is conceptually similar to how write conflicts are handled.<br />
* Modifying a heap tuple creates a rw-conflict with any transaction that holds a SIREAD lock on that tuple, or on the page or relation that contains it.<br />
* Inserting a new tuple creates a rw-conflict with any transaction holding a SIREAD lock on the entire relation. It doesn't conflict with page-level locks, because page-level locks are only used to aggregate tuple locks. Unlike index page locks, they don't lock "gaps" on the page.<br />
<br />
== Current Status ==<br />
<br />
'''Accepted as a feature for PostgreSQL 9.1!'''<br />
<br />
Many thanks to Joe, Heikki, Jeff, and Anssi for posing questions and making suggestions which have led to improvements in the patch! Thanks to Markus for providing dtester at a critical juncture, which allowed progress to continue, and Heikki for developing the src/test/isolation code to move the dcheck tests into the main PostgreSQL testing framework. Also, thanks to the many who have participated in discussions along the way.<br />
<br />
There are some features which should be considered for 9.2 once 9.1 is settled down; most notably integration with hot standby and fine-grained support for index AMs other than btree. Most other proposed work is related to possible performance improvements, which should each be carefully benchmarked before being accepted. At the top of that list is better optimization of ''de facto'' read only transactions -- those which aren't flagged as read only, but which don't actually do any writes to permanent database tables.<br />
<br />
== Development Path ==<br />
<br />
In general, the approach taken was to try for the fastest possible implementation of a serializable isolation level which allowed no anomalies, even though it had many false positives and very poor performance, and then optimize until the rollback rate and overall performance were within a range which allows practical application. No existing isolation level was removed, since not everyone will want to pay the performance price for true serializable behavior. An important goal was that for those not using serializable transaction isolation, the patch doesn't cause performance regression.<br />
<br />
=== Credits ===<br />
<br />
'''Feature Authors''': [[User:Kgrittn|<span title="different title">Kevin Grittner</span>]] and [http://drkp.net/ Dan R. K. Ports].<br />
<br />
'''Testing Support Authors''': Markus Wanner (dtester used during most of development) and Heikki Linnakangas (testing support consistent with other PostgreSQL regression testing, so that we had a testing suite suitable for commit).<br />
<br />
'''Reviewers''': Joe Conway (warning elimination, bug chasing, and style comments), Jeff Davis (general review and found problems with GiST support and lack of 2PC support), Anssi Kääriäinen (found problems with conditional indexes and performance issue with sequential scans during testing with production data), YAMAMOTO Takashi (found numerous bugs during long and heavy testing), and Heikki Linnakangas (general review and many useful observations and suggestions, plus general improvements during commit process).<br />
<br />
'''Committers''': Joe Conway (initial comment and name changes), Heikki Linnakangas (the bulk of the patch and most follow-up fixes), and Robert Haas (some follow-up fixes).<br />
<br />
'''Thanks''' to all those who participated in the on-list discussions and offered advice and support off-list. There were so many who contributed in this way it would be practically impossible to generate an accurate list, but Robert Haas stands out for offering great advice on an overall development strategy.<br />
<br />
'''Special thanks''' to Emmanuel Cecchet for pointing out the ACM SIGMOD paper in which this technique was originally published[[#CahillEtAl2008|<sup><nowiki>[1]</nowiki></sup>]], and to all those at the University of Sidney who contributed to the development of this innovative technique. This is what turned the discussion from wrangling over how best to document existing behavior toward changing it.<br />
<br />
=== Source Code Management ===<br />
<br />
A "serializable" git branch has been set up at this location:<br />
<br />
git://git.postgresql.org/git/users/kgrittn/postgres.git<br />
<br />
http://git.postgresql.org/git/users/kgrittn/postgres.git<br />
<br />
ssh://git@git.postgresql.org/users/kgrittn/postgres.git<br />
<br />
http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=shortlog;h=refs/heads/serializable<br />
<br />
=== Predicate Locking ===<br />
<br />
Both S2PL and SSI require some form of predicate locking to handle situations where reads conflict with later inserts or with later updates which move data into the selected range. PostgreSQL didn't have predicate locking, so it needed to be added. Practical implementations of predicate locking generally involve acquiring locks against data as it is accessed, using multiple granularities (tuple, page, table, etc.) with escalation as needed to keep the lock count to a number which can be tracked within RAM structures. Coarse granularities can cause some false positive indications of conflict. The number of false positives can be influenced by plan choice.<br />
<br />
==== Implementation overview ====<br />
<br />
New RAM structures, inspired by those used to track traditional locks in PostgreSQL, but tailored to the needs of SIREAD predicate locking, will be used. These will refer to physical objects actually accessed in the course of executing the query, to model the predicates through inference. Anyone interested in this subject should review the Hellerstein, Stonebraker and Hamilton paper[[#Foundations2007|<sup><nowiki>[3]</nowiki></sup>]], along with the locking papers referenced from that and the Cahill papers[[#CahillEtAl2008|<sup><nowiki>[1]</nowiki></sup>]][[#Cahill2009|<sup><nowiki>[2]</nowiki></sup>]].<br />
<br />
Because the SIREAD locks don't block, traditional locking techniques must be modified. Intent locking (locking higher level objects before locking lower level objects) doesn't work with non-blocking "locks" (which are, in some respects, more like flags than locks).<br />
<br />
A configurable amount of shared memory is reserved at postmaster start-up to track predicate locks. This size cannot be changed without a restart.<br />
* To prevent resource exhaustion, multiple fine-grained locks may be promoted to a single coarser-grained lock as needed.<br />
* An attempt to acquire an SIREAD lock on a tuple when the same transaction already holds an SIREAD lock on the page or the relation will be ignored. Likewise, an attempt to lock a page when the relation is locked will be ignored, and the acquisition of a coarser lock will result in the automatic release of all finer-grained locks it covers.<br />
<br />
==== Heap locking ====<br />
<br />
Predicate locks will be acquired for the heap based on the following:<br />
* For a table scan, the entire relation will be locked.<br />
* Each tuple read which is visible to the reading transaction will be locked, whether or not it meets selection criteria; except that there is no need to acquire an SIREAD lock on a tuple when the transaction already holds a write lock on any tuple representing the row, since a rw-dependency would also create a ww-dependency which has more aggressive enforcement and will thus prevent any anomaly.<br />
<br />
==== Default index locking ====<br />
<br />
There is a new ampredlocks flag in pg_am which should be set to false for any index which doesn't handle the predicate locking internally; indexes flagged this way will be predicate locked at the index relation level. Such a lock will conflict with any insert into the index, but will not conflict, for example, with deletes, HOT updates, or inserts which don't match the WHERE clause on an index (if present). This will allow correct behavior at the serializable transaction isolation level for new index types with minimal initial effort; but adding the predicate locking calls and changing the flag will improve performance in high contention workloads involving serializable transactions.<br />
<br />
==== Index AM implementations ====<br />
<br />
Since predicate locks only exist to detect writes which conflict with earlier reads, and heap tuple locks are acquired to cover all heap tuples actually read, including those read through indexes, the index tuples which were actually scanned are not of interest in themselves; we only care about their "new neighbors" -- later inserts into the index which ''would'' have been included in the scan had they existed at the time. Conceptually, we want to lock the ''gaps'' between and surrounding index entries within the scanned range.<br />
<br />
''Correctness'' requires that any insert into an index generate a rw-conflict with a concurrent serializable transaction if, after that insert, re-execution of any index scan of the other transaction would access the heap for a row not accessed during the previous execution. Note that a non-HOT update which expires an old index entry covered by the scan and adds a new entry for the modified row's new tuple ''need not'' generate a conflict, although an update which "moves" a row into the scan ''must'' generate a conflict. While correctness allows false positives, they should be minimized for performance reasons.<br />
<br />
Several optimizations are possible:<br />
<br />
* An index scan which is just finding the right position for an index insertion or deletion need not acquire a predicate lock.<br />
* An index scan which is comparing for equality on the entire key for a unique index need not acquire a predicate lock as long as a key is found corresponding to a visible tuple which has not been modified by another transaction -- there are no "between or around" gaps to cover.<br />
* As long as built-in foreign key enforcement continues to use its current "special tricks" to deal with MVCC issues, predicate locks should not be needed for scans done by enforcement code.<br />
* If a search determines that no rows can be found regardless of index contents because the search conditions are contradictory (e.g., x = 1 AND x = 2), then no predicate lock is needed.<br />
<br />
Other index AM implementation considerations:<br />
<br />
* If a btree search discovers that no root page has yet been created, a predicate lock on the index relation is required; otherwise btree searches must get to the leaf level to determine which tuples match, so predicate locks go there.<br />
* GiST searches can determine that there are no matches at any level of the index, so there must be a predicate lock at each index level during a GiST search. An index insert at the leaf level can then be trusted to ripple up to all levels and locations where conflicting predicate locks may exist.<br />
* The effects of page splits, overflows, consolidations, and removals must be carefully reviewed to ensure that predicate locks aren't "lost" during those operations, or kept with pages which could get re-used for different parts of the index.<br />
<br />
=== Testing ===<br />
<br />
For this development effort to succeed, it was absolutely necessary to have some client application which allowed execution of test scripts with specific interleaving of statements run against multiple backends. The dtester module from Markus Wanner was used for this during most of development. It requires python and several python packages (including twisted). Due to package dependencies and licensing issues the dtester module was not appropriate for commit to the PostgreSQL code base.<br />
<br />
Heikki Linnakangas developed a testing framework based on existing regression test code which has been committed to src/test/isolation. Besides being compatible with other PostgreSQL testing, it runs faster than dtester. It doesn't provide a nice display of the results by statement ordering permutation, but that can be added if needed by filtering the current output.<br />
<br />
Like many other proposed features and optimizations, this area could benefit from a "performance test farm" so that serializable performance can be better compared to other isolation levels, and so the performance impact of future enhancements can be determined.<br />
<br />
=== Documentation ===<br />
<br />
A README-SSI file was created, largely drawn from this Wiki page.<br />
<br />
Someone with update rights to Wikipedia should probably update references there which will be outdated with this feature:<br />
<br />
* http://en.wikipedia.org/wiki/Snapshot_isolation<br />
* http://en.wikipedia.org/wiki/Isolation_%28database_systems%29<br />
<br />
== Innovations ==<br />
<br />
The PostgreSQL implementation of Serializable Snapshot Isolation differs from what is described in the cited papers for several reasons:<br />
# PostgreSQL didn't have any existing predicate locking. It had to be added from scratch.<br />
# The existing in-memory lock structures were not suitable for tracking SIREAD locks.<br />
#* The database products used for the prototype implementations for the papers used update-in-place with a rollback log for their MVCC implementations, while PostgreSQL leaves the old version of a row in place and adds a new tuple to represent the row at a new location.<br />
#* In PostgreSQL, tuple level locks are not held in RAM for any length of time; lock information is written to the tuples involved in the transactions.<br />
#* In PostgreSQL, existing lock structures have pointers to memory which is related to a connection. SIREAD locks need to persist past the end of the originating transaction and even the connection which ran it.<br />
#* PostgreSQL needs to be able to tolerate a large number of transactions executing while one long-running transaction stays open -- the in-RAM techniques discussed in the papers wouldn't support that.<br />
# Unlike the database products used for the prototypes described in the papers, PostgreSQL didn't already have a true serializable isolation level distinct from snapshot isolation.<br />
# PostgreSQL supports subtransactions -- an issue not mentioned in the papers.<br />
# PostgreSQL doesn't assign a transaction number to a database transaction until and unless necessary.<br />
# PostgreSQL has pluggable data types with user-definable operators, as well as pluggable index types, not all of which are based around data types which support ordering.<br />
# Some possible optimizations became apparent during development and testing.<br />
<br />
Differences from the implementation described in the papers are listed below.<br />
<br />
* New structures needed to be created in shared memory to track the proper information for serializable transactions and their SIREAD locks.<br />
<br />
* Because PostgreSQL does not have the same concept of an "oldest transaction ID" for all serializable transactions as assumed in the Cahill these, we track the oldest snapshot xmin among serializable transactions, and a count of how many active transactions use that xmin. When the count hits zero we find the new oldest xmin and run a clean-up based on that.<br />
<br />
* Predicate locking in PostgreSQL will start at the tuple level when possible, with automatic conversion of multiple fine-grained locks to coarser granularity as need to avoid resource exhaustion. The amount of memory used for these structures will be configurable, to balance RAM usage against SIREAD lock granularity.<br />
<br />
* A process-local copy of locks held by a process and the coarser covering locks with counts, are kept to support granularity promotion decisions with low CPU and locking overhead.<br />
<br />
* Conflicts are identified by looking for predicate locks when tuples are written and looking at the MVCC information when tuples are read. There is no matching between two RAM-based locks.<br />
<br />
* Because write locks are stored in the heap tuples rather than a RAM-based lock table, the optimization described in the Cahill thesis which eliminates an SIREAD lock where there is a write lock is implemented by the following:<br />
*# When checking a heap write for conflicts against existing predicate locks, a tuple lock on the tuple being written is removed.<br />
*# When acquiring a predicate lock on a heap tuple, we return quickly without doing anything if it is a tuple written by the reading transaction.<br />
<br />
* Rather than using conflictIn and conflictOut pointers which use NULL to indicate no conflict and a self-reference to indicate multiple conflicts or conflicts with committed transactions, we use a list of rw-conflicts. With the more complete information, false positives are reduced and we have sufficient data for more aggressive clean-up and other optimizations.<br />
** We can avoid ever rolling back a transaction until and unless there is a pivot where a transaction on the conflict *out* side of the pivot committed before either of the other transactions.<br />
** We can avoid ever rolling back a transaction when the transaction on the conflict *in* side of the pivot is explicitly or implicitly READ ONLY unless the transaction on the conflict *out* side of the pivot committed before the READ ONLY transaction acquired its snapshot. (An implicit READ ONLY transaction is one which committed without writing, even though it was not explicitly declared to be READ ONLY.)<br />
** We can more aggressively clean up conflicts, predicate locks, and SSI transaction information.<br />
<br />
* Allow a READ ONLY transaction to "opt out" of SSI if there are no READ WRITE transactions which could cause the READ ONLY transaction to ever become part of a "dangerous structure" of overlapping transaction dependencies.<br />
<br />
* Allow the user to request that a READ ONLY transaction ''wait'' until the conditions are right for it to start in the "opt out" state described above. We add a DEFERRABLE state to transactions, which is specified and maintained in a way similar to to READ ONLY. It is ignored for transactions which are not SERIALIZABLE ''and'' READ ONLY.<br />
<br />
* When a transaction must be rolled back, we pick among the active transactions such that an immediate retry will not fail again on conflicts with the same transactions.<br />
<br />
* We use the PostgreSQL SLRU system to hold summarized information about older committed transactions to put an upper bound on RAM used. Beyond that limit, information spills to disk. Performance can degrade in a pessimal situation, but it should be tolerable, and transactions won't need to be cancelled or blocked from starting.<br />
<br />
== R&D Issues ==<br />
<br />
This is intended to be the place to record specific issues which need more detailed review or analysis.<br />
<br />
* '''WAL file replay'''. While serializable implementations using S2PL can guarantee that the write-ahead log contains commits in a sequence consistent with some serial execution of serializable transactions, SSI cannot make that guarantee. While the WAL replay is no less consistent than under snapshot isolation, it is possible that under PITR recovery or hot standby a database could reach a readable state where some transactions appear before other transactions which would have had to precede them to maintain serializable consistency. In essence, if we do nothing, WAL replay will be at snapshot isolation even for serializable transactions. Is this OK? If not, how do we address it?<br />
<br />
* '''External replication'''. Look at how this impacts external replication solutions, like Postgres-R, Slony, pgpool, HS/SR, etc. This is related to the "WAL file replay" issue.<br />
<br />
* '''UNIQUE btree search for equality on all columns'''. Since a search of a UNIQUE index using equality tests on all columns will lock the heap tuple if an entry is found, it appears that there is no need to get a predicate lock on the index in that case. A predicate lock ''is'' still needed for such a search if a matching index entry which points to a visible tuple is ''not'' found.<br />
<br />
* '''Minimize touching of shared memory'''. Should lists in shared memory push entries which have just been returned to the ''front'' of the available list, so they will be popped back off soon and some memory might never be touched, or should we keep adding returned items to the ''end'' of the available list?<br />
<br />
== Discussion ==<br />
<br />
[http://archives.postgresql.org/message-id/4A0019EE.EE98.0025.0@wicourts.gov "Serializable Isolation without blocking" - discusses paper in ACM SIGMOD on SSI]<br />
<br />
[http://archives.postgresql.org/message-id/4B2788EA020000250002D51C@gw.wicourts.gov "Update on true serializable techniques in MVCC" - discusses Cahill Doctoral Thesis on SSI]<br />
<br />
[http://archives.postgresql.org/message-id/4B389C79020000250002D987@gw.wicourts.gov "Serializable implementation" - discusses Wisconsin Court System plans]<br />
<br />
[http://archives.postgresql.org/message-id/4B3B88F4020000250002DAE1@gw.wicourts.gov "A third lock method" - discusses development path: rough prototype to refine toward production]<br />
<br />
[http://archives.postgresql.org/message-id/1262718843.5908.183.camel@monkey-cat.sm.truviso.com "true serializability and predicate locking" - discusses GiST and GIN issues]<br />
<br />
[http://archives.postgresql.org/message-id/4BF43DF702000025000318BE@gw.wicourts.gov WIP patch for serializable transactions with predicate locking]<br />
<br />
[http://archives.postgresql.org/pgsql-hackers/2010-09/msg00022.php "serializable" in comments and names]<br />
<br />
[http://archives.postgresql.org/message-id/4C8F5DB202000025000356A0@gw.wicourts.gov Serializable Snapshot Isolation]<br />
<br />
[http://archives.postgresql.org/message-id/4CFB574702000025000382FD@gw.wicourts.gov serializable read only deferrable]<br />
<br />
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg02119.php SSI memory mitigation & false positive degradation]<br />
<br />
== Presentations ==<br />
<br />
From PostgreSQL Conference U.S. East 2010:<br />
[[media:Transaction-Isolation-in-PostgreSQL.odp|Current Transaction Isolation in PostgreSQL and future directions]]<br />
<br />
From PGCon 2011: <br />
[http://drkp.net/drkp/papers/ssi-pgcon11-slides.pdf Serializable Snapshot Isolation: Making ISOLATION LEVEL SERIALIZABLE Provide Serializable Isolation]<br />
<br />
From VLDB 2012: <br />
[http://drkp.net/papers/ssi-vldb12-slides.pdf Serializable Snapshot Isolation in PostgreSQL]<br />
<br />
<br />
== Publications ==<br />
<br />
<span id="CahillEtAl2008"><nowiki>[1]</nowiki> [http://doi.acm.org/10.1145/1376616.1376690 Michael J. Cahill, Uwe Röhm, and Alan D. Fekete. 2008. Serializable isolation for snapshot databases. In SIGMOD ’08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 729–738, New York, NY, USA. ACM.] (This paper is listed mostly for context; the subsequent paper covers the same ground and more.)</span><br />
<br />
<span id="Cahill2009"><nowiki>[2]</nowiki> [http://hdl.handle.net/2123/5353 Michael James Cahill. 2009. Serializable Isolation for Snapshot Databases. Sydney Digital Theses. University of Sydney, School of Information Technologies.]</span><br />
<br />
<span id="Foundations2007"><nowiki>[3]</nowiki> [http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf Joseph M. Hellerstein, Michael Stonebraker and James Hamilton. 2007. Architecture of a Database System. Foundations and Trends(R) in Databases Vol. 1, No. 2 (2007) 141–259.]<br />
Of particular interest:<br />
* 6.1 A Note on ACID<br />
* 6.2 A Brief Review of Serializability<br />
* 6.3 Locking and Latching<br />
* 6.3.1 Transaction Isolation Levels<br />
* 6.5.3 Next-Key Locking: Physical Surrogates for Logical Properties</span><br />
<br />
<span id="SQL92"><nowiki>[4]</nowiki> [http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt SQL-92]<br />
Search for ''serial execution'' to find the relevant section.</span><br />
<br />
<span id="VLDB2012"><nowiki>[5]</nowiki> [http://drkp.net/papers/ssi-vldb12.pdf Dan R. K. Ports and Kevin Grittner. 2012. Serializable Snapshot Isolation in PostgreSQL. Proceedings of the VLDB Endowment vol. 5 (12) , August 2012 (pp. 1850--1861)]</span><br />
<br />
[[Category:PostgreSQL 9.1]]</div>Drkphttps://wiki.postgresql.org/index.php?title=PostgreSQL_9.1_Open_Items&diff=14844PostgreSQL 9.1 Open Items2011-07-07T21:42:11Z<p>Drkp: SSI issues resolved</p>
<hr />
<div>== Project Planning ==<br />
See the [[PostgreSQL 9.1 Development Plan]].<br />
<br />
== Meta-Issues ==<br />
* [[standard_conforming_strings]] -- readiness of drivers and applications<br />
* Review "Long Term" list of items from [[PostgreSQL 9.0 Open Items]]<br />
<br />
== Open Issues ==<br />
<br />
=== Blockers for Beta3 ===<br />
<br />
=== Not Blockers for Beta3 ===<br />
* {{MessageLink|4CEA5A0F.1030602@enterprisedb.com|do latches have memory-ordering problems?}}<br />
** [http://archives.postgresql.org/message-id/11500.1302097336@sss.pgh.pa.us mostly just needs testing]<br />
* {{MessageLink|4DA58686.1050501@enterprisedb.com| throw an error if you try to start from incomplete backup taken with pg_basebackup}}<br />
<br />
* {{MessageLink|4DEF5D4C.9070401@enterprisedb.com| Add regression tests for DDL commands with SSI}}<br />
* {{MessageLink|4E12FE51020000250003EFC7@gw.wicourts.gov|SSI 2PC coverage test}}<br />
** {{MessageLink|4E15CCE6.4010207@enterprisedb.com|associated bug fixed; need to shrink the test a bit before adding it}}<br />
<br />
== Resolved Issues ==<br />
<br />
=== Issues Resolved Prior to Beta3 ===<br />
* {{MessageLink|19848.1307666208@sss.pgh.pa.us|plpython isn't connected up to tarball building}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-06/msg01241.php ill-conceived grammar for NOT VALID constraints]<br />
* {{MessageLink|4DE8B1A8.2020505@enterprisedb.com| add a proof to README-SSI for the validity of the READ ONLY optimizations}}<br />
** {{MessageLink|20110602070419.GA10064@csail.mit.edu| Dan offered a concise proof, but we need a cite for one premise it uses}}<br />
** {{MessageLink|4DE67235020000250003DFBC@gw.wicourts.gov| Kevin offered a self-contained proof, but it's much longer than Dan's}}<br />
* {{MessageLink|4DF09C6F.70608@metrosystems.co.jp|FOREIGN TABLE doc fix}}<br />
* {{MessageLink|20110620091529.GG83336@csail.mit.edu|SxactGlobalXmin computed incorrectly after abort}}<br />
** {{MessageLink|20110621021828.GL83336@csail.mit.edu|patch}}<br />
* {{MessageLink|4DEFB5EA020000250003E3C3@gw.wicourts.gov|add a couple performance notes to the SSI docs}}<br />
** {{MessageLink|4DFB85C3020000250003E883@gw.wicourts.gov|patch to do this}}<br />
* {{MessageLink|4E01B0B8.2020205@enterprisedb.com|Missing predicate locking in lossy bitmap heap scans}}<br />
** {{MessageLink|4E07550F020000250003EC42@gw.wicourts.gov|patch to fix}}<br />
* {{MessageLink|537229.49593.qm@web112417.mail.gq1.yahoo.com|btree_gist breaks some behaviors involving <> operators}}<br />
* {{MessageLink|28389.1308264888@sss.pgh.pa.us|ALTER TABLE lock strength reduction patch is unsafe}}<br />
* {{MessageLink|4E12FDA8020000250003EFC1@gw.wicourts.gov|SSI atomic commit patch}}<br />
<br />
=== Issues Resolved Prior to Beta2 ===<br />
* {{MessageLink|20110428194112.GB12161@tornado.leadboat.com|ALTER TYPE DROP + composite-typed col vs. pg_upgrade}}<br />
* {{MessageLink|4DB7D194.6030906@enterprisedb.com|Memory leak in foreign scans}}<br />
* [http://archives.postgresql.org/message-id/19738.1306338472@sss.pgh.pa.us vacuum sometimes fails to update relpages/reltuples]<br />
* {{MessageLink|BANLkTik8KxxjJ1KW-pO+WWBdTEAT+80ArQ@mail.gmail.com|SSI HOT chain traversal issue}}<br />
* {{MessageLink|201106031542.p53FgDh5073991@wwwmaster.postgresql.org|CTEs attached to data-modifying statements fail with rules (bug 6051)}}<br />
* {{MessageLink|BANLkTiknTSLMbGuTaOYg4O0V7UAMsr_rOA|NOT VALID constraints don't dump properly}}<br />
* {{MessageLink|201105261137.p4QBbTlt077425@wwwmaster.postgresql.org|HS slaves do not handle unlogged tables nicely (bug 6041)}}<br />
* {{MessageLink|4DC00203020000250003D1E8@gw.wicourts.gov|Make DDL commands SSI-aware}}<br />
** {{MessageLink|4DEBF171020000250003E1A3@gw.wicourts.gov|patch to fix}} (Problems found with patch, discussion and modifications in progress.)<br />
* {{MessageLink|7486.1306253575@sss.pgh.pa.us|Domains over arrays no longer match ANYARRAY}}<br />
* {{MessageLink|12187.1307391240@sss.pgh.pa.us|contrib/citext exposes implementation shortcomings in collation code (bug 6053)}}<br />
* {{MessageLink|9773.1307545767@sss.pgh.pa.us|error in information_schema.element_types view definition (bug 5926)}}<br />
<br />
=== Issues Resolved Prior to Beta1 ===<br />
<br />
* [http://archives.postgresql.org/message-id/AANLkTin4o+eSgQsP=0i6EM=Evu3oX=ewHp8Bwod1UaWZ@mail.gmail.com wal_buffers = -1 causes spurious chatter on reload]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=jCTGC+Qxfkum6EXcw8q4tRFd2KHNpe+4HYUeS@mail.gmail.com is there a safeguard to prevent recovery from pausing before consistency is reached?]<br />
* {{MessageLink|09B23E7BF70425478C1330D893A722C602FEC019BD@MailSVR.invera.com|Walreceiver crashes in AIX}}<br />
** {{MessageLink|4C753155.3070708@ca.afilias.info|Steve Singer can't reproduce, suggests a possible way this could be pilot error}}<br />
* {{MessageLink|AANLkTik9HZi8GfSiKuHVY2N7g7xDV+sN46eRxPbOjO7P@mail.gmail.com|bug of the hot standby feedback}}<br />
* {{MessageLink|201011271931.oARJVV427882@momjian.us|GIST rewrite vs. pg_upgrade}}<br />
* [http://archives.postgresql.org/message-id/AANLkTin3PPOwXq2Cpf+tLNNKSv4OmHfDn5qr0aoeczA-@mail.gmail.com replication/README needs to be updated]<br />
* [http://archives.postgresql.org/message-id/AANLkTikJmP+bo1N-mFUWEpJiV6_OKisYw512OGeTJUbm@mail.gmail.com backend wrongly waits for sync rep even though max_wal_senders = 0]<br />
* [http://archives.postgresql.org/message-id/AANLkTik4tuG2EA6oeiov1=DO6UcDoARP45Lk+KGfy7HC@mail.gmail.com reload of the configuration file should not cause the server to end unexpectedly]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-11/msg00002.php Fix ecpg preprocessor regression]<br />
* {{MessageLink|AANLkTimDUiRrrWzZ2ZXWSRfeP5tHm9PGmpp6zfqaFpte@mail.gmail.com|Once sync_standbys_defined becomes true, there's no way for it to ever become false again. That can't be right.}}<br />
* {{MessageLink|AANLkTimfvNzJy490wp95vP1RLumkxEZBTWUjhA_Dd1QS@mail.gmail.com|backend no longer needs to wait for replication when synchronous_standby_names is set to '' and configuration file is reloaded.}}<br />
* [http://archives.postgresql.org/message-id/AANLkTi=z+eg6yMEtJCqYb0OnGaNMoTg9eYiK-xm8v0AQ@mail.gmail.com pause_at_recovery_target needs to be added in recovery.conf.sample]<br />
* {{MessageLink|201101181630.p0IGU45v047971@wwwmaster.postgresql.org|BUG #5842: Memory leak in PL/Python when taking slices of results}}<br />
* {{MessageLink|4D5EB8ED.7010002@enterprisedb.com|hot standby feedback message needs to be explained at protocol-replication.html}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg00720.php consider renaming ident to peer authentication on local connections]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=q5=x_WoOccf5dqHoAWXyDc7qXG0cpbnsNKUCB@mail.gmail.com serious problem by multiple backups]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=qR0zy5cS98n=TDCxda5sj=NODDA=xAZUCdJOE@mail.gmail.com FK constraints "NOT VALID" by default]<br />
* [http://archives.postgresql.org/message-id/AANLkTikvpjkUjwksZK=wOw7CDyMtaVEddmPN9yFv7jX-@mail.gmail.com sync rep very slow when fsync=off on standby]<br />
* [http://archives.postgresql.org/message-id/AANLkTinbzFaJXkzwm2xEegfytK1LPw8odo61wgZkkGp=@mail.gmail.com No longer need to check synchronous_standby_names and max_wal_senders at startup]<br />
* {{MessageLink|201012071131.55211.gabi.julien@broadsign.com|pg_last_xact_replay_timestamp limitations}}<br />
** rhaas says: It's not clear there's an action item here, so moving to resolved. Feel free to move back with more details.<br />
* [http://archives.postgresql.org/message-id/AANLkTikUAOYoStwkwG+DZOzTwT2QVj0H9aDywpzxvhxn@mail.gmail.com either remove write_location completely, or revert the change that broke it]<br />
** [http://archives.postgresql.org/message-id/AANLkTinHrymKd56m5AfawCdujuNM6B2g_--9UiOSSKGx@mail.gmail.com original report of problem with write_location] (but the other issues in that email are now fixed)<br />
** [http://archives.postgresql.org/message-id/AANLkTikphXd4LMXXOAZJg2s_8q0Fu5e3uuec4BG4xD4f@mail.gmail.com possible patch]<br />
* {{MessageLink|29244.1295376372@sss.pgh.pa.us|DO blocks leak memory}}<br />
* {{MessageLink|AANLkTimnwxEv-ZbqBLCSBSvmq-80vzvDb2u0pPchGm2r@mail.gmail.com|replication timeout}}<br />
* {{MessageLink|1293977249.5984.17.camel@vanquo.pezone.net|raise protocol version number}}<br />
** [http://archives.postgresql.org/message-id/1240.1301353669@sss.pgh.pa.us not done, would break things]<br />
* [http://archives.postgresql.org/message-id/AANLkTik6ArKPwnvA8_9XHo9j9+w4A2UEnsheX-mwR=Aj@mail.gmail.com fix attinhcount tracking]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00033.php SSI: SIReadLock lines in pg_locks don't show pid]<br />
** questions have been raised whether pid should be suppressed after connection closes or even maybe after transaction completion<br />
* [http://archives.postgresql.org/message-id/AANLkTi%3DkoHqna9WMm8_ATJN6c1GOLLp_0Tx6VswKhAdi%40mail.gmail.com synchronous_commit and synchronous_replication]<br />
* {{MessageLink|AANLkTimUuzn3DAaO1OLCijjjfcw0MXL-0HAvZKr9xRyV@mail.gmail.com|conversion from integer literals to money type}}<br />
* [http://archives.postgresql.org/message-id/4D892157.7070607@lelarge.info comments on SQL/MED objects]<br />
** rhaas says: [http://archives.postgresql.org/message-id/AANLkTimS1tDEuEeocz9PBjQ-RLSdjJ+VKFc_+F+Jm3Ou@mail.gmail.com proposed patch]<br />
** rhaas says: thom brown points out that [http://archives.postgresql.org/message-id/AANLkTi==NemsL7Vo=YfsPc8DdZActYF7ZwCFVO-0Nu21@mail.gmail.com I forgot USER MAPPINGs]<br />
** [http://archives.postgresql.org/message-id/7102.1302029271@sss.pgh.pa.us fixing user mappings opens unduly large can of worms]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-03/msg00352.php SSI: assertion failure on marking conflict-in due to race condition]<br />
** existing patch to recheck after trading shared LW locks for exclusive should fix<br />
* [http://archives.postgresql.org/message-id/29987.1301930239@sss.pgh.pa.us keywords table needs updating]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01170.php SSI: clumsy error handling results in generic message rather than more specific message with hint]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00157.php SSI: disable optimization when in subtransaction]<br />
* [http://archives.postgresql.org/message-id/4D9DD97B020000250003C53C@gw.wicourts.gov SSI: LOG message about SLRU wrap-around]<br />
* [http://archives.postgresql.org/message-id/BANLkTi=W8OrvqLHS+suU8R2b_rhFaqeEaw@mail.gmail.com sync rep and fast shutdown]<br />
** rhaas says: no easy resolution, i guess we'll leave this alone for now?<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01913.php SSI: three different HTABs contend for shared memory in a free-for-all]<br />
** Heikki proposes "We'll need to teach dynahash not to allocate any more entries after the preallocation. A new HASH_NO_GROW flag to hash_create() seems like a suitable interface."<br />
** [http://archives.postgresql.org/message-id/BANLkTimVuicyZG4j3F427BgfA2iYP8Od_Q@mail.gmail.com Alternatively, we could just use an initial allocation which matches the maximum entries, to ensure that all HTABs can allocate ''at least'' the configured maximum.] (There's an existing patch for that.)<br />
** I committed a patch to add the new flag to hash_create() - Heikki<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00083.php SSI: failure to clean up some SLRU-summarized locks]<br />
** existing patch to properly set commitSeqNo on the offending locks should fix<br />
* {{MessageLink|20110410103636.GC10697@tornado.leadboat.com|ALTER TABLE ADD COLUMN not creating TOAST tables for inheritance children}}<br />
* {{MessageLink|20110410015728.GA10162@tornado.leadboat.com|typed table DDL loose ends}}<br />
* [http://archives.postgresql.org/message-id/29173.1301114203@sss.pgh.pa.us assorted collation issues]<br />
** [http://archives.postgresql.org/message-id/21742.1303137667@sss.pgh.pa.us almost there...]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00334.php pl/python traceback fix]<br />
* [http://archives.postgresql.org/message-id/BANLkTinz2do3EqWO9RH66bY2FBOmL=uSoA@mail.gmail.com gin indexes don't get used unless you vacuum]<br />
* [http://archives.postgresql.org/message-id/4DB015B3020000250003CB51@gw.wicourts.gov SSI: High-contention UPDATE load on RAM disk database slows non-serializable transactions by a fraction of a percent]<br />
** [http://archives.postgresql.org/message-id/20110422220734.GG57793@csail.mit.edu Or speeds it up by a fraction of a percent. Either way it's too close to tease out from the noise easily.]<br />
** [http://archives.postgresql.org/message-id/20110425033308.GJ57793@csail.mit.edu As insurance against any performance hit in extreme high contention loads, it might be worthwhile to do a quick return (without taking any locks) if no serializable transactions are active.]<br />
* [http://archives.postgresql.org/message-id/201104201411.p3KEBOfA009414@wwwmaster.postgresql.org SSI: UPDATE setting a TOASTed value is broken, regardless of transaction isolation level]<br />
** [http://archives.postgresql.org/message-id/4DAEF7E6.7080107@enterprisedb.com proposed patch]<br />
* [http://archives.postgresql.org/message-id/BANLkTim6a_rfZ+UPPATTXap9Ed7-X7BzoA@mail.gmail.com foreign table permissions issues]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-04/msg00141.php CREATE TABLE IF NOT EXISTS doesn't work]<br />
* {{MessageLink|201103111328.p2BDSFd10499@momjian.us|Typed-tables patch broke pg_upgrade}}<br />
** {{MessageLink|20110418235041.GB2769@tornado.leadboat.com|proposed patch (tt2v2-binary-upgrade.patch)}}<br />
<br />
[[Category:PostgreSQL 9.1]]</div>Drkphttps://wiki.postgresql.org/index.php?title=PostgreSQL_9.1_Open_Items&diff=14697PostgreSQL 9.1 Open Items2011-06-22T18:46:12Z<p>Drkp: another SSI issue resolved</p>
<hr />
<div>== Project Planning ==<br />
See the [[PostgreSQL 9.1 Development Plan]].<br />
<br />
== Meta-Issues ==<br />
* [[standard_conforming_strings]] -- readiness of drivers and applications<br />
* Review "Long Term" list of items from [[PostgreSQL 9.0 Open Items]]<br />
<br />
== Open Issues ==<br />
<br />
=== Blockers for Beta3 ===<br />
* {{MessageLink|19848.1307666208@sss.pgh.pa.us|plpython isn't connected up to tarball building}}<br />
* {{MessageLink|28389.1308264888@sss.pgh.pa.us|ALTER TABLE lock strength reduction patch is unsafe}}<br />
=== Not Blockers for Beta3 ===<br />
* {{MessageLink|4CEA5A0F.1030602@enterprisedb.com|do latches have memory-ordering problems?}}<br />
** [http://archives.postgresql.org/message-id/11500.1302097336@sss.pgh.pa.us mostly just needs testing]<br />
* {{MessageLink|4DA58686.1050501@enterprisedb.com| throw an error if you try to start from incomplete backup taken with pg_basebackup}}<br />
<br />
* {{MessageLink|4DEF5D4C.9070401@enterprisedb.com| Add regression tests for DDL commands with SSI}}<br />
<br />
== Resolved Issues ==<br />
<br />
=== Issues Resolved Prior to Beta3 ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-06/msg01241.php ill-conceived grammar for NOT VALID constraints]<br />
* {{MessageLink|4DE8B1A8.2020505@enterprisedb.com| add a proof to README-SSI for the validity of the READ ONLY optimizations}}<br />
** {{MessageLink|20110602070419.GA10064@csail.mit.edu| Dan offered a concise proof, but we need a cite for one premise it uses}}<br />
** {{MessageLink|4DE67235020000250003DFBC@gw.wicourts.gov| Kevin offered a self-contained proof, but it's much longer than Dan's}}<br />
* {{MessageLink|4DF09C6F.70608@metrosystems.co.jp|FOREIGN TABLE doc fix}}<br />
* {{MessageLink|20110620091529.GG83336@csail.mit.edu|SxactGlobalXmin computed incorrectly after abort}}<br />
** {{MessageLink|20110621021828.GL83336@csail.mit.edu|patch}}<br />
* {{MessageLink|4DEFB5EA020000250003E3C3@gw.wicourts.gov|add a couple performance notes to the SSI docs}}<br />
** {{MessageLink|4DFB85C3020000250003E883@gw.wicourts.gov|patch to do this}}<br />
<br />
=== Issues Resolved Prior to Beta2 ===<br />
* {{MessageLink|20110428194112.GB12161@tornado.leadboat.com|ALTER TYPE DROP + composite-typed col vs. pg_upgrade}}<br />
* {{MessageLink|4DB7D194.6030906@enterprisedb.com|Memory leak in foreign scans}}<br />
* [http://archives.postgresql.org/message-id/19738.1306338472@sss.pgh.pa.us vacuum sometimes fails to update relpages/reltuples]<br />
* {{MessageLink|BANLkTik8KxxjJ1KW-pO+WWBdTEAT+80ArQ@mail.gmail.com|SSI HOT chain traversal issue}}<br />
* {{MessageLink|201106031542.p53FgDh5073991@wwwmaster.postgresql.org|CTEs attached to data-modifying statements fail with rules (bug 6051)}}<br />
* {{MessageLink|BANLkTiknTSLMbGuTaOYg4O0V7UAMsr_rOA|NOT VALID constraints don't dump properly}}<br />
* {{MessageLink|201105261137.p4QBbTlt077425@wwwmaster.postgresql.org|HS slaves do not handle unlogged tables nicely (bug 6041)}}<br />
* {{MessageLink|4DC00203020000250003D1E8@gw.wicourts.gov|Make DDL commands SSI-aware}}<br />
** {{MessageLink|4DEBF171020000250003E1A3@gw.wicourts.gov|patch to fix}} (Problems found with patch, discussion and modifications in progress.)<br />
* {{MessageLink|7486.1306253575@sss.pgh.pa.us|Domains over arrays no longer match ANYARRAY}}<br />
* {{MessageLink|12187.1307391240@sss.pgh.pa.us|contrib/citext exposes implementation shortcomings in collation code (bug 6053)}}<br />
* {{MessageLink|9773.1307545767@sss.pgh.pa.us|error in information_schema.element_types view definition (bug 5926)}}<br />
<br />
=== Issues Resolved Prior to Beta1 ===<br />
<br />
* [http://archives.postgresql.org/message-id/AANLkTin4o+eSgQsP=0i6EM=Evu3oX=ewHp8Bwod1UaWZ@mail.gmail.com wal_buffers = -1 causes spurious chatter on reload]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=jCTGC+Qxfkum6EXcw8q4tRFd2KHNpe+4HYUeS@mail.gmail.com is there a safeguard to prevent recovery from pausing before consistency is reached?]<br />
* {{MessageLink|09B23E7BF70425478C1330D893A722C602FEC019BD@MailSVR.invera.com|Walreceiver crashes in AIX}}<br />
** {{MessageLink|4C753155.3070708@ca.afilias.info|Steve Singer can't reproduce, suggests a possible way this could be pilot error}}<br />
* {{MessageLink|AANLkTik9HZi8GfSiKuHVY2N7g7xDV+sN46eRxPbOjO7P@mail.gmail.com|bug of the hot standby feedback}}<br />
* {{MessageLink|201011271931.oARJVV427882@momjian.us|GIST rewrite vs. pg_upgrade}}<br />
* [http://archives.postgresql.org/message-id/AANLkTin3PPOwXq2Cpf+tLNNKSv4OmHfDn5qr0aoeczA-@mail.gmail.com replication/README needs to be updated]<br />
* [http://archives.postgresql.org/message-id/AANLkTikJmP+bo1N-mFUWEpJiV6_OKisYw512OGeTJUbm@mail.gmail.com backend wrongly waits for sync rep even though max_wal_senders = 0]<br />
* [http://archives.postgresql.org/message-id/AANLkTik4tuG2EA6oeiov1=DO6UcDoARP45Lk+KGfy7HC@mail.gmail.com reload of the configuration file should not cause the server to end unexpectedly]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-11/msg00002.php Fix ecpg preprocessor regression]<br />
* {{MessageLink|AANLkTimDUiRrrWzZ2ZXWSRfeP5tHm9PGmpp6zfqaFpte@mail.gmail.com|Once sync_standbys_defined becomes true, there's no way for it to ever become false again. That can't be right.}}<br />
* {{MessageLink|AANLkTimfvNzJy490wp95vP1RLumkxEZBTWUjhA_Dd1QS@mail.gmail.com|backend no longer needs to wait for replication when synchronous_standby_names is set to '' and configuration file is reloaded.}}<br />
* [http://archives.postgresql.org/message-id/AANLkTi=z+eg6yMEtJCqYb0OnGaNMoTg9eYiK-xm8v0AQ@mail.gmail.com pause_at_recovery_target needs to be added in recovery.conf.sample]<br />
* {{MessageLink|201101181630.p0IGU45v047971@wwwmaster.postgresql.org|BUG #5842: Memory leak in PL/Python when taking slices of results}}<br />
* {{MessageLink|4D5EB8ED.7010002@enterprisedb.com|hot standby feedback message needs to be explained at protocol-replication.html}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg00720.php consider renaming ident to peer authentication on local connections]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=q5=x_WoOccf5dqHoAWXyDc7qXG0cpbnsNKUCB@mail.gmail.com serious problem by multiple backups]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=qR0zy5cS98n=TDCxda5sj=NODDA=xAZUCdJOE@mail.gmail.com FK constraints "NOT VALID" by default]<br />
* [http://archives.postgresql.org/message-id/AANLkTikvpjkUjwksZK=wOw7CDyMtaVEddmPN9yFv7jX-@mail.gmail.com sync rep very slow when fsync=off on standby]<br />
* [http://archives.postgresql.org/message-id/AANLkTinbzFaJXkzwm2xEegfytK1LPw8odo61wgZkkGp=@mail.gmail.com No longer need to check synchronous_standby_names and max_wal_senders at startup]<br />
* {{MessageLink|201012071131.55211.gabi.julien@broadsign.com|pg_last_xact_replay_timestamp limitations}}<br />
** rhaas says: It's not clear there's an action item here, so moving to resolved. Feel free to move back with more details.<br />
* [http://archives.postgresql.org/message-id/AANLkTikUAOYoStwkwG+DZOzTwT2QVj0H9aDywpzxvhxn@mail.gmail.com either remove write_location completely, or revert the change that broke it]<br />
** [http://archives.postgresql.org/message-id/AANLkTinHrymKd56m5AfawCdujuNM6B2g_--9UiOSSKGx@mail.gmail.com original report of problem with write_location] (but the other issues in that email are now fixed)<br />
** [http://archives.postgresql.org/message-id/AANLkTikphXd4LMXXOAZJg2s_8q0Fu5e3uuec4BG4xD4f@mail.gmail.com possible patch]<br />
* {{MessageLink|29244.1295376372@sss.pgh.pa.us|DO blocks leak memory}}<br />
* {{MessageLink|AANLkTimnwxEv-ZbqBLCSBSvmq-80vzvDb2u0pPchGm2r@mail.gmail.com|replication timeout}}<br />
* {{MessageLink|1293977249.5984.17.camel@vanquo.pezone.net|raise protocol version number}}<br />
** [http://archives.postgresql.org/message-id/1240.1301353669@sss.pgh.pa.us not done, would break things]<br />
* [http://archives.postgresql.org/message-id/AANLkTik6ArKPwnvA8_9XHo9j9+w4A2UEnsheX-mwR=Aj@mail.gmail.com fix attinhcount tracking]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00033.php SSI: SIReadLock lines in pg_locks don't show pid]<br />
** questions have been raised whether pid should be suppressed after connection closes or even maybe after transaction completion<br />
* [http://archives.postgresql.org/message-id/AANLkTi%3DkoHqna9WMm8_ATJN6c1GOLLp_0Tx6VswKhAdi%40mail.gmail.com synchronous_commit and synchronous_replication]<br />
* {{MessageLink|AANLkTimUuzn3DAaO1OLCijjjfcw0MXL-0HAvZKr9xRyV@mail.gmail.com|conversion from integer literals to money type}}<br />
* [http://archives.postgresql.org/message-id/4D892157.7070607@lelarge.info comments on SQL/MED objects]<br />
** rhaas says: [http://archives.postgresql.org/message-id/AANLkTimS1tDEuEeocz9PBjQ-RLSdjJ+VKFc_+F+Jm3Ou@mail.gmail.com proposed patch]<br />
** rhaas says: thom brown points out that [http://archives.postgresql.org/message-id/AANLkTi==NemsL7Vo=YfsPc8DdZActYF7ZwCFVO-0Nu21@mail.gmail.com I forgot USER MAPPINGs]<br />
** [http://archives.postgresql.org/message-id/7102.1302029271@sss.pgh.pa.us fixing user mappings opens unduly large can of worms]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-03/msg00352.php SSI: assertion failure on marking conflict-in due to race condition]<br />
** existing patch to recheck after trading shared LW locks for exclusive should fix<br />
* [http://archives.postgresql.org/message-id/29987.1301930239@sss.pgh.pa.us keywords table needs updating]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01170.php SSI: clumsy error handling results in generic message rather than more specific message with hint]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00157.php SSI: disable optimization when in subtransaction]<br />
* [http://archives.postgresql.org/message-id/4D9DD97B020000250003C53C@gw.wicourts.gov SSI: LOG message about SLRU wrap-around]<br />
* [http://archives.postgresql.org/message-id/BANLkTi=W8OrvqLHS+suU8R2b_rhFaqeEaw@mail.gmail.com sync rep and fast shutdown]<br />
** rhaas says: no easy resolution, i guess we'll leave this alone for now?<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01913.php SSI: three different HTABs contend for shared memory in a free-for-all]<br />
** Heikki proposes "We'll need to teach dynahash not to allocate any more entries after the preallocation. A new HASH_NO_GROW flag to hash_create() seems like a suitable interface."<br />
** [http://archives.postgresql.org/message-id/BANLkTimVuicyZG4j3F427BgfA2iYP8Od_Q@mail.gmail.com Alternatively, we could just use an initial allocation which matches the maximum entries, to ensure that all HTABs can allocate ''at least'' the configured maximum.] (There's an existing patch for that.)<br />
** I committed a patch to add the new flag to hash_create() - Heikki<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00083.php SSI: failure to clean up some SLRU-summarized locks]<br />
** existing patch to properly set commitSeqNo on the offending locks should fix<br />
* {{MessageLink|20110410103636.GC10697@tornado.leadboat.com|ALTER TABLE ADD COLUMN not creating TOAST tables for inheritance children}}<br />
* {{MessageLink|20110410015728.GA10162@tornado.leadboat.com|typed table DDL loose ends}}<br />
* [http://archives.postgresql.org/message-id/29173.1301114203@sss.pgh.pa.us assorted collation issues]<br />
** [http://archives.postgresql.org/message-id/21742.1303137667@sss.pgh.pa.us almost there...]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00334.php pl/python traceback fix]<br />
* [http://archives.postgresql.org/message-id/BANLkTinz2do3EqWO9RH66bY2FBOmL=uSoA@mail.gmail.com gin indexes don't get used unless you vacuum]<br />
* [http://archives.postgresql.org/message-id/4DB015B3020000250003CB51@gw.wicourts.gov SSI: High-contention UPDATE load on RAM disk database slows non-serializable transactions by a fraction of a percent]<br />
** [http://archives.postgresql.org/message-id/20110422220734.GG57793@csail.mit.edu Or speeds it up by a fraction of a percent. Either way it's too close to tease out from the noise easily.]<br />
** [http://archives.postgresql.org/message-id/20110425033308.GJ57793@csail.mit.edu As insurance against any performance hit in extreme high contention loads, it might be worthwhile to do a quick return (without taking any locks) if no serializable transactions are active.]<br />
* [http://archives.postgresql.org/message-id/201104201411.p3KEBOfA009414@wwwmaster.postgresql.org SSI: UPDATE setting a TOASTed value is broken, regardless of transaction isolation level]<br />
** [http://archives.postgresql.org/message-id/4DAEF7E6.7080107@enterprisedb.com proposed patch]<br />
* [http://archives.postgresql.org/message-id/BANLkTim6a_rfZ+UPPATTXap9Ed7-X7BzoA@mail.gmail.com foreign table permissions issues]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-04/msg00141.php CREATE TABLE IF NOT EXISTS doesn't work]<br />
* {{MessageLink|201103111328.p2BDSFd10499@momjian.us|Typed-tables patch broke pg_upgrade}}<br />
** {{MessageLink|20110418235041.GB2769@tornado.leadboat.com|proposed patch (tt2v2-binary-upgrade.patch)}}<br />
<br />
[[Category:PostgreSQL 9.1]]</div>Drkphttps://wiki.postgresql.org/index.php?title=PostgreSQL_9.1_Open_Items&diff=14682PostgreSQL 9.1 Open Items2011-06-21T16:56:26Z<p>Drkp: resolved</p>
<hr />
<div>== Project Planning ==<br />
See the [[PostgreSQL 9.1 Development Plan]].<br />
<br />
== Meta-Issues ==<br />
* [[standard_conforming_strings]] -- readiness of drivers and applications<br />
* Review "Long Term" list of items from [[PostgreSQL 9.0 Open Items]]<br />
<br />
== Open Issues ==<br />
<br />
=== Blockers for Beta3 ===<br />
* {{MessageLink|19848.1307666208@sss.pgh.pa.us|plpython isn't connected up to tarball building}}<br />
* {{MessageLink|28389.1308264888@sss.pgh.pa.us|ALTER TABLE lock strength reduction patch is unsafe}}<br />
=== Not Blockers for Beta3 ===<br />
* {{MessageLink|4CEA5A0F.1030602@enterprisedb.com|do latches have memory-ordering problems?}}<br />
** [http://archives.postgresql.org/message-id/11500.1302097336@sss.pgh.pa.us mostly just needs testing]<br />
* {{MessageLink|4DA58686.1050501@enterprisedb.com| throw an error if you try to start from incomplete backup taken with pg_basebackup}}<br />
<br />
* {{MessageLink|4DEF5D4C.9070401@enterprisedb.com| Add regression tests for DDL commands with SSI}}<br />
* {{MessageLink|4DEFB5EA020000250003E3C3@gw.wicourts.gov|add a couple performance notes to the SSI docs}}<br />
** {{MessageLink|4DFB85C3020000250003E883@gw.wicourts.gov|patch to do this}}<br />
<br />
== Resolved Issues ==<br />
<br />
=== Issues Resolved Prior to Beta3 ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-06/msg01241.php ill-conceived grammar for NOT VALID constraints]<br />
* {{MessageLink|4DE8B1A8.2020505@enterprisedb.com| add a proof to README-SSI for the validity of the READ ONLY optimizations}}<br />
** {{MessageLink|20110602070419.GA10064@csail.mit.edu| Dan offered a concise proof, but we need a cite for one premise it uses}}<br />
** {{MessageLink|4DE67235020000250003DFBC@gw.wicourts.gov| Kevin offered a self-contained proof, but it's much longer than Dan's}}<br />
* {{MessageLink|4DF09C6F.70608@metrosystems.co.jp|FOREIGN TABLE doc fix}}<br />
* {{MessageLink|20110620091529.GG83336@csail.mit.edu|SxactGlobalXmin computed incorrectly after abort}}<br />
** {{MessageLink|20110621021828.GL83336@csail.mit.edu|patch}}<br />
<br />
=== Issues Resolved Prior to Beta2 ===<br />
* {{MessageLink|20110428194112.GB12161@tornado.leadboat.com|ALTER TYPE DROP + composite-typed col vs. pg_upgrade}}<br />
* {{MessageLink|4DB7D194.6030906@enterprisedb.com|Memory leak in foreign scans}}<br />
* [http://archives.postgresql.org/message-id/19738.1306338472@sss.pgh.pa.us vacuum sometimes fails to update relpages/reltuples]<br />
* {{MessageLink|BANLkTik8KxxjJ1KW-pO+WWBdTEAT+80ArQ@mail.gmail.com|SSI HOT chain traversal issue}}<br />
* {{MessageLink|201106031542.p53FgDh5073991@wwwmaster.postgresql.org|CTEs attached to data-modifying statements fail with rules (bug 6051)}}<br />
* {{MessageLink|BANLkTiknTSLMbGuTaOYg4O0V7UAMsr_rOA|NOT VALID constraints don't dump properly}}<br />
* {{MessageLink|201105261137.p4QBbTlt077425@wwwmaster.postgresql.org|HS slaves do not handle unlogged tables nicely (bug 6041)}}<br />
* {{MessageLink|4DC00203020000250003D1E8@gw.wicourts.gov|Make DDL commands SSI-aware}}<br />
** {{MessageLink|4DEBF171020000250003E1A3@gw.wicourts.gov|patch to fix}} (Problems found with patch, discussion and modifications in progress.)<br />
* {{MessageLink|7486.1306253575@sss.pgh.pa.us|Domains over arrays no longer match ANYARRAY}}<br />
* {{MessageLink|12187.1307391240@sss.pgh.pa.us|contrib/citext exposes implementation shortcomings in collation code (bug 6053)}}<br />
* {{MessageLink|9773.1307545767@sss.pgh.pa.us|error in information_schema.element_types view definition (bug 5926)}}<br />
<br />
=== Issues Resolved Prior to Beta1 ===<br />
<br />
* [http://archives.postgresql.org/message-id/AANLkTin4o+eSgQsP=0i6EM=Evu3oX=ewHp8Bwod1UaWZ@mail.gmail.com wal_buffers = -1 causes spurious chatter on reload]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=jCTGC+Qxfkum6EXcw8q4tRFd2KHNpe+4HYUeS@mail.gmail.com is there a safeguard to prevent recovery from pausing before consistency is reached?]<br />
* {{MessageLink|09B23E7BF70425478C1330D893A722C602FEC019BD@MailSVR.invera.com|Walreceiver crashes in AIX}}<br />
** {{MessageLink|4C753155.3070708@ca.afilias.info|Steve Singer can't reproduce, suggests a possible way this could be pilot error}}<br />
* {{MessageLink|AANLkTik9HZi8GfSiKuHVY2N7g7xDV+sN46eRxPbOjO7P@mail.gmail.com|bug of the hot standby feedback}}<br />
* {{MessageLink|201011271931.oARJVV427882@momjian.us|GIST rewrite vs. pg_upgrade}}<br />
* [http://archives.postgresql.org/message-id/AANLkTin3PPOwXq2Cpf+tLNNKSv4OmHfDn5qr0aoeczA-@mail.gmail.com replication/README needs to be updated]<br />
* [http://archives.postgresql.org/message-id/AANLkTikJmP+bo1N-mFUWEpJiV6_OKisYw512OGeTJUbm@mail.gmail.com backend wrongly waits for sync rep even though max_wal_senders = 0]<br />
* [http://archives.postgresql.org/message-id/AANLkTik4tuG2EA6oeiov1=DO6UcDoARP45Lk+KGfy7HC@mail.gmail.com reload of the configuration file should not cause the server to end unexpectedly]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-11/msg00002.php Fix ecpg preprocessor regression]<br />
* {{MessageLink|AANLkTimDUiRrrWzZ2ZXWSRfeP5tHm9PGmpp6zfqaFpte@mail.gmail.com|Once sync_standbys_defined becomes true, there's no way for it to ever become false again. That can't be right.}}<br />
* {{MessageLink|AANLkTimfvNzJy490wp95vP1RLumkxEZBTWUjhA_Dd1QS@mail.gmail.com|backend no longer needs to wait for replication when synchronous_standby_names is set to '' and configuration file is reloaded.}}<br />
* [http://archives.postgresql.org/message-id/AANLkTi=z+eg6yMEtJCqYb0OnGaNMoTg9eYiK-xm8v0AQ@mail.gmail.com pause_at_recovery_target needs to be added in recovery.conf.sample]<br />
* {{MessageLink|201101181630.p0IGU45v047971@wwwmaster.postgresql.org|BUG #5842: Memory leak in PL/Python when taking slices of results}}<br />
* {{MessageLink|4D5EB8ED.7010002@enterprisedb.com|hot standby feedback message needs to be explained at protocol-replication.html}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg00720.php consider renaming ident to peer authentication on local connections]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=q5=x_WoOccf5dqHoAWXyDc7qXG0cpbnsNKUCB@mail.gmail.com serious problem by multiple backups]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=qR0zy5cS98n=TDCxda5sj=NODDA=xAZUCdJOE@mail.gmail.com FK constraints "NOT VALID" by default]<br />
* [http://archives.postgresql.org/message-id/AANLkTikvpjkUjwksZK=wOw7CDyMtaVEddmPN9yFv7jX-@mail.gmail.com sync rep very slow when fsync=off on standby]<br />
* [http://archives.postgresql.org/message-id/AANLkTinbzFaJXkzwm2xEegfytK1LPw8odo61wgZkkGp=@mail.gmail.com No longer need to check synchronous_standby_names and max_wal_senders at startup]<br />
* {{MessageLink|201012071131.55211.gabi.julien@broadsign.com|pg_last_xact_replay_timestamp limitations}}<br />
** rhaas says: It's not clear there's an action item here, so moving to resolved. Feel free to move back with more details.<br />
* [http://archives.postgresql.org/message-id/AANLkTikUAOYoStwkwG+DZOzTwT2QVj0H9aDywpzxvhxn@mail.gmail.com either remove write_location completely, or revert the change that broke it]<br />
** [http://archives.postgresql.org/message-id/AANLkTinHrymKd56m5AfawCdujuNM6B2g_--9UiOSSKGx@mail.gmail.com original report of problem with write_location] (but the other issues in that email are now fixed)<br />
** [http://archives.postgresql.org/message-id/AANLkTikphXd4LMXXOAZJg2s_8q0Fu5e3uuec4BG4xD4f@mail.gmail.com possible patch]<br />
* {{MessageLink|29244.1295376372@sss.pgh.pa.us|DO blocks leak memory}}<br />
* {{MessageLink|AANLkTimnwxEv-ZbqBLCSBSvmq-80vzvDb2u0pPchGm2r@mail.gmail.com|replication timeout}}<br />
* {{MessageLink|1293977249.5984.17.camel@vanquo.pezone.net|raise protocol version number}}<br />
** [http://archives.postgresql.org/message-id/1240.1301353669@sss.pgh.pa.us not done, would break things]<br />
* [http://archives.postgresql.org/message-id/AANLkTik6ArKPwnvA8_9XHo9j9+w4A2UEnsheX-mwR=Aj@mail.gmail.com fix attinhcount tracking]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00033.php SSI: SIReadLock lines in pg_locks don't show pid]<br />
** questions have been raised whether pid should be suppressed after connection closes or even maybe after transaction completion<br />
* [http://archives.postgresql.org/message-id/AANLkTi%3DkoHqna9WMm8_ATJN6c1GOLLp_0Tx6VswKhAdi%40mail.gmail.com synchronous_commit and synchronous_replication]<br />
* {{MessageLink|AANLkTimUuzn3DAaO1OLCijjjfcw0MXL-0HAvZKr9xRyV@mail.gmail.com|conversion from integer literals to money type}}<br />
* [http://archives.postgresql.org/message-id/4D892157.7070607@lelarge.info comments on SQL/MED objects]<br />
** rhaas says: [http://archives.postgresql.org/message-id/AANLkTimS1tDEuEeocz9PBjQ-RLSdjJ+VKFc_+F+Jm3Ou@mail.gmail.com proposed patch]<br />
** rhaas says: thom brown points out that [http://archives.postgresql.org/message-id/AANLkTi==NemsL7Vo=YfsPc8DdZActYF7ZwCFVO-0Nu21@mail.gmail.com I forgot USER MAPPINGs]<br />
** [http://archives.postgresql.org/message-id/7102.1302029271@sss.pgh.pa.us fixing user mappings opens unduly large can of worms]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-03/msg00352.php SSI: assertion failure on marking conflict-in due to race condition]<br />
** existing patch to recheck after trading shared LW locks for exclusive should fix<br />
* [http://archives.postgresql.org/message-id/29987.1301930239@sss.pgh.pa.us keywords table needs updating]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01170.php SSI: clumsy error handling results in generic message rather than more specific message with hint]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00157.php SSI: disable optimization when in subtransaction]<br />
* [http://archives.postgresql.org/message-id/4D9DD97B020000250003C53C@gw.wicourts.gov SSI: LOG message about SLRU wrap-around]<br />
* [http://archives.postgresql.org/message-id/BANLkTi=W8OrvqLHS+suU8R2b_rhFaqeEaw@mail.gmail.com sync rep and fast shutdown]<br />
** rhaas says: no easy resolution, i guess we'll leave this alone for now?<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01913.php SSI: three different HTABs contend for shared memory in a free-for-all]<br />
** Heikki proposes "We'll need to teach dynahash not to allocate any more entries after the preallocation. A new HASH_NO_GROW flag to hash_create() seems like a suitable interface."<br />
** [http://archives.postgresql.org/message-id/BANLkTimVuicyZG4j3F427BgfA2iYP8Od_Q@mail.gmail.com Alternatively, we could just use an initial allocation which matches the maximum entries, to ensure that all HTABs can allocate ''at least'' the configured maximum.] (There's an existing patch for that.)<br />
** I committed a patch to add the new flag to hash_create() - Heikki<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00083.php SSI: failure to clean up some SLRU-summarized locks]<br />
** existing patch to properly set commitSeqNo on the offending locks should fix<br />
* {{MessageLink|20110410103636.GC10697@tornado.leadboat.com|ALTER TABLE ADD COLUMN not creating TOAST tables for inheritance children}}<br />
* {{MessageLink|20110410015728.GA10162@tornado.leadboat.com|typed table DDL loose ends}}<br />
* [http://archives.postgresql.org/message-id/29173.1301114203@sss.pgh.pa.us assorted collation issues]<br />
** [http://archives.postgresql.org/message-id/21742.1303137667@sss.pgh.pa.us almost there...]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00334.php pl/python traceback fix]<br />
* [http://archives.postgresql.org/message-id/BANLkTinz2do3EqWO9RH66bY2FBOmL=uSoA@mail.gmail.com gin indexes don't get used unless you vacuum]<br />
* [http://archives.postgresql.org/message-id/4DB015B3020000250003CB51@gw.wicourts.gov SSI: High-contention UPDATE load on RAM disk database slows non-serializable transactions by a fraction of a percent]<br />
** [http://archives.postgresql.org/message-id/20110422220734.GG57793@csail.mit.edu Or speeds it up by a fraction of a percent. Either way it's too close to tease out from the noise easily.]<br />
** [http://archives.postgresql.org/message-id/20110425033308.GJ57793@csail.mit.edu As insurance against any performance hit in extreme high contention loads, it might be worthwhile to do a quick return (without taking any locks) if no serializable transactions are active.]<br />
* [http://archives.postgresql.org/message-id/201104201411.p3KEBOfA009414@wwwmaster.postgresql.org SSI: UPDATE setting a TOASTed value is broken, regardless of transaction isolation level]<br />
** [http://archives.postgresql.org/message-id/4DAEF7E6.7080107@enterprisedb.com proposed patch]<br />
* [http://archives.postgresql.org/message-id/BANLkTim6a_rfZ+UPPATTXap9Ed7-X7BzoA@mail.gmail.com foreign table permissions issues]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-04/msg00141.php CREATE TABLE IF NOT EXISTS doesn't work]<br />
* {{MessageLink|201103111328.p2BDSFd10499@momjian.us|Typed-tables patch broke pg_upgrade}}<br />
** {{MessageLink|20110418235041.GB2769@tornado.leadboat.com|proposed patch (tt2v2-binary-upgrade.patch)}}<br />
<br />
[[Category:PostgreSQL 9.1]]</div>Drkphttps://wiki.postgresql.org/index.php?title=PostgreSQL_9.1_Open_Items&diff=14676PostgreSQL 9.1 Open Items2011-06-21T02:20:22Z<p>Drkp: patch for SxactGlobalXmin issue</p>
<hr />
<div>== Project Planning ==<br />
See the [[PostgreSQL 9.1 Development Plan]].<br />
<br />
== Meta-Issues ==<br />
* [[standard_conforming_strings]] -- readiness of drivers and applications<br />
* Review "Long Term" list of items from [[PostgreSQL 9.0 Open Items]]<br />
<br />
== Open Issues ==<br />
<br />
=== Blockers for Beta3 ===<br />
* {{MessageLink|19848.1307666208@sss.pgh.pa.us|plpython isn't connected up to tarball building}}<br />
* {{MessageLink|28389.1308264888@sss.pgh.pa.us|ALTER TABLE lock strength reduction patch is unsafe}}<br />
* {{MessageLink|20110620091529.GG83336@csail.mit.edu|SxactGlobalXmin computed incorrectly after abort}}<br />
** {{MessageLink|20110621021828.GL83336@csail.mit.edu|patch}}<br />
=== Not Blockers for Beta3 ===<br />
* {{MessageLink|4CEA5A0F.1030602@enterprisedb.com|do latches have memory-ordering problems?}}<br />
** [http://archives.postgresql.org/message-id/11500.1302097336@sss.pgh.pa.us mostly just needs testing]<br />
* {{MessageLink|4DA58686.1050501@enterprisedb.com| throw an error if you try to start from incomplete backup taken with pg_basebackup}}<br />
<br />
* {{MessageLink|4DEF5D4C.9070401@enterprisedb.com| Add regression tests for DDL commands with SSI}}<br />
* {{MessageLink|4DEFB5EA020000250003E3C3@gw.wicourts.gov|add a couple performance notes to the SSI docs}}<br />
** {{MessageLink|4DFB85C3020000250003E883@gw.wicourts.gov|patch to do this}}<br />
<br />
== Resolved Issues ==<br />
<br />
=== Issues Resolved Prior to Beta3 ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-06/msg01241.php ill-conceived grammar for NOT VALID constraints]<br />
* {{MessageLink|4DE8B1A8.2020505@enterprisedb.com| add a proof to README-SSI for the validity of the READ ONLY optimizations}}<br />
** {{MessageLink|20110602070419.GA10064@csail.mit.edu| Dan offered a concise proof, but we need a cite for one premise it uses}}<br />
** {{MessageLink|4DE67235020000250003DFBC@gw.wicourts.gov| Kevin offered a self-contained proof, but it's much longer than Dan's}}<br />
* {{MessageLink|4DF09C6F.70608@metrosystems.co.jp|FOREIGN TABLE doc fix}}<br />
<br />
=== Issues Resolved Prior to Beta2 ===<br />
* {{MessageLink|20110428194112.GB12161@tornado.leadboat.com|ALTER TYPE DROP + composite-typed col vs. pg_upgrade}}<br />
* {{MessageLink|4DB7D194.6030906@enterprisedb.com|Memory leak in foreign scans}}<br />
* [http://archives.postgresql.org/message-id/19738.1306338472@sss.pgh.pa.us vacuum sometimes fails to update relpages/reltuples]<br />
* {{MessageLink|BANLkTik8KxxjJ1KW-pO+WWBdTEAT+80ArQ@mail.gmail.com|SSI HOT chain traversal issue}}<br />
* {{MessageLink|201106031542.p53FgDh5073991@wwwmaster.postgresql.org|CTEs attached to data-modifying statements fail with rules (bug 6051)}}<br />
* {{MessageLink|BANLkTiknTSLMbGuTaOYg4O0V7UAMsr_rOA|NOT VALID constraints don't dump properly}}<br />
* {{MessageLink|201105261137.p4QBbTlt077425@wwwmaster.postgresql.org|HS slaves do not handle unlogged tables nicely (bug 6041)}}<br />
* {{MessageLink|4DC00203020000250003D1E8@gw.wicourts.gov|Make DDL commands SSI-aware}}<br />
** {{MessageLink|4DEBF171020000250003E1A3@gw.wicourts.gov|patch to fix}} (Problems found with patch, discussion and modifications in progress.)<br />
* {{MessageLink|7486.1306253575@sss.pgh.pa.us|Domains over arrays no longer match ANYARRAY}}<br />
* {{MessageLink|12187.1307391240@sss.pgh.pa.us|contrib/citext exposes implementation shortcomings in collation code (bug 6053)}}<br />
* {{MessageLink|9773.1307545767@sss.pgh.pa.us|error in information_schema.element_types view definition (bug 5926)}}<br />
<br />
=== Issues Resolved Prior to Beta1 ===<br />
<br />
* [http://archives.postgresql.org/message-id/AANLkTin4o+eSgQsP=0i6EM=Evu3oX=ewHp8Bwod1UaWZ@mail.gmail.com wal_buffers = -1 causes spurious chatter on reload]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=jCTGC+Qxfkum6EXcw8q4tRFd2KHNpe+4HYUeS@mail.gmail.com is there a safeguard to prevent recovery from pausing before consistency is reached?]<br />
* {{MessageLink|09B23E7BF70425478C1330D893A722C602FEC019BD@MailSVR.invera.com|Walreceiver crashes in AIX}}<br />
** {{MessageLink|4C753155.3070708@ca.afilias.info|Steve Singer can't reproduce, suggests a possible way this could be pilot error}}<br />
* {{MessageLink|AANLkTik9HZi8GfSiKuHVY2N7g7xDV+sN46eRxPbOjO7P@mail.gmail.com|bug of the hot standby feedback}}<br />
* {{MessageLink|201011271931.oARJVV427882@momjian.us|GIST rewrite vs. pg_upgrade}}<br />
* [http://archives.postgresql.org/message-id/AANLkTin3PPOwXq2Cpf+tLNNKSv4OmHfDn5qr0aoeczA-@mail.gmail.com replication/README needs to be updated]<br />
* [http://archives.postgresql.org/message-id/AANLkTikJmP+bo1N-mFUWEpJiV6_OKisYw512OGeTJUbm@mail.gmail.com backend wrongly waits for sync rep even though max_wal_senders = 0]<br />
* [http://archives.postgresql.org/message-id/AANLkTik4tuG2EA6oeiov1=DO6UcDoARP45Lk+KGfy7HC@mail.gmail.com reload of the configuration file should not cause the server to end unexpectedly]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-11/msg00002.php Fix ecpg preprocessor regression]<br />
* {{MessageLink|AANLkTimDUiRrrWzZ2ZXWSRfeP5tHm9PGmpp6zfqaFpte@mail.gmail.com|Once sync_standbys_defined becomes true, there's no way for it to ever become false again. That can't be right.}}<br />
* {{MessageLink|AANLkTimfvNzJy490wp95vP1RLumkxEZBTWUjhA_Dd1QS@mail.gmail.com|backend no longer needs to wait for replication when synchronous_standby_names is set to '' and configuration file is reloaded.}}<br />
* [http://archives.postgresql.org/message-id/AANLkTi=z+eg6yMEtJCqYb0OnGaNMoTg9eYiK-xm8v0AQ@mail.gmail.com pause_at_recovery_target needs to be added in recovery.conf.sample]<br />
* {{MessageLink|201101181630.p0IGU45v047971@wwwmaster.postgresql.org|BUG #5842: Memory leak in PL/Python when taking slices of results}}<br />
* {{MessageLink|4D5EB8ED.7010002@enterprisedb.com|hot standby feedback message needs to be explained at protocol-replication.html}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg00720.php consider renaming ident to peer authentication on local connections]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=q5=x_WoOccf5dqHoAWXyDc7qXG0cpbnsNKUCB@mail.gmail.com serious problem by multiple backups]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=qR0zy5cS98n=TDCxda5sj=NODDA=xAZUCdJOE@mail.gmail.com FK constraints "NOT VALID" by default]<br />
* [http://archives.postgresql.org/message-id/AANLkTikvpjkUjwksZK=wOw7CDyMtaVEddmPN9yFv7jX-@mail.gmail.com sync rep very slow when fsync=off on standby]<br />
* [http://archives.postgresql.org/message-id/AANLkTinbzFaJXkzwm2xEegfytK1LPw8odo61wgZkkGp=@mail.gmail.com No longer need to check synchronous_standby_names and max_wal_senders at startup]<br />
* {{MessageLink|201012071131.55211.gabi.julien@broadsign.com|pg_last_xact_replay_timestamp limitations}}<br />
** rhaas says: It's not clear there's an action item here, so moving to resolved. Feel free to move back with more details.<br />
* [http://archives.postgresql.org/message-id/AANLkTikUAOYoStwkwG+DZOzTwT2QVj0H9aDywpzxvhxn@mail.gmail.com either remove write_location completely, or revert the change that broke it]<br />
** [http://archives.postgresql.org/message-id/AANLkTinHrymKd56m5AfawCdujuNM6B2g_--9UiOSSKGx@mail.gmail.com original report of problem with write_location] (but the other issues in that email are now fixed)<br />
** [http://archives.postgresql.org/message-id/AANLkTikphXd4LMXXOAZJg2s_8q0Fu5e3uuec4BG4xD4f@mail.gmail.com possible patch]<br />
* {{MessageLink|29244.1295376372@sss.pgh.pa.us|DO blocks leak memory}}<br />
* {{MessageLink|AANLkTimnwxEv-ZbqBLCSBSvmq-80vzvDb2u0pPchGm2r@mail.gmail.com|replication timeout}}<br />
* {{MessageLink|1293977249.5984.17.camel@vanquo.pezone.net|raise protocol version number}}<br />
** [http://archives.postgresql.org/message-id/1240.1301353669@sss.pgh.pa.us not done, would break things]<br />
* [http://archives.postgresql.org/message-id/AANLkTik6ArKPwnvA8_9XHo9j9+w4A2UEnsheX-mwR=Aj@mail.gmail.com fix attinhcount tracking]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00033.php SSI: SIReadLock lines in pg_locks don't show pid]<br />
** questions have been raised whether pid should be suppressed after connection closes or even maybe after transaction completion<br />
* [http://archives.postgresql.org/message-id/AANLkTi%3DkoHqna9WMm8_ATJN6c1GOLLp_0Tx6VswKhAdi%40mail.gmail.com synchronous_commit and synchronous_replication]<br />
* {{MessageLink|AANLkTimUuzn3DAaO1OLCijjjfcw0MXL-0HAvZKr9xRyV@mail.gmail.com|conversion from integer literals to money type}}<br />
* [http://archives.postgresql.org/message-id/4D892157.7070607@lelarge.info comments on SQL/MED objects]<br />
** rhaas says: [http://archives.postgresql.org/message-id/AANLkTimS1tDEuEeocz9PBjQ-RLSdjJ+VKFc_+F+Jm3Ou@mail.gmail.com proposed patch]<br />
** rhaas says: thom brown points out that [http://archives.postgresql.org/message-id/AANLkTi==NemsL7Vo=YfsPc8DdZActYF7ZwCFVO-0Nu21@mail.gmail.com I forgot USER MAPPINGs]<br />
** [http://archives.postgresql.org/message-id/7102.1302029271@sss.pgh.pa.us fixing user mappings opens unduly large can of worms]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-03/msg00352.php SSI: assertion failure on marking conflict-in due to race condition]<br />
** existing patch to recheck after trading shared LW locks for exclusive should fix<br />
* [http://archives.postgresql.org/message-id/29987.1301930239@sss.pgh.pa.us keywords table needs updating]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01170.php SSI: clumsy error handling results in generic message rather than more specific message with hint]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00157.php SSI: disable optimization when in subtransaction]<br />
* [http://archives.postgresql.org/message-id/4D9DD97B020000250003C53C@gw.wicourts.gov SSI: LOG message about SLRU wrap-around]<br />
* [http://archives.postgresql.org/message-id/BANLkTi=W8OrvqLHS+suU8R2b_rhFaqeEaw@mail.gmail.com sync rep and fast shutdown]<br />
** rhaas says: no easy resolution, i guess we'll leave this alone for now?<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01913.php SSI: three different HTABs contend for shared memory in a free-for-all]<br />
** Heikki proposes "We'll need to teach dynahash not to allocate any more entries after the preallocation. A new HASH_NO_GROW flag to hash_create() seems like a suitable interface."<br />
** [http://archives.postgresql.org/message-id/BANLkTimVuicyZG4j3F427BgfA2iYP8Od_Q@mail.gmail.com Alternatively, we could just use an initial allocation which matches the maximum entries, to ensure that all HTABs can allocate ''at least'' the configured maximum.] (There's an existing patch for that.)<br />
** I committed a patch to add the new flag to hash_create() - Heikki<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00083.php SSI: failure to clean up some SLRU-summarized locks]<br />
** existing patch to properly set commitSeqNo on the offending locks should fix<br />
* {{MessageLink|20110410103636.GC10697@tornado.leadboat.com|ALTER TABLE ADD COLUMN not creating TOAST tables for inheritance children}}<br />
* {{MessageLink|20110410015728.GA10162@tornado.leadboat.com|typed table DDL loose ends}}<br />
* [http://archives.postgresql.org/message-id/29173.1301114203@sss.pgh.pa.us assorted collation issues]<br />
** [http://archives.postgresql.org/message-id/21742.1303137667@sss.pgh.pa.us almost there...]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00334.php pl/python traceback fix]<br />
* [http://archives.postgresql.org/message-id/BANLkTinz2do3EqWO9RH66bY2FBOmL=uSoA@mail.gmail.com gin indexes don't get used unless you vacuum]<br />
* [http://archives.postgresql.org/message-id/4DB015B3020000250003CB51@gw.wicourts.gov SSI: High-contention UPDATE load on RAM disk database slows non-serializable transactions by a fraction of a percent]<br />
** [http://archives.postgresql.org/message-id/20110422220734.GG57793@csail.mit.edu Or speeds it up by a fraction of a percent. Either way it's too close to tease out from the noise easily.]<br />
** [http://archives.postgresql.org/message-id/20110425033308.GJ57793@csail.mit.edu As insurance against any performance hit in extreme high contention loads, it might be worthwhile to do a quick return (without taking any locks) if no serializable transactions are active.]<br />
* [http://archives.postgresql.org/message-id/201104201411.p3KEBOfA009414@wwwmaster.postgresql.org SSI: UPDATE setting a TOASTed value is broken, regardless of transaction isolation level]<br />
** [http://archives.postgresql.org/message-id/4DAEF7E6.7080107@enterprisedb.com proposed patch]<br />
* [http://archives.postgresql.org/message-id/BANLkTim6a_rfZ+UPPATTXap9Ed7-X7BzoA@mail.gmail.com foreign table permissions issues]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-04/msg00141.php CREATE TABLE IF NOT EXISTS doesn't work]<br />
* {{MessageLink|201103111328.p2BDSFd10499@momjian.us|Typed-tables patch broke pg_upgrade}}<br />
** {{MessageLink|20110418235041.GB2769@tornado.leadboat.com|proposed patch (tt2v2-binary-upgrade.patch)}}<br />
<br />
[[Category:PostgreSQL 9.1]]</div>Drkphttps://wiki.postgresql.org/index.php?title=PostgreSQL_9.1_Open_Items&diff=14671PostgreSQL 9.1 Open Items2011-06-20T09:16:37Z<p>Drkp: add SxactGlobalXmin issue</p>
<hr />
<div>== Project Planning ==<br />
See the [[PostgreSQL 9.1 Development Plan]].<br />
<br />
== Meta-Issues ==<br />
* [[standard_conforming_strings]] -- readiness of drivers and applications<br />
* Review "Long Term" list of items from [[PostgreSQL 9.0 Open Items]]<br />
<br />
== Open Issues ==<br />
<br />
=== Blockers for Beta3 ===<br />
* {{MessageLink|19848.1307666208@sss.pgh.pa.us|plpython isn't connected up to tarball building}}<br />
* {{MessageLink|28389.1308264888@sss.pgh.pa.us|ALTER TABLE lock strength reduction patch is unsafe}}<br />
* {{MessageLink|20110620091529.GG83336@csail.mit.edu|SxactGlobalXmin computed incorrectly after abort}}<br />
=== Not Blockers for Beta3 ===<br />
* {{MessageLink|4CEA5A0F.1030602@enterprisedb.com|do latches have memory-ordering problems?}}<br />
** [http://archives.postgresql.org/message-id/11500.1302097336@sss.pgh.pa.us mostly just needs testing]<br />
* {{MessageLink|4DA58686.1050501@enterprisedb.com| throw an error if you try to start from incomplete backup taken with pg_basebackup}}<br />
<br />
* {{MessageLink|4DEF5D4C.9070401@enterprisedb.com| Add regression tests for DDL commands with SSI}}<br />
* {{MessageLink|4DEFB5EA020000250003E3C3@gw.wicourts.gov|add a couple performance notes to the SSI docs}}<br />
** {{MessageLink|4DFB85C3020000250003E883@gw.wicourts.gov|patch to do this}}<br />
<br />
== Resolved Issues ==<br />
<br />
=== Issues Resolved Prior to Beta3 ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-06/msg01241.php ill-conceived grammar for NOT VALID constraints]<br />
* {{MessageLink|4DE8B1A8.2020505@enterprisedb.com| add a proof to README-SSI for the validity of the READ ONLY optimizations}}<br />
** {{MessageLink|20110602070419.GA10064@csail.mit.edu| Dan offered a concise proof, but we need a cite for one premise it uses}}<br />
** {{MessageLink|4DE67235020000250003DFBC@gw.wicourts.gov| Kevin offered a self-contained proof, but it's much longer than Dan's}}<br />
* {{MessageLink|4DF09C6F.70608@metrosystems.co.jp|FOREIGN TABLE doc fix}}<br />
<br />
=== Issues Resolved Prior to Beta2 ===<br />
* {{MessageLink|20110428194112.GB12161@tornado.leadboat.com|ALTER TYPE DROP + composite-typed col vs. pg_upgrade}}<br />
* {{MessageLink|4DB7D194.6030906@enterprisedb.com|Memory leak in foreign scans}}<br />
* [http://archives.postgresql.org/message-id/19738.1306338472@sss.pgh.pa.us vacuum sometimes fails to update relpages/reltuples]<br />
* {{MessageLink|BANLkTik8KxxjJ1KW-pO+WWBdTEAT+80ArQ@mail.gmail.com|SSI HOT chain traversal issue}}<br />
* {{MessageLink|201106031542.p53FgDh5073991@wwwmaster.postgresql.org|CTEs attached to data-modifying statements fail with rules (bug 6051)}}<br />
* {{MessageLink|BANLkTiknTSLMbGuTaOYg4O0V7UAMsr_rOA|NOT VALID constraints don't dump properly}}<br />
* {{MessageLink|201105261137.p4QBbTlt077425@wwwmaster.postgresql.org|HS slaves do not handle unlogged tables nicely (bug 6041)}}<br />
* {{MessageLink|4DC00203020000250003D1E8@gw.wicourts.gov|Make DDL commands SSI-aware}}<br />
** {{MessageLink|4DEBF171020000250003E1A3@gw.wicourts.gov|patch to fix}} (Problems found with patch, discussion and modifications in progress.)<br />
* {{MessageLink|7486.1306253575@sss.pgh.pa.us|Domains over arrays no longer match ANYARRAY}}<br />
* {{MessageLink|12187.1307391240@sss.pgh.pa.us|contrib/citext exposes implementation shortcomings in collation code (bug 6053)}}<br />
* {{MessageLink|9773.1307545767@sss.pgh.pa.us|error in information_schema.element_types view definition (bug 5926)}}<br />
<br />
=== Issues Resolved Prior to Beta1 ===<br />
<br />
* [http://archives.postgresql.org/message-id/AANLkTin4o+eSgQsP=0i6EM=Evu3oX=ewHp8Bwod1UaWZ@mail.gmail.com wal_buffers = -1 causes spurious chatter on reload]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=jCTGC+Qxfkum6EXcw8q4tRFd2KHNpe+4HYUeS@mail.gmail.com is there a safeguard to prevent recovery from pausing before consistency is reached?]<br />
* {{MessageLink|09B23E7BF70425478C1330D893A722C602FEC019BD@MailSVR.invera.com|Walreceiver crashes in AIX}}<br />
** {{MessageLink|4C753155.3070708@ca.afilias.info|Steve Singer can't reproduce, suggests a possible way this could be pilot error}}<br />
* {{MessageLink|AANLkTik9HZi8GfSiKuHVY2N7g7xDV+sN46eRxPbOjO7P@mail.gmail.com|bug of the hot standby feedback}}<br />
* {{MessageLink|201011271931.oARJVV427882@momjian.us|GIST rewrite vs. pg_upgrade}}<br />
* [http://archives.postgresql.org/message-id/AANLkTin3PPOwXq2Cpf+tLNNKSv4OmHfDn5qr0aoeczA-@mail.gmail.com replication/README needs to be updated]<br />
* [http://archives.postgresql.org/message-id/AANLkTikJmP+bo1N-mFUWEpJiV6_OKisYw512OGeTJUbm@mail.gmail.com backend wrongly waits for sync rep even though max_wal_senders = 0]<br />
* [http://archives.postgresql.org/message-id/AANLkTik4tuG2EA6oeiov1=DO6UcDoARP45Lk+KGfy7HC@mail.gmail.com reload of the configuration file should not cause the server to end unexpectedly]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-11/msg00002.php Fix ecpg preprocessor regression]<br />
* {{MessageLink|AANLkTimDUiRrrWzZ2ZXWSRfeP5tHm9PGmpp6zfqaFpte@mail.gmail.com|Once sync_standbys_defined becomes true, there's no way for it to ever become false again. That can't be right.}}<br />
* {{MessageLink|AANLkTimfvNzJy490wp95vP1RLumkxEZBTWUjhA_Dd1QS@mail.gmail.com|backend no longer needs to wait for replication when synchronous_standby_names is set to '' and configuration file is reloaded.}}<br />
* [http://archives.postgresql.org/message-id/AANLkTi=z+eg6yMEtJCqYb0OnGaNMoTg9eYiK-xm8v0AQ@mail.gmail.com pause_at_recovery_target needs to be added in recovery.conf.sample]<br />
* {{MessageLink|201101181630.p0IGU45v047971@wwwmaster.postgresql.org|BUG #5842: Memory leak in PL/Python when taking slices of results}}<br />
* {{MessageLink|4D5EB8ED.7010002@enterprisedb.com|hot standby feedback message needs to be explained at protocol-replication.html}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg00720.php consider renaming ident to peer authentication on local connections]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=q5=x_WoOccf5dqHoAWXyDc7qXG0cpbnsNKUCB@mail.gmail.com serious problem by multiple backups]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=qR0zy5cS98n=TDCxda5sj=NODDA=xAZUCdJOE@mail.gmail.com FK constraints "NOT VALID" by default]<br />
* [http://archives.postgresql.org/message-id/AANLkTikvpjkUjwksZK=wOw7CDyMtaVEddmPN9yFv7jX-@mail.gmail.com sync rep very slow when fsync=off on standby]<br />
* [http://archives.postgresql.org/message-id/AANLkTinbzFaJXkzwm2xEegfytK1LPw8odo61wgZkkGp=@mail.gmail.com No longer need to check synchronous_standby_names and max_wal_senders at startup]<br />
* {{MessageLink|201012071131.55211.gabi.julien@broadsign.com|pg_last_xact_replay_timestamp limitations}}<br />
** rhaas says: It's not clear there's an action item here, so moving to resolved. Feel free to move back with more details.<br />
* [http://archives.postgresql.org/message-id/AANLkTikUAOYoStwkwG+DZOzTwT2QVj0H9aDywpzxvhxn@mail.gmail.com either remove write_location completely, or revert the change that broke it]<br />
** [http://archives.postgresql.org/message-id/AANLkTinHrymKd56m5AfawCdujuNM6B2g_--9UiOSSKGx@mail.gmail.com original report of problem with write_location] (but the other issues in that email are now fixed)<br />
** [http://archives.postgresql.org/message-id/AANLkTikphXd4LMXXOAZJg2s_8q0Fu5e3uuec4BG4xD4f@mail.gmail.com possible patch]<br />
* {{MessageLink|29244.1295376372@sss.pgh.pa.us|DO blocks leak memory}}<br />
* {{MessageLink|AANLkTimnwxEv-ZbqBLCSBSvmq-80vzvDb2u0pPchGm2r@mail.gmail.com|replication timeout}}<br />
* {{MessageLink|1293977249.5984.17.camel@vanquo.pezone.net|raise protocol version number}}<br />
** [http://archives.postgresql.org/message-id/1240.1301353669@sss.pgh.pa.us not done, would break things]<br />
* [http://archives.postgresql.org/message-id/AANLkTik6ArKPwnvA8_9XHo9j9+w4A2UEnsheX-mwR=Aj@mail.gmail.com fix attinhcount tracking]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00033.php SSI: SIReadLock lines in pg_locks don't show pid]<br />
** questions have been raised whether pid should be suppressed after connection closes or even maybe after transaction completion<br />
* [http://archives.postgresql.org/message-id/AANLkTi%3DkoHqna9WMm8_ATJN6c1GOLLp_0Tx6VswKhAdi%40mail.gmail.com synchronous_commit and synchronous_replication]<br />
* {{MessageLink|AANLkTimUuzn3DAaO1OLCijjjfcw0MXL-0HAvZKr9xRyV@mail.gmail.com|conversion from integer literals to money type}}<br />
* [http://archives.postgresql.org/message-id/4D892157.7070607@lelarge.info comments on SQL/MED objects]<br />
** rhaas says: [http://archives.postgresql.org/message-id/AANLkTimS1tDEuEeocz9PBjQ-RLSdjJ+VKFc_+F+Jm3Ou@mail.gmail.com proposed patch]<br />
** rhaas says: thom brown points out that [http://archives.postgresql.org/message-id/AANLkTi==NemsL7Vo=YfsPc8DdZActYF7ZwCFVO-0Nu21@mail.gmail.com I forgot USER MAPPINGs]<br />
** [http://archives.postgresql.org/message-id/7102.1302029271@sss.pgh.pa.us fixing user mappings opens unduly large can of worms]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-03/msg00352.php SSI: assertion failure on marking conflict-in due to race condition]<br />
** existing patch to recheck after trading shared LW locks for exclusive should fix<br />
* [http://archives.postgresql.org/message-id/29987.1301930239@sss.pgh.pa.us keywords table needs updating]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01170.php SSI: clumsy error handling results in generic message rather than more specific message with hint]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00157.php SSI: disable optimization when in subtransaction]<br />
* [http://archives.postgresql.org/message-id/4D9DD97B020000250003C53C@gw.wicourts.gov SSI: LOG message about SLRU wrap-around]<br />
* [http://archives.postgresql.org/message-id/BANLkTi=W8OrvqLHS+suU8R2b_rhFaqeEaw@mail.gmail.com sync rep and fast shutdown]<br />
** rhaas says: no easy resolution, i guess we'll leave this alone for now?<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01913.php SSI: three different HTABs contend for shared memory in a free-for-all]<br />
** Heikki proposes "We'll need to teach dynahash not to allocate any more entries after the preallocation. A new HASH_NO_GROW flag to hash_create() seems like a suitable interface."<br />
** [http://archives.postgresql.org/message-id/BANLkTimVuicyZG4j3F427BgfA2iYP8Od_Q@mail.gmail.com Alternatively, we could just use an initial allocation which matches the maximum entries, to ensure that all HTABs can allocate ''at least'' the configured maximum.] (There's an existing patch for that.)<br />
** I committed a patch to add the new flag to hash_create() - Heikki<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00083.php SSI: failure to clean up some SLRU-summarized locks]<br />
** existing patch to properly set commitSeqNo on the offending locks should fix<br />
* {{MessageLink|20110410103636.GC10697@tornado.leadboat.com|ALTER TABLE ADD COLUMN not creating TOAST tables for inheritance children}}<br />
* {{MessageLink|20110410015728.GA10162@tornado.leadboat.com|typed table DDL loose ends}}<br />
* [http://archives.postgresql.org/message-id/29173.1301114203@sss.pgh.pa.us assorted collation issues]<br />
** [http://archives.postgresql.org/message-id/21742.1303137667@sss.pgh.pa.us almost there...]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00334.php pl/python traceback fix]<br />
* [http://archives.postgresql.org/message-id/BANLkTinz2do3EqWO9RH66bY2FBOmL=uSoA@mail.gmail.com gin indexes don't get used unless you vacuum]<br />
* [http://archives.postgresql.org/message-id/4DB015B3020000250003CB51@gw.wicourts.gov SSI: High-contention UPDATE load on RAM disk database slows non-serializable transactions by a fraction of a percent]<br />
** [http://archives.postgresql.org/message-id/20110422220734.GG57793@csail.mit.edu Or speeds it up by a fraction of a percent. Either way it's too close to tease out from the noise easily.]<br />
** [http://archives.postgresql.org/message-id/20110425033308.GJ57793@csail.mit.edu As insurance against any performance hit in extreme high contention loads, it might be worthwhile to do a quick return (without taking any locks) if no serializable transactions are active.]<br />
* [http://archives.postgresql.org/message-id/201104201411.p3KEBOfA009414@wwwmaster.postgresql.org SSI: UPDATE setting a TOASTed value is broken, regardless of transaction isolation level]<br />
** [http://archives.postgresql.org/message-id/4DAEF7E6.7080107@enterprisedb.com proposed patch]<br />
* [http://archives.postgresql.org/message-id/BANLkTim6a_rfZ+UPPATTXap9Ed7-X7BzoA@mail.gmail.com foreign table permissions issues]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-04/msg00141.php CREATE TABLE IF NOT EXISTS doesn't work]<br />
* {{MessageLink|201103111328.p2BDSFd10499@momjian.us|Typed-tables patch broke pg_upgrade}}<br />
** {{MessageLink|20110418235041.GB2769@tornado.leadboat.com|proposed patch (tt2v2-binary-upgrade.patch)}}<br />
<br />
[[Category:PostgreSQL 9.1]]</div>Drkphttps://wiki.postgresql.org/index.php?title=PostgreSQL_9.1_Open_Items&diff=14640PostgreSQL 9.1 Open Items2011-06-17T04:45:03Z<p>Drkp: remaining SSI loose ends are non-blockers; list them separately</p>
<hr />
<div>== Project Planning ==<br />
See the [[PostgreSQL 9.1 Development Plan]].<br />
<br />
== Meta-Issues ==<br />
* [[standard_conforming_strings]] -- readiness of drivers and applications<br />
* Review "Long Term" list of items from [[PostgreSQL 9.0 Open Items]]<br />
<br />
== Open Issues ==<br />
<br />
=== Blockers for Beta3 ===<br />
* {{MessageLink|19848.1307666208@sss.pgh.pa.us|plpython isn't connected up to tarball building}}<br />
<br />
=== Not Blockers for Beta3 ===<br />
* [http://archives.postgresql.org/pgsql-general/2011-01/msg00890.php generate_series boundary issue] [http://archives.postgresql.org/pgsql-hackers/2011-02/msg00423.php and prototype patch]<br />
** This is a pre-existing bug, not a regression, so it is not a beta blocker.<br />
* {{MessageLink|4CEA5A0F.1030602@enterprisedb.com|do latches have memory-ordering problems?}}<br />
** [http://archives.postgresql.org/message-id/11500.1302097336@sss.pgh.pa.us mostly just needs testing]<br />
* {{MessageLink|4DA58686.1050501@enterprisedb.com| throw an error if you try to start from incomplete backup taken with pg_basebackup}}<br />
<br />
* {{MessageLink|4DEF5D4C.9070401@enterprisedb.com| Add regression tests for DDL commands with SSI}}<br />
* {{MessageLink|4DEFB5EA020000250003E3C3@gw.wicourts.gov|add a couple performance notes to the SSI docs}}<br />
== Resolved Issues ==<br />
<br />
=== Issues Resolved Prior to Beta3 ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-06/msg01241.php ill-conceived grammar for NOT VALID constraints]<br />
* {{MessageLink|4DE8B1A8.2020505@enterprisedb.com| add a proof to README-SSI for the validity of the READ ONLY optimizations}}<br />
** {{MessageLink|20110602070419.GA10064@csail.mit.edu| Dan offered a concise proof, but we need a cite for one premise it uses}}<br />
** {{MessageLink|4DE67235020000250003DFBC@gw.wicourts.gov| Kevin offered a self-contained proof, but it's much longer than Dan's}}<br />
* {{MessageLink|4DF09C6F.70608@metrosystems.co.jp|FOREIGN TABLE doc fix}}<br />
<br />
=== Issues Resolved Prior to Beta2 ===<br />
* {{MessageLink|20110428194112.GB12161@tornado.leadboat.com|ALTER TYPE DROP + composite-typed col vs. pg_upgrade}}<br />
* {{MessageLink|4DB7D194.6030906@enterprisedb.com|Memory leak in foreign scans}}<br />
* [http://archives.postgresql.org/message-id/19738.1306338472@sss.pgh.pa.us vacuum sometimes fails to update relpages/reltuples]<br />
* {{MessageLink|BANLkTik8KxxjJ1KW-pO+WWBdTEAT+80ArQ@mail.gmail.com|SSI HOT chain traversal issue}}<br />
* {{MessageLink|201106031542.p53FgDh5073991@wwwmaster.postgresql.org|CTEs attached to data-modifying statements fail with rules (bug 6051)}}<br />
* {{MessageLink|BANLkTiknTSLMbGuTaOYg4O0V7UAMsr_rOA|NOT VALID constraints don't dump properly}}<br />
* {{MessageLink|201105261137.p4QBbTlt077425@wwwmaster.postgresql.org|HS slaves do not handle unlogged tables nicely (bug 6041)}}<br />
* {{MessageLink|4DC00203020000250003D1E8@gw.wicourts.gov|Make DDL commands SSI-aware}}<br />
** {{MessageLink|4DEBF171020000250003E1A3@gw.wicourts.gov|patch to fix}} (Problems found with patch, discussion and modifications in progress.)<br />
* {{MessageLink|7486.1306253575@sss.pgh.pa.us|Domains over arrays no longer match ANYARRAY}}<br />
* {{MessageLink|12187.1307391240@sss.pgh.pa.us|contrib/citext exposes implementation shortcomings in collation code (bug 6053)}}<br />
* {{MessageLink|9773.1307545767@sss.pgh.pa.us|error in information_schema.element_types view definition (bug 5926)}}<br />
<br />
=== Issues Resolved Prior to Beta1 ===<br />
<br />
* [http://archives.postgresql.org/message-id/AANLkTin4o+eSgQsP=0i6EM=Evu3oX=ewHp8Bwod1UaWZ@mail.gmail.com wal_buffers = -1 causes spurious chatter on reload]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=jCTGC+Qxfkum6EXcw8q4tRFd2KHNpe+4HYUeS@mail.gmail.com is there a safeguard to prevent recovery from pausing before consistency is reached?]<br />
* {{MessageLink|09B23E7BF70425478C1330D893A722C602FEC019BD@MailSVR.invera.com|Walreceiver crashes in AIX}}<br />
** {{MessageLink|4C753155.3070708@ca.afilias.info|Steve Singer can't reproduce, suggests a possible way this could be pilot error}}<br />
* {{MessageLink|AANLkTik9HZi8GfSiKuHVY2N7g7xDV+sN46eRxPbOjO7P@mail.gmail.com|bug of the hot standby feedback}}<br />
* {{MessageLink|201011271931.oARJVV427882@momjian.us|GIST rewrite vs. pg_upgrade}}<br />
* [http://archives.postgresql.org/message-id/AANLkTin3PPOwXq2Cpf+tLNNKSv4OmHfDn5qr0aoeczA-@mail.gmail.com replication/README needs to be updated]<br />
* [http://archives.postgresql.org/message-id/AANLkTikJmP+bo1N-mFUWEpJiV6_OKisYw512OGeTJUbm@mail.gmail.com backend wrongly waits for sync rep even though max_wal_senders = 0]<br />
* [http://archives.postgresql.org/message-id/AANLkTik4tuG2EA6oeiov1=DO6UcDoARP45Lk+KGfy7HC@mail.gmail.com reload of the configuration file should not cause the server to end unexpectedly]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-11/msg00002.php Fix ecpg preprocessor regression]<br />
* {{MessageLink|AANLkTimDUiRrrWzZ2ZXWSRfeP5tHm9PGmpp6zfqaFpte@mail.gmail.com|Once sync_standbys_defined becomes true, there's no way for it to ever become false again. That can't be right.}}<br />
* {{MessageLink|AANLkTimfvNzJy490wp95vP1RLumkxEZBTWUjhA_Dd1QS@mail.gmail.com|backend no longer needs to wait for replication when synchronous_standby_names is set to '' and configuration file is reloaded.}}<br />
* [http://archives.postgresql.org/message-id/AANLkTi=z+eg6yMEtJCqYb0OnGaNMoTg9eYiK-xm8v0AQ@mail.gmail.com pause_at_recovery_target needs to be added in recovery.conf.sample]<br />
* {{MessageLink|201101181630.p0IGU45v047971@wwwmaster.postgresql.org|BUG #5842: Memory leak in PL/Python when taking slices of results}}<br />
* {{MessageLink|4D5EB8ED.7010002@enterprisedb.com|hot standby feedback message needs to be explained at protocol-replication.html}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg00720.php consider renaming ident to peer authentication on local connections]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=q5=x_WoOccf5dqHoAWXyDc7qXG0cpbnsNKUCB@mail.gmail.com serious problem by multiple backups]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=qR0zy5cS98n=TDCxda5sj=NODDA=xAZUCdJOE@mail.gmail.com FK constraints "NOT VALID" by default]<br />
* [http://archives.postgresql.org/message-id/AANLkTikvpjkUjwksZK=wOw7CDyMtaVEddmPN9yFv7jX-@mail.gmail.com sync rep very slow when fsync=off on standby]<br />
* [http://archives.postgresql.org/message-id/AANLkTinbzFaJXkzwm2xEegfytK1LPw8odo61wgZkkGp=@mail.gmail.com No longer need to check synchronous_standby_names and max_wal_senders at startup]<br />
* {{MessageLink|201012071131.55211.gabi.julien@broadsign.com|pg_last_xact_replay_timestamp limitations}}<br />
** rhaas says: It's not clear there's an action item here, so moving to resolved. Feel free to move back with more details.<br />
* [http://archives.postgresql.org/message-id/AANLkTikUAOYoStwkwG+DZOzTwT2QVj0H9aDywpzxvhxn@mail.gmail.com either remove write_location completely, or revert the change that broke it]<br />
** [http://archives.postgresql.org/message-id/AANLkTinHrymKd56m5AfawCdujuNM6B2g_--9UiOSSKGx@mail.gmail.com original report of problem with write_location] (but the other issues in that email are now fixed)<br />
** [http://archives.postgresql.org/message-id/AANLkTikphXd4LMXXOAZJg2s_8q0Fu5e3uuec4BG4xD4f@mail.gmail.com possible patch]<br />
* {{MessageLink|29244.1295376372@sss.pgh.pa.us|DO blocks leak memory}}<br />
* {{MessageLink|AANLkTimnwxEv-ZbqBLCSBSvmq-80vzvDb2u0pPchGm2r@mail.gmail.com|replication timeout}}<br />
* {{MessageLink|1293977249.5984.17.camel@vanquo.pezone.net|raise protocol version number}}<br />
** [http://archives.postgresql.org/message-id/1240.1301353669@sss.pgh.pa.us not done, would break things]<br />
* [http://archives.postgresql.org/message-id/AANLkTik6ArKPwnvA8_9XHo9j9+w4A2UEnsheX-mwR=Aj@mail.gmail.com fix attinhcount tracking]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00033.php SSI: SIReadLock lines in pg_locks don't show pid]<br />
** questions have been raised whether pid should be suppressed after connection closes or even maybe after transaction completion<br />
* [http://archives.postgresql.org/message-id/AANLkTi%3DkoHqna9WMm8_ATJN6c1GOLLp_0Tx6VswKhAdi%40mail.gmail.com synchronous_commit and synchronous_replication]<br />
* {{MessageLink|AANLkTimUuzn3DAaO1OLCijjjfcw0MXL-0HAvZKr9xRyV@mail.gmail.com|conversion from integer literals to money type}}<br />
* [http://archives.postgresql.org/message-id/4D892157.7070607@lelarge.info comments on SQL/MED objects]<br />
** rhaas says: [http://archives.postgresql.org/message-id/AANLkTimS1tDEuEeocz9PBjQ-RLSdjJ+VKFc_+F+Jm3Ou@mail.gmail.com proposed patch]<br />
** rhaas says: thom brown points out that [http://archives.postgresql.org/message-id/AANLkTi==NemsL7Vo=YfsPc8DdZActYF7ZwCFVO-0Nu21@mail.gmail.com I forgot USER MAPPINGs]<br />
** [http://archives.postgresql.org/message-id/7102.1302029271@sss.pgh.pa.us fixing user mappings opens unduly large can of worms]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-03/msg00352.php SSI: assertion failure on marking conflict-in due to race condition]<br />
** existing patch to recheck after trading shared LW locks for exclusive should fix<br />
* [http://archives.postgresql.org/message-id/29987.1301930239@sss.pgh.pa.us keywords table needs updating]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01170.php SSI: clumsy error handling results in generic message rather than more specific message with hint]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00157.php SSI: disable optimization when in subtransaction]<br />
* [http://archives.postgresql.org/message-id/4D9DD97B020000250003C53C@gw.wicourts.gov SSI: LOG message about SLRU wrap-around]<br />
* [http://archives.postgresql.org/message-id/BANLkTi=W8OrvqLHS+suU8R2b_rhFaqeEaw@mail.gmail.com sync rep and fast shutdown]<br />
** rhaas says: no easy resolution, i guess we'll leave this alone for now?<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01913.php SSI: three different HTABs contend for shared memory in a free-for-all]<br />
** Heikki proposes "We'll need to teach dynahash not to allocate any more entries after the preallocation. A new HASH_NO_GROW flag to hash_create() seems like a suitable interface."<br />
** [http://archives.postgresql.org/message-id/BANLkTimVuicyZG4j3F427BgfA2iYP8Od_Q@mail.gmail.com Alternatively, we could just use an initial allocation which matches the maximum entries, to ensure that all HTABs can allocate ''at least'' the configured maximum.] (There's an existing patch for that.)<br />
** I committed a patch to add the new flag to hash_create() - Heikki<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00083.php SSI: failure to clean up some SLRU-summarized locks]<br />
** existing patch to properly set commitSeqNo on the offending locks should fix<br />
* {{MessageLink|20110410103636.GC10697@tornado.leadboat.com|ALTER TABLE ADD COLUMN not creating TOAST tables for inheritance children}}<br />
* {{MessageLink|20110410015728.GA10162@tornado.leadboat.com|typed table DDL loose ends}}<br />
* [http://archives.postgresql.org/message-id/29173.1301114203@sss.pgh.pa.us assorted collation issues]<br />
** [http://archives.postgresql.org/message-id/21742.1303137667@sss.pgh.pa.us almost there...]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00334.php pl/python traceback fix]<br />
* [http://archives.postgresql.org/message-id/BANLkTinz2do3EqWO9RH66bY2FBOmL=uSoA@mail.gmail.com gin indexes don't get used unless you vacuum]<br />
* [http://archives.postgresql.org/message-id/4DB015B3020000250003CB51@gw.wicourts.gov SSI: High-contention UPDATE load on RAM disk database slows non-serializable transactions by a fraction of a percent]<br />
** [http://archives.postgresql.org/message-id/20110422220734.GG57793@csail.mit.edu Or speeds it up by a fraction of a percent. Either way it's too close to tease out from the noise easily.]<br />
** [http://archives.postgresql.org/message-id/20110425033308.GJ57793@csail.mit.edu As insurance against any performance hit in extreme high contention loads, it might be worthwhile to do a quick return (without taking any locks) if no serializable transactions are active.]<br />
* [http://archives.postgresql.org/message-id/201104201411.p3KEBOfA009414@wwwmaster.postgresql.org SSI: UPDATE setting a TOASTed value is broken, regardless of transaction isolation level]<br />
** [http://archives.postgresql.org/message-id/4DAEF7E6.7080107@enterprisedb.com proposed patch]<br />
* [http://archives.postgresql.org/message-id/BANLkTim6a_rfZ+UPPATTXap9Ed7-X7BzoA@mail.gmail.com foreign table permissions issues]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-04/msg00141.php CREATE TABLE IF NOT EXISTS doesn't work]<br />
* {{MessageLink|201103111328.p2BDSFd10499@momjian.us|Typed-tables patch broke pg_upgrade}}<br />
** {{MessageLink|20110418235041.GB2769@tornado.leadboat.com|proposed patch (tt2v2-binary-upgrade.patch)}}<br />
<br />
[[Category:PostgreSQL 9.1]]</div>Drkphttps://wiki.postgresql.org/index.php?title=PostgreSQL_9.1_Open_Items&diff=14637PostgreSQL 9.1 Open Items2011-06-17T02:03:26Z<p>Drkp: README-SSI issue resolved</p>
<hr />
<div>== Project Planning ==<br />
See the [[PostgreSQL 9.1 Development Plan]].<br />
<br />
== Meta-Issues ==<br />
* [[standard_conforming_strings]] -- readiness of drivers and applications<br />
* Review "Long Term" list of items from [[PostgreSQL 9.0 Open Items]]<br />
<br />
== Open Issues ==<br />
<br />
=== Blockers for Beta3 ===<br />
* {{MessageLink|4DEFB5EA020000250003E3C3@gw.wicourts.gov|more SSI loose ends}}<br />
* {{MessageLink|19848.1307666208@sss.pgh.pa.us|plpython isn't connected up to tarball building}}<br />
<br />
=== Not Blockers for Beta3 ===<br />
* [http://archives.postgresql.org/pgsql-general/2011-01/msg00890.php generate_series boundary issue] [http://archives.postgresql.org/pgsql-hackers/2011-02/msg00423.php and prototype patch]<br />
** This is a pre-existing bug, not a regression, so it is not a beta blocker.<br />
* {{MessageLink|4CEA5A0F.1030602@enterprisedb.com|do latches have memory-ordering problems?}}<br />
** [http://archives.postgresql.org/message-id/11500.1302097336@sss.pgh.pa.us mostly just needs testing]<br />
* {{MessageLink|4DA58686.1050501@enterprisedb.com| throw an error if you try to start from incomplete backup taken with pg_basebackup}}<br />
<br />
* {{MessageLink|4DD3D6C6.5060006@2ndquadrant.com|core extensions, part of the main docs, shipped by default}}<br />
* {{MessageLink|4DEF5D4C.9070401@enterprisedb.com| Add regression tests for DDL commands with SSI}}<br />
* {{MessageLink|4DF09C6F.70608@metrosystems.co.jp|FOREIGN TABLE doc fix}}<br />
<br />
== Resolved Issues ==<br />
<br />
=== Issues Resolved Prior to Beta3 ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-06/msg01241.php ill-conceived grammar for NOT VALID constraints]<br />
* {{MessageLink|4DE8B1A8.2020505@enterprisedb.com| add a proof to README-SSI for the validity of the READ ONLY optimizations}}<br />
** {{MessageLink|20110602070419.GA10064@csail.mit.edu| Dan offered a concise proof, but we need a cite for one premise it uses}}<br />
** {{MessageLink|4DE67235020000250003DFBC@gw.wicourts.gov| Kevin offered a self-contained proof, but it's much longer than Dan's}}<br />
<br />
=== Issues Resolved Prior to Beta2 ===<br />
* {{MessageLink|20110428194112.GB12161@tornado.leadboat.com|ALTER TYPE DROP + composite-typed col vs. pg_upgrade}}<br />
* {{MessageLink|4DB7D194.6030906@enterprisedb.com|Memory leak in foreign scans}}<br />
* [http://archives.postgresql.org/message-id/19738.1306338472@sss.pgh.pa.us vacuum sometimes fails to update relpages/reltuples]<br />
* {{MessageLink|BANLkTik8KxxjJ1KW-pO+WWBdTEAT+80ArQ@mail.gmail.com|SSI HOT chain traversal issue}}<br />
* {{MessageLink|201106031542.p53FgDh5073991@wwwmaster.postgresql.org|CTEs attached to data-modifying statements fail with rules (bug 6051)}}<br />
* {{MessageLink|BANLkTiknTSLMbGuTaOYg4O0V7UAMsr_rOA|NOT VALID constraints don't dump properly}}<br />
* {{MessageLink|201105261137.p4QBbTlt077425@wwwmaster.postgresql.org|HS slaves do not handle unlogged tables nicely (bug 6041)}}<br />
* {{MessageLink|4DC00203020000250003D1E8@gw.wicourts.gov|Make DDL commands SSI-aware}}<br />
** {{MessageLink|4DEBF171020000250003E1A3@gw.wicourts.gov|patch to fix}} (Problems found with patch, discussion and modifications in progress.)<br />
* {{MessageLink|7486.1306253575@sss.pgh.pa.us|Domains over arrays no longer match ANYARRAY}}<br />
* {{MessageLink|12187.1307391240@sss.pgh.pa.us|contrib/citext exposes implementation shortcomings in collation code (bug 6053)}}<br />
* {{MessageLink|9773.1307545767@sss.pgh.pa.us|error in information_schema.element_types view definition (bug 5926)}}<br />
<br />
=== Issues Resolved Prior to Beta1 ===<br />
<br />
* [http://archives.postgresql.org/message-id/AANLkTin4o+eSgQsP=0i6EM=Evu3oX=ewHp8Bwod1UaWZ@mail.gmail.com wal_buffers = -1 causes spurious chatter on reload]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=jCTGC+Qxfkum6EXcw8q4tRFd2KHNpe+4HYUeS@mail.gmail.com is there a safeguard to prevent recovery from pausing before consistency is reached?]<br />
* {{MessageLink|09B23E7BF70425478C1330D893A722C602FEC019BD@MailSVR.invera.com|Walreceiver crashes in AIX}}<br />
** {{MessageLink|4C753155.3070708@ca.afilias.info|Steve Singer can't reproduce, suggests a possible way this could be pilot error}}<br />
* {{MessageLink|AANLkTik9HZi8GfSiKuHVY2N7g7xDV+sN46eRxPbOjO7P@mail.gmail.com|bug of the hot standby feedback}}<br />
* {{MessageLink|201011271931.oARJVV427882@momjian.us|GIST rewrite vs. pg_upgrade}}<br />
* [http://archives.postgresql.org/message-id/AANLkTin3PPOwXq2Cpf+tLNNKSv4OmHfDn5qr0aoeczA-@mail.gmail.com replication/README needs to be updated]<br />
* [http://archives.postgresql.org/message-id/AANLkTikJmP+bo1N-mFUWEpJiV6_OKisYw512OGeTJUbm@mail.gmail.com backend wrongly waits for sync rep even though max_wal_senders = 0]<br />
* [http://archives.postgresql.org/message-id/AANLkTik4tuG2EA6oeiov1=DO6UcDoARP45Lk+KGfy7HC@mail.gmail.com reload of the configuration file should not cause the server to end unexpectedly]<br />
* [http://archives.postgresql.org/pgsql-bugs/2010-11/msg00002.php Fix ecpg preprocessor regression]<br />
* {{MessageLink|AANLkTimDUiRrrWzZ2ZXWSRfeP5tHm9PGmpp6zfqaFpte@mail.gmail.com|Once sync_standbys_defined becomes true, there's no way for it to ever become false again. That can't be right.}}<br />
* {{MessageLink|AANLkTimfvNzJy490wp95vP1RLumkxEZBTWUjhA_Dd1QS@mail.gmail.com|backend no longer needs to wait for replication when synchronous_standby_names is set to '' and configuration file is reloaded.}}<br />
* [http://archives.postgresql.org/message-id/AANLkTi=z+eg6yMEtJCqYb0OnGaNMoTg9eYiK-xm8v0AQ@mail.gmail.com pause_at_recovery_target needs to be added in recovery.conf.sample]<br />
* {{MessageLink|201101181630.p0IGU45v047971@wwwmaster.postgresql.org|BUG #5842: Memory leak in PL/Python when taking slices of results}}<br />
* {{MessageLink|4D5EB8ED.7010002@enterprisedb.com|hot standby feedback message needs to be explained at protocol-replication.html}}<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg00720.php consider renaming ident to peer authentication on local connections]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=q5=x_WoOccf5dqHoAWXyDc7qXG0cpbnsNKUCB@mail.gmail.com serious problem by multiple backups]<br />
* [http://archives.postgresql.org/message-id/AANLkTi=qR0zy5cS98n=TDCxda5sj=NODDA=xAZUCdJOE@mail.gmail.com FK constraints "NOT VALID" by default]<br />
* [http://archives.postgresql.org/message-id/AANLkTikvpjkUjwksZK=wOw7CDyMtaVEddmPN9yFv7jX-@mail.gmail.com sync rep very slow when fsync=off on standby]<br />
* [http://archives.postgresql.org/message-id/AANLkTinbzFaJXkzwm2xEegfytK1LPw8odo61wgZkkGp=@mail.gmail.com No longer need to check synchronous_standby_names and max_wal_senders at startup]<br />
* {{MessageLink|201012071131.55211.gabi.julien@broadsign.com|pg_last_xact_replay_timestamp limitations}}<br />
** rhaas says: It's not clear there's an action item here, so moving to resolved. Feel free to move back with more details.<br />
* [http://archives.postgresql.org/message-id/AANLkTikUAOYoStwkwG+DZOzTwT2QVj0H9aDywpzxvhxn@mail.gmail.com either remove write_location completely, or revert the change that broke it]<br />
** [http://archives.postgresql.org/message-id/AANLkTinHrymKd56m5AfawCdujuNM6B2g_--9UiOSSKGx@mail.gmail.com original report of problem with write_location] (but the other issues in that email are now fixed)<br />
** [http://archives.postgresql.org/message-id/AANLkTikphXd4LMXXOAZJg2s_8q0Fu5e3uuec4BG4xD4f@mail.gmail.com possible patch]<br />
* {{MessageLink|29244.1295376372@sss.pgh.pa.us|DO blocks leak memory}}<br />
* {{MessageLink|AANLkTimnwxEv-ZbqBLCSBSvmq-80vzvDb2u0pPchGm2r@mail.gmail.com|replication timeout}}<br />
* {{MessageLink|1293977249.5984.17.camel@vanquo.pezone.net|raise protocol version number}}<br />
** [http://archives.postgresql.org/message-id/1240.1301353669@sss.pgh.pa.us not done, would break things]<br />
* [http://archives.postgresql.org/message-id/AANLkTik6ArKPwnvA8_9XHo9j9+w4A2UEnsheX-mwR=Aj@mail.gmail.com fix attinhcount tracking]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00033.php SSI: SIReadLock lines in pg_locks don't show pid]<br />
** questions have been raised whether pid should be suppressed after connection closes or even maybe after transaction completion<br />
* [http://archives.postgresql.org/message-id/AANLkTi%3DkoHqna9WMm8_ATJN6c1GOLLp_0Tx6VswKhAdi%40mail.gmail.com synchronous_commit and synchronous_replication]<br />
* {{MessageLink|AANLkTimUuzn3DAaO1OLCijjjfcw0MXL-0HAvZKr9xRyV@mail.gmail.com|conversion from integer literals to money type}}<br />
* [http://archives.postgresql.org/message-id/4D892157.7070607@lelarge.info comments on SQL/MED objects]<br />
** rhaas says: [http://archives.postgresql.org/message-id/AANLkTimS1tDEuEeocz9PBjQ-RLSdjJ+VKFc_+F+Jm3Ou@mail.gmail.com proposed patch]<br />
** rhaas says: thom brown points out that [http://archives.postgresql.org/message-id/AANLkTi==NemsL7Vo=YfsPc8DdZActYF7ZwCFVO-0Nu21@mail.gmail.com I forgot USER MAPPINGs]<br />
** [http://archives.postgresql.org/message-id/7102.1302029271@sss.pgh.pa.us fixing user mappings opens unduly large can of worms]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-03/msg00352.php SSI: assertion failure on marking conflict-in due to race condition]<br />
** existing patch to recheck after trading shared LW locks for exclusive should fix<br />
* [http://archives.postgresql.org/message-id/29987.1301930239@sss.pgh.pa.us keywords table needs updating]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01170.php SSI: clumsy error handling results in generic message rather than more specific message with hint]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00157.php SSI: disable optimization when in subtransaction]<br />
* [http://archives.postgresql.org/message-id/4D9DD97B020000250003C53C@gw.wicourts.gov SSI: LOG message about SLRU wrap-around]<br />
* [http://archives.postgresql.org/message-id/BANLkTi=W8OrvqLHS+suU8R2b_rhFaqeEaw@mail.gmail.com sync rep and fast shutdown]<br />
** rhaas says: no easy resolution, i guess we'll leave this alone for now?<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-03/msg01913.php SSI: three different HTABs contend for shared memory in a free-for-all]<br />
** Heikki proposes "We'll need to teach dynahash not to allocate any more entries after the preallocation. A new HASH_NO_GROW flag to hash_create() seems like a suitable interface."<br />
** [http://archives.postgresql.org/message-id/BANLkTimVuicyZG4j3F427BgfA2iYP8Od_Q@mail.gmail.com Alternatively, we could just use an initial allocation which matches the maximum entries, to ensure that all HTABs can allocate ''at least'' the configured maximum.] (There's an existing patch for that.)<br />
** I committed a patch to add the new flag to hash_create() - Heikki<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00083.php SSI: failure to clean up some SLRU-summarized locks]<br />
** existing patch to properly set commitSeqNo on the offending locks should fix<br />
* {{MessageLink|20110410103636.GC10697@tornado.leadboat.com|ALTER TABLE ADD COLUMN not creating TOAST tables for inheritance children}}<br />
* {{MessageLink|20110410015728.GA10162@tornado.leadboat.com|typed table DDL loose ends}}<br />
* [http://archives.postgresql.org/message-id/29173.1301114203@sss.pgh.pa.us assorted collation issues]<br />
** [http://archives.postgresql.org/message-id/21742.1303137667@sss.pgh.pa.us almost there...]<br />
* [http://archives.postgresql.org/pgsql-hackers/2011-04/msg00334.php pl/python traceback fix]<br />
* [http://archives.postgresql.org/message-id/BANLkTinz2do3EqWO9RH66bY2FBOmL=uSoA@mail.gmail.com gin indexes don't get used unless you vacuum]<br />
* [http://archives.postgresql.org/message-id/4DB015B3020000250003CB51@gw.wicourts.gov SSI: High-contention UPDATE load on RAM disk database slows non-serializable transactions by a fraction of a percent]<br />
** [http://archives.postgresql.org/message-id/20110422220734.GG57793@csail.mit.edu Or speeds it up by a fraction of a percent. Either way it's too close to tease out from the noise easily.]<br />
** [http://archives.postgresql.org/message-id/20110425033308.GJ57793@csail.mit.edu As insurance against any performance hit in extreme high contention loads, it might be worthwhile to do a quick return (without taking any locks) if no serializable transactions are active.]<br />
* [http://archives.postgresql.org/message-id/201104201411.p3KEBOfA009414@wwwmaster.postgresql.org SSI: UPDATE setting a TOASTed value is broken, regardless of transaction isolation level]<br />
** [http://archives.postgresql.org/message-id/4DAEF7E6.7080107@enterprisedb.com proposed patch]<br />
* [http://archives.postgresql.org/message-id/BANLkTim6a_rfZ+UPPATTXap9Ed7-X7BzoA@mail.gmail.com foreign table permissions issues]<br />
* [http://archives.postgresql.org/pgsql-bugs/2011-04/msg00141.php CREATE TABLE IF NOT EXISTS doesn't work]<br />
* {{MessageLink|201103111328.p2BDSFd10499@momjian.us|Typed-tables patch broke pg_upgrade}}<br />
** {{MessageLink|20110418235041.GB2769@tornado.leadboat.com|proposed patch (tt2v2-binary-upgrade.patch)}}<br />
<br />
[[Category:PostgreSQL 9.1]]</div>Drkphttps://wiki.postgresql.org/index.php?title=Serializable&diff=14475Serializable2011-05-22T00:12:13Z<p>Drkp: add link to PGCon slides</p>
<hr />
<div>Information about the SSI implementation for the SERIALIZABLE transaction isolation level in PostgreSQL, new in release 9.1.<br />
<br />
== Overview ==<br />
<br />
With true serializable transactions, if you can show that your transaction will do the right thing if there are no concurrent transactions, it will do the right thing in any mix of serializable transactions or be rolled back with a serialization failure.<br />
<br />
This document is oriented toward the techniques used to implement the feature in PostgreSQL. For information oriented toward application programmers and database administrators, see the [[SSI]] Wiki page.<br />
<br />
=== Serializable and Snapshot Transaction Isolation Levels ===<br />
<br />
Serializable transaction isolation is attractive for shops with active development by many programmers against a complex schema because it guarantees data integrity with very little staff time -- if a transaction can be shown to always do the right thing when it is run alone (before or after any other transaction), it will always do the right thing in any mix of concurrent serializable transactions. Where conflicts with other transactions would result in an inconsistent state within the database, or an inconsistent view of the data, a serializable transaction will block or roll back to prevent the anomaly. The SQL standard provides a specific SQLSTATE for errors generated when a transaction rolls back for this reason, so that transactions can be retried automatically.<br />
<br />
PostgreSQL does not currently support a full serializable isolation level. A request for serializable transaction isolation actually provides snapshot isolation. This has well known anomalies which can allow data corruption or inconsistent views of the data during concurrent transactions; although these anomalies only occur when certain patterns of read-write dependencies exist within a set of concurrent transactions. Where these patterns exist, the anomalies can be prevented by introducing conflicts through explicitly programmed locks or otherwise unnecessary writes to the database. Snapshot isolation is popular because performance is better than serializable isolation and the integrity guarantees which it does provide allow anomalies to be avoided or managed with reasonable effort in many environments.<br />
<br />
<br />
[[Image:Serialization-Anomalies-in-Snapshot-Isolation.png|600px|center]]<br />
<nowiki><br />
<br />
<br />
</nowiki><br />
<br />
=== Serializable Isolation Implementation Strategies ===<br />
<br />
Techniques for implementing full serializable isolation have been published and in use in many database products for decades. The primary technique which has been used is Strict 2 Phase Locking (S2PL), which operates by blocking writes against data which has been read by concurrent transactions and blocking any access (read or write) against data which has been written by concurrent transactions. A cycle in a graph of blocking indicates a deadlock, requiring a rollback. Blocking and deadlocks under S2PL in high contention workloads can be debilitating, crippling throughput and response time.<br />
<br />
A new technique for implementing full serializable isolation in an MVCC database appears in the literature beginning in 2008<nowiki>[1][2]</nowiki>. This technique, known as Serializable Snapshot Isolation (SSI) has many of the advantages of snapshot isolation. In particular, reads don't block anything and writes don't block reads. Essentially, it runs snapshot isolation but monitors the read-write conflicts between transactions to identify dangerous structures in the transaction graph which indicate that a set of concurrent transactions might produce an anomaly, and rolls back transactions to ensure that no anomalies occur. It will produce some false positives (where a transaction is rolled back even though there would not have been an anomaly), but will never let an anomaly occur. In the two known prototype implementations, performance for many workloads (even with the need to restart transactions which are rolled back) is very close to snapshot isolation and generally far better than an S2PL implementation.<br />
<br />
=== Apparent Serial Order of Execution ===<br />
<br />
One way to understand when snapshot anomalies can occur, and to visualize the difference between the serializable implementations described above, is to consider that among transactions executing at the serializable transaction isolation level, the results are required to be consistent with ''some'' serial (one-at-a-time) execution of the transactions[4]. How is that order determined in each?<br />
<br />
S2PL locks rows used by the transaction in a way which blocks conflicting access, so that at the moment of a successful commit it is certain that no conflicting access has occurred. Some transactions may have blocked, essentially being partially serialized with the committing transaction, to allow this. Some transactions may have been rolled back, due to cycles in the blocking. But with S2PL, transactions can always be viewed as having occurred serially, in the order of successful commit.<br />
<br />
With snapshot isolation, reads never block writes, nor vice versa, so there is much less actual serialization. The order in which transactions appear to have executed is determined by something more subtle than in S2PL: read/write dependencies. If a transaction attempts to read data which is ''not'' visible to it because the transaction which wrote it (or will later write it) is concurrent (one of them was running when the other acquired its snapshot), then the reading transaction appears to have executed first, regardless of the actual sequence of transaction starts or commits (since it sees a database state prior to that in which the other transaction leaves it). If one transaction has both rw-dependencies ''in'' (meaning that a concurrent transaction attempts to read data it writes) and ''out'' (meaning it attempts to read data a concurrent transaction writes), and a couple other conditions are met, there can appear to be a cycle in execution order of the transactions. This is when the anomalies occur.<br />
<br />
SSI works by watching for the conditions mentioned above, and rolling back a transaction when needed to prevent any anomaly. The apparent order of execution will always be consistent with any actual serialization; but among ''concurrent'' transactions it will appear that the transaction on the ''read'' side of a rw-dependency executed before the transaction on the ''write'' side.<br />
<br />
=== PostgreSQL Implementation ===<br />
<br />
The implementation of serializable transactions for PostgreSQL is accomplished through Serializable Snapshot Isolation (SSI), based on the work of Cahill, et al[1][2]. Fundamentally, this allows snapshot isolation to run as it has, with monitoring for conditions which could create a serialization anomaly.<br />
<br />
* Since this technique is based on Snapshot Isolation (SI), those areas in PostgreSQL which don't use SI can't be brought under SSI. This includes system tables, temporary tables, sequences, hint bit rewrites, etc. SSI can not eliminate existing anomalies in these areas.<br />
<br />
* Any transaction which is run at a transaction isolation level other than SERIALIZABLE will not be affected by SSI. If you want to enforce business rules through SSI, all transactions should be run at the SERIALIZABLE transaction isolation level, and that should probably be set as the default.<br />
<br />
* If all transactions are run at the SERIALIZABLE transaction isolation level, business rules can be enforced in triggers or application code without ever having a need to acquire an explicit lock or to use SELECT FOR SHARE or SELECT FOR UPDATE.<br />
<br />
* Those who want to continue to use snapshot isolation without the additional protections of SSI (and the associated costs of enforcing those protections), can use the REPEATABLE READ transaction isolation level. This level will retain its legacy behavior, which is identical to the old SERIALIZABLE implementation and fully consistent with the standard's requirements for the REPEATABLE READ transaction isolation level.<br />
<br />
* Performance under this SSI implementation will be significantly improved if transactions which don't modify permanent tables are declared to be READ ONLY before they begin reading data.<br />
<br />
* Performance under SSI will tend to degrade more rapidly with a large number of active database transactions than under less strict isolation levels. Limiting the number of active transactions through use of a connection pool or similar techniques may be necessary to maintain good performance.<br />
<br />
* Any transaction which must be rolled back to prevent serialization anomalies will fail with SQLSTATE 40001, which has a standard meaning of "serialization failure".<br />
<br />
* This SSI implementation makes an effort to choose the transaction to be cancelled such that an immediate retry of the transaction can not fail due to conflicts with exactly the same transactions. Pursuant to this goal, no transaction is cancelled until one of the other transactions in the set of conflicts which could generate an anomaly has successfully committed. This is conceptually similar to how write conflicts are handled.<br />
<br />
== Current Status ==<br />
<br />
'''Accepted as a feature for PostgreSQL 9.1!'''<br />
<br />
Many thanks to Joe, Heikki, Jeff, and Anssi for posing questions and making suggestions which have led to improvements in the patch! Thanks to Markus for providing dtester at a critical juncture, which allowed progress to continue, and Heikki for developing the src/test/isolation code to move the dcheck tests into the main PostgreSQL testing framework. Also, thanks to the many who have participated in discussions along the way.<br />
<br />
There are some features which should be considered for 9.2 once 9.1 is settled down; most notably integration with hot standby and fine-grained support for index AMs other than btree. Most other proposed work is related to possible performance improvements, which should each be carefully benchmarked before being accepted. At the top of that list is better optimization of ''de facto'' read only transactions -- those which aren't flagged as read only, but which don't actually do any writes to permanent database tables.<br />
<br />
== Development Path ==<br />
<br />
In general, the approach taken was to try for the fastest possible implementation of a serializable isolation level which allowed no anomalies, even though it had many false positives and very poor performance, and then optimize until the rollback rate and overall performance were within a range which allows practical application. No existing isolation level was removed, since not everyone will want to pay the performance price for true serializable behavior. An important goal was that for those not using serializable transaction isolation, the patch doesn't cause performance regression.<br />
<br />
=== Credits ===<br />
<br />
'''Feature Authors''': [[User:Kgrittn|<span title="different title">Kevin Grittner</span>]] and [http://drkp.net/ Dan R. K. Ports].<br />
<br />
'''Testing Support Authors''': Markus Wanner (dtester used during most of development) and Heikki Linnakangas (testing support consistent with other PostgreSQL regression testing, so that we had a testing suite suitable for commit).<br />
<br />
'''Reviewers''': Joe Conway (warning elimination, bug chasing, and style comments), Jeff Davis (general review and found problems with GiST support and lack of 2PC support), Anssi Kääriäinen (found problems with conditional indexes and performance issue with sequential scans during testing with production data), YAMAMOTO Takashi (found numerous bugs during long and heavy testing), and Heikki Linnakangas (general review and many useful observations and suggestions, plus general improvements during commit process).<br />
<br />
'''Committers''': Joe Conway (initial comment and name changes), Heikki Linnakangas (the bulk of the patch and most follow-up fixes), and Robert Haas (some follow-up fixes).<br />
<br />
'''Thanks''' to all those who participated in the on-list discussions and offered advice and support off-list. There were so many who contributed in this way it would be practically impossible to generate an accurate list, but Robert Haas stands out for offering great advice on an overall development strategy.<br />
<br />
'''Special thanks''' to Emmanuel Cecchet for pointing out the ACM SIGMOD paper in which this technique was originally published[1], and to all those at the University of Sidney who contributed to the development of this innovative technique. This is what turned the discussion from wrangling over how best to document existing behavior toward changing it.<br />
<br />
=== Source Code Management ===<br />
<br />
A "serializable" git branch has been set up at this location:<br />
<br />
git://git.postgresql.org/git/users/kgrittn/postgres.git<br />
<br />
http://git.postgresql.org/git/users/kgrittn/postgres.git<br />
<br />
ssh://git@git.postgresql.org/users/kgrittn/postgres.git<br />
<br />
http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=shortlog;h=refs/heads/serializable<br />
<br />
=== Predicate Locking ===<br />
<br />
Both S2PL and SSI require some form of predicate locking to handle situations where reads conflict with later inserts or with later updates which move data into the selected range. PostgreSQL didn't have predicate locking, so it needed to be added. Practical implementations of predicate locking generally involve acquiring locks against data as it is accessed, using multiple granularities (tuple, page, table, etc.) with escalation as needed to keep the lock count to a number which can be tracked within RAM structures. Coarse granularities can cause some false positive indications of conflict. The number of false positives can be influenced by plan choice.<br />
<br />
==== Implementation overview ====<br />
<br />
New RAM structures, inspired by those used to track traditional locks in PostgreSQL, but tailored to the needs of SIREAD predicate locking, will be used. These will refer to physical objects actually accessed in the course of executing the query, to model the predicates through inference. Anyone interested in this subject should review the Hellerstein, Stonebraker and Hamilton paper<nowiki>[3]</nowiki>, along with the locking papers referenced from that and the Cahill papers<nowiki>[1][2]</nowiki>.<br />
<br />
Because the SIREAD locks don't block, traditional locking techniques must be modified. Intent locking (locking higher level objects before locking lower level objects) doesn't work with non-blocking "locks" (which are, in some respects, more like flags than locks).<br />
<br />
A configurable amount of shared memory is reserved at postmaster start-up to track predicate locks. This size cannot be changed without a restart.<br />
* To prevent resource exhaustion, multiple fine-grained locks may be promoted to a single coarser-grained lock as needed.<br />
* An attempt to acquire an SIREAD lock on a tuple when the same transaction already holds an SIREAD lock on the page or the relation will be ignored. Likewise, an attempt to lock a page when the relation is locked will be ignored, and the acquisition of a coarser lock will result in the automatic release of all finer-grained locks it covers.<br />
<br />
==== Heap locking ====<br />
<br />
Predicate locks will be acquired for the heap based on the following:<br />
* For a table scan, the entire relation will be locked.<br />
* Each tuple read which is visible to the reading transaction will be locked, whether or not it meets selection criteria; except that there is no need to acquire an SIREAD lock on a tuple when the transaction already holds a write lock on any tuple representing the row, since a rw-dependency would also create a ww-dependency which has more aggressive enforcement and will thus prevent any anomaly.<br />
<br />
==== Default index locking ====<br />
<br />
There is a new ampredlocks flag in pg_am which should be set to false for any index which doesn't handle the predicate locking internally; indexes flagged this way will be predicate locked at the index relation level. Such a lock will conflict with any insert into the index, but will not conflict, for example, with deletes, HOT updates, or inserts which don't match the WHERE clause on an index (if present). This will allow correct behavior at the serializable transaction isolation level for new index types with minimal initial effort; but adding the predicate locking calls and changing the flag will improve performance in high contention workloads involving serializable transactions.<br />
<br />
==== Index AM implementations ====<br />
<br />
Since predicate locks only exist to detect writes which conflict with earlier reads, and heap tuple locks are acquired to cover all heap tuples actually read, including those read through indexes, the index tuples which were actually scanned are not of interest in themselves; we only care about their "new neighbors" -- later inserts into the index which ''would'' have been included in the scan had they existed at the time. Conceptually, we want to lock the ''gaps'' between and surrounding index entries within the scanned range.<br />
<br />
''Correctness'' requires that any insert into an index generate a rw-conflict with a concurrent serializable transaction if, after that insert, re-execution of any index scan of the other transaction would access the heap for a row not accessed during the previous execution. Note that a non-HOT update which expires an old index entry covered by the scan and adds a new entry for the modified row's new tuple ''need not'' generate a conflict, although an update which "moves" a row into the scan ''must'' generate a conflict. While correctness allows false positives, they should be minimized for performance reasons.<br />
<br />
Several optimizations are possible:<br />
<br />
* An index scan which is just finding the right position for an index insertion or deletion need not acquire a predicate lock.<br />
<br />
* An index scan which is comparing for equality on the entire key for a unique index need not acquire a predicate lock as long as a key is found corresponding to a visible tuple which has not been modified by another transaction -- there are no "between or around" gaps to cover.<br />
<br />
* As long as built-in foreign key enforcement continues to use its current "special tricks" to deal with MVCC issues, predicate locks should not be needed for scans done by enforcement code.<br />
<br />
* If a search determines that no rows can be found regardless of index contents because the search conditions are contradictory (e.g., x = 1 AND x = 2), then no predicate lock is needed.<br />
<br />
Other index AM implementation considerations:<br />
<br />
* If a btree search discovers that no root page has yet been created, a predicate lock on the index relation is required; otherwise btree searches must get to the leaf level to determine which tuples match, so predicate locks go there.<br />
<br />
* GiST searches can determine that there are no matches at any level of the index, so there must be a predicate lock at each index level during a GiST search. An index insert at the leaf level can then be trusted to ripple up to all levels and locations where conflicting predicate locks may exist.<br />
<br />
* The effects of page splits, overflows, consolidations, and removals must be carefully reviewed to ensure that predicate locks aren't "lost" during those operations, or kept with pages which could get re-used for different parts of the index.<br />
<br />
=== Testing ===<br />
<br />
For this development effort to succeed, it was absolutely necessary to have some client application which allowed execution of test scripts with specific interleaving of statements run against multiple backends. The dtester module from Markus Wanner was used for this during most of development. It requires python and several python packages (including twisted). Due to package dependencies and licensing issues the dtester module was not appropriate for commit to the PostgreSQL code base.<br />
<br />
Heikki Linnakangas developed a testing framework based on existing regression test code which has been committed to src/test/isolation. Besides being compatible with other PostgreSQL testing, it runs faster than dtester. It doesn't provide a nice display of the results by statement ordering permutation, but that can be added if needed by filtering the current output.<br />
<br />
Like many other proposed features and optimizations, this area could benefit from a "performance test farm" so that serializable performance can be better compared to other isolation levels, and so the performance impact of future enhancements can be determined.<br />
<br />
=== Documentation ===<br />
<br />
A README-SSI file was created, largely drawn from this Wiki page.<br />
<br />
Someone with update rights to Wikipedia should probably update references there which will be outdated with this feature:<br />
<br />
* http://en.wikipedia.org/wiki/Snapshot_isolation<br />
* http://en.wikipedia.org/wiki/Isolation_%28database_systems%29<br />
<br />
== Innovations ==<br />
<br />
The PostgreSQL implementation of Serializable Snapshot Isolation differs from what is described in the cited papers for several reasons:<br />
# PostgreSQL didn't have any existing predicate locking. It had to be added from scratch.<br />
# The existing in-memory lock structures were not suitable for tracking SIREAD locks.<br />
#* The database products used for the prototype implementations for the papers used update-in-place with a rollback log for their MVCC implementations, while PostgreSQL leaves the old version of a row in place and adds a new tuple to represent the row at a new location.<br />
#* In PostgreSQL, tuple level locks are not held in RAM for any length of time; lock information is written to the tuples involved in the transactions.<br />
#* In PostgreSQL, existing lock structures have pointers to memory which is related to a connection. SIREAD locks need to persist past the end of the originating transaction and even the connection which ran it.<br />
#* PostgreSQL needs to be able to tolerate a large number of transactions executing while one long-running transaction stays open -- the in-RAM techniques discussed in the papers wouldn't support that.<br />
# Unlike the database products used for the prototypes described in the papers, PostgreSQL didn't already have a true serializable isolation level distinct from snapshot isolation.<br />
# PostgreSQL supports subtransactions -- an issue not mentioned in the papers.<br />
# PostgreSQL doesn't assign a transaction number to a database transaction until and unless necessary.<br />
# PostgreSQL has pluggable data types with user-definable operators, as well as pluggable index types, not all of which are based around data types which support ordering.<br />
# Some possible optimizations became apparent during development and testing.<br />
<br />
Differences from the implementation described in the papers are listed below.<br />
<br />
* New structures needed to be created in shared memory to track the proper information for serializable transactions and their SIREAD locks.<br />
<br />
* Because PostgreSQL does not have the same concept of an "oldest transaction ID" for all serializable transactions as assumed in the Cahill these, we track the oldest snapshot xmin among serializable transactions, and a count of how many active transactions use that xmin. When the count hits zero we find the new oldest xmin and run a clean-up based on that.<br />
<br />
* Predicate locking in PostgreSQL will start at the tuple level when possible, with automatic conversion of multiple fine-grained locks to coarser granularity as need to avoid resource exhaustion. The amount of memory used for these structures will be configurable, to balance RAM usage against SIREAD lock granularity.<br />
<br />
* A process-local copy of locks held by a process and the coarser covering locks with counts, are kept to support granularity promotion decisions with low CPU and locking overhead.<br />
<br />
* Conflicts are identified by looking for predicate locks when tuples are written and looking at the MVCC information when tuples are read. There is no matching between two RAM-based locks.<br />
<br />
* Because write locks are stored in the heap tuples rather than a RAM-based lock table, the optimization described in the Cahill thesis which eliminates an SIREAD lock where there is a write lock is implemented by the following:<br />
*# When checking a heap write for conflicts against existing predicate locks, a tuple lock on the tuple being written is removed.<br />
*# When acquiring a predicate lock on a heap tuple, we return quickly without doing anything if it is a tuple written by the reading transaction.<br />
<br />
* Rather than using conflictIn and conflictOut pointers which use NULL to indicate no conflict and a self-reference to indicate multiple conflicts or conflicts with committed transactions, we use a list of rw-conflicts. With the more complete information, false positives are reduced and we have sufficient data for more aggressive clean-up and other optimizations.<br />
** We can avoid ever rolling back a transaction until and unless there is a pivot where a transaction on the conflict *out* side of the pivot committed before either of the other transactions.<br />
** We can avoid ever rolling back a transaction when the transaction on the conflict *in* side of the pivot is explicitly or implicitly READ ONLY unless the transaction on the conflict *out* side of the pivot committed before the READ ONLY transaction acquired its snapshot. (An implicit READ ONLY transaction is one which committed without writing, even though it was not explicitly declared to be READ ONLY.)<br />
** We can more aggressively clean up conflicts, predicate locks, and SSI transaction information.<br />
<br />
* Allow a READ ONLY transaction to "opt out" of SSI if there are no READ WRITE transactions which could cause the READ ONLY transaction to ever become part of a "dangerous structure" of overlapping transaction dependencies.<br />
<br />
* Allow the user to request that a READ ONLY transaction ''wait'' until the conditions are right for it to start in the "opt out" state described above. We add a DEFERRABLE state to transactions, which is specified and maintained in a way similar to to READ ONLY. It is ignored for transactions which are not SERIALIZABLE ''and'' READ ONLY.<br />
<br />
* When a transaction must be rolled back, we pick among the active transactions such that an immediate retry will not fail again on conflicts with the same transactions.<br />
<br />
* We use the PostgreSQL SLRU system to hold summarized information about older committed transactions to put an upper bound on RAM used. Beyond that limit, information spills to disk. Performance can degrade in a pessimal situation, but it should be tolerable, and transactions won't need to be cancelled or blocked from starting.<br />
<br />
== R&D Issues ==<br />
<br />
This is intended to be the place to record specific issues which need more detailed review or analysis.<br />
<br />
* '''WAL file replay'''. While serializable implementations using S2PL can guarantee that the write-ahead log contains commits in a sequence consistent with some serial execution of serializable transactions, SSI cannot make that guarantee. While the WAL replay is no less consistent than under snapshot isolation, it is possible that under PITR recovery or hot standby a database could reach a readable state where some transactions appear before other transactions which would have had to precede them to maintain serializable consistency. In essence, if we do nothing, WAL replay will be at snapshot isolation even for serializable transactions. Is this OK? If not, how do we address it?<br />
<br />
* '''External replication'''. Look at how this impacts external replication solutions, like Postgres-R, Slony, pgpool, HS/SR, etc. This is related to the "WAL file replay" issue.<br />
<br />
* '''UNIQUE btree search for equality on all columns'''. Since a search of a UNIQUE index using equality tests on all columns will lock the heap tuple if an entry is found, it appears that there is no need to get a predicate lock on the index in that case. A predicate lock ''is'' still needed for such a search if a matching index entry which points to a visible tuple is ''not'' found.<br />
<br />
* '''Planner index probes'''. To avoid problems with data skew at the ends of an index which have historically caused bad plans, the planner now probes the end of an index to see what the maximum or minimum value is when a query appears to be requesting a range of data outside what statistics shows is present. These planner checks don't require predicate locking, but there's currently no easy way to avoid it. What can we do to avoid predicate locking for such planner activity?<br />
<br />
* '''Minimize touching of shared memory'''. Should lists in shared memory push entries which have just been returned to the ''front'' of the available list, so they will be popped back off soon and some memory might never be touched, or should we keep adding returned items to the ''end'' of the available list?<br />
<br />
== Discussion ==<br />
<br />
[http://archives.postgresql.org/message-id/4A0019EE.EE98.0025.0@wicourts.gov "Serializable Isolation without blocking" - discusses paper in ACM SIGMOD on SSI]<br />
<br />
[http://archives.postgresql.org/message-id/4B2788EA020000250002D51C@gw.wicourts.gov "Update on true serializable techniques in MVCC" - discusses Cahill Doctoral Thesis on SSI]<br />
<br />
[http://archives.postgresql.org/message-id/4B389C79020000250002D987@gw.wicourts.gov "Serializable implementation" - discusses Wisconsin Court System plans]<br />
<br />
[http://archives.postgresql.org/message-id/4B3B88F4020000250002DAE1@gw.wicourts.gov "A third lock method" - discusses development path: rough prototype to refine toward production]<br />
<br />
[http://archives.postgresql.org/message-id/1262718843.5908.183.camel@monkey-cat.sm.truviso.com "true serializability and predicate locking" - discusses GiST and GIN issues]<br />
<br />
[http://archives.postgresql.org/message-id/4BF43DF702000025000318BE@gw.wicourts.gov WIP patch for serializable transactions with predicate locking]<br />
<br />
[http://archives.postgresql.org/pgsql-hackers/2010-09/msg00022.php "serializable" in comments and names]<br />
<br />
[http://archives.postgresql.org/message-id/4C8F5DB202000025000356A0@gw.wicourts.gov Serializable Snapshot Isolation]<br />
<br />
[http://archives.postgresql.org/message-id/4CFB574702000025000382FD@gw.wicourts.gov serializable read only deferrable]<br />
<br />
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg02119.php SSI memory mitigation & false positive degradation]<br />
<br />
== Presentations ==<br />
<br />
From PostgreSQL Conference U.S. East 2010:<br />
[[media:Transaction-Isolation-in-PostgreSQL.odp|Current Transaction Isolation in PostgreSQL and future directions]]<br />
<br />
From PGCon 2011: <br />
[http://drkp.net/drkp/papers/ssi-pgcon11-slides.pdf Serializable Snapshot Isolation: Making ISOLATION LEVEL SERIALIZABLE Provide Serializable Isolation]<br />
<br />
== Publications ==<br />
<br />
<nowiki>[1]</nowiki> [http://doi.acm.org/10.1145/1376616.1376690 Michael J. Cahill, Uwe Röhm, and Alan D. Fekete. 2008. Serializable isolation for snapshot databases. In SIGMOD ’08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 729–738, New York, NY, USA. ACM.] (This paper is listed mostly for context; the subsequent paper covers the same ground and more.)<br />
<br />
<nowiki>[2]</nowiki> [http://hdl.handle.net/2123/5353 Michael James Cahill. 2009. Serializable Isolation for Snapshot Databases. Sydney Digital Theses. University of Sydney, School of Information Technologies.]<br />
<br />
<nowiki>[3]</nowiki> [http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf Joseph M. Hellerstein, Michael Stonebraker and James Hamilton. 2007. Architecture of a Database System. Foundations and Trends(R) in Databases Vol. 1, No. 2 (2007) 141–259.]<br />
Of particular interest:<br />
* 6.1 A Note on ACID<br />
* 6.2 A Brief Review of Serializability<br />
* 6.3 Locking and Latching<br />
* 6.3.1 Transaction Isolation Levels<br />
* 6.5.3 Next-Key Locking: Physical Surrogates for Logical Properties<br />
<br />
<nowiki>[4]</nowiki> [http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt SQL-92]<br />
Search for ''serial execution'' to find the relevant section.</div>Drkp