Re: [Xen-devel] [HACKERS] [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation [and 2 more messages] [and 1 more messages]
On Thu, Dec 15, 2016 at 6:09 AM, Ian Jackson <ian.jackson@xxxxxxxxxxxxx> wrote:

> However, in that example, as you seem to allude to, there is still a
> complete serialisation of all the transactions, even the failed T3:
> T1,T2,T3.  The database has detected the problem before returning data
> in T3 that would contradict the serialisation order.

In that case, yes.  Thinking about it, I suspect that a read-only
transaction will never actually return results that need to be ignored.
Let me take a quick run at an argument to support that.

I'm working from the previous conclusion about read-only transactions:
the read-only transaction will only be the one to experience a
serialization failure if the other two transactions involved in the
"dangerous structure" have already committed without developing a
serialization failure, and the failure will be detected during a read of
the data by the read-only transaction -- never during commit.  Catching
the initial exception and trying to suppress it can cause it to resurface
on the commit, but the problem will already have been detected, and a
serialization error thrown, on the read; so even if the error is re-thrown
on the commit, the initial exception will have prevented data which
contradicts already-committed state from being returned.

I also realized some other properties of read-only transactions that
might interest you (and that I should probably document).  The only way
for a read-only transaction to be the one experiencing a serialization
failure is if Tout commits before the read-only transaction (which is
always Tin) acquires its snapshot, Tpivot is still running when Tin
acquires its snapshot, Tpivot commits before a serialization failure
involving Tin is detected, and *then* Tin reads a data set affected by
the writes of Tpivot.  Since a snapshot is only acquired when a statement
is run which requires a snapshot, that means that a query run in an
implicit transaction (i.e., there is no START or BEGIN statement to
explicitly start it; the SELECT or other data-accessing statement
automatically starts the transaction so it has a valid context in which
to run) that does not write data can never return bad results nor receive
a serialization failure.  Nor can those things happen on the *first* or
*only* non-writing statement in an explicit transaction.

> The thing that was puzzling me, after having slept on it, and before I
> read your mail, was how it could happen that the serialisation failure
> (of a transaction that did only reads) would only be detected at
> commit.  The point about attempts to suppress the serialisation
> failure is part of the answer to that.  Are there other reasons,
> besides previously suppressed serialisation failures, why commit of a
> transaction that did only reads[1] might fail ?

I'm pretty confident that if you're not using prepared transactions the
answer is "no".  Our initial implementation of serializable prepared
transactions was found to have a bug in how it dealt, after a crash, with
the persisted data found during recovery.  The safest way to fix that on
stable branches was to be *very* eager to throw serialization failures
for any new transaction which developed a rw-dependency with a prepared
transaction found during recovery, until that prepared transaction was
committed or rolled back.  This can be improved, but I fear that until
and unless that happens, some of the deductions above may not hold while
"pre-crash" prepared transactions are still open.
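To make the retry pattern above concrete, here is a minimal sketch in
Python with psycopg2 (not code from the patch under discussion; the
connection string and the "accounts" table are placeholders I made up).
It retries the whole read-only transaction whenever SQLSTATE 40001 is
raised, which, per the argument above, can only happen on the read
itself:

    import psycopg2

    def read_balance_total(dsn):
        # Run one SERIALIZABLE READ ONLY transaction, retrying on
        # serialization failure (SQLSTATE 40001).  The failure, if any,
        # is raised on the read, before contradictory data is returned.
        while True:
            conn = psycopg2.connect(dsn)
            try:
                conn.set_session(isolation_level="SERIALIZABLE",
                                 readonly=True)
                with conn.cursor() as cur:
                    cur.execute("SELECT sum(balance) FROM accounts")
                    total = cur.fetchone()[0]
                conn.commit()
                return total
            except psycopg2.Error as e:
                conn.rollback()
                if e.pgcode == "40001":
                    continue        # retry the whole transaction
                raise
            finally:
                conn.close()

    print(read_balance_total("dbname=test"))  # placeholder DSN

Declaring the transaction READ ONLY up front keeps the overhead down, as
discussed below; the retry loop itself is the same either way.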
If you don't use prepared transactions, or promptly clean up any that
were pending when a server crashed, that should not be a problem, but
it's probably worth mentioning.

One other situation about which I'm not entirely sure, and it would take
me some time reviewing code to be sure, is when
max_pred_locks_per_transaction is not set high enough to accommodate
tracking all serializable transactions in allocated RAM (recognizing
that they must often be tracked after commit, until overlapping
serializable transactions commit).  In that case we have a mechanism to
summarize some of the committed transactions and spill them to disk
(using an internal SLRU module).  The summarized data might not be able
to determine all of the above as precisely as the "normal" data tracked
in RAM.  To avoid this, be generous when setting
max_pred_locks_per_transaction; not only will that avoid the
summarization, but it will reduce how often multiple page locks in the
predicate locking system are combined into relation locks.  Coarser
locks increase the "false positive" rate of serialization failures,
reducing performance.

> [1] I mean to include transactions which don't update even if they're
> not explicitly declared `read only', so that the application retained
> (until it said to commit) the option to try to make changes.

There is an attempt to recognize, at commit time, *implicit* read-only
transactions -- those that, in spite of not being *declared* as READ
ONLY, never wrote any data.  Although these have higher overhead than
transactions explicitly declared READ ONLY up front, many of the
properties of the explicitly declared read-only transaction hold --
including, it seems to me, the property of never returning badly
serialized data and of never being chosen to receive the serialization
failure error unless both Tout and Tpivot have already committed (in
that order).

> Supposing I understand your `doomed' flag correctly, I think it is
> then probably possible to construct an argument that proves that
> allowing the application to trap and suppress serialisation failures
> does not make it harder to provide coherency guarantees.

That is the intention.

> Or to put it another way: does pgsql already detect serialisation
> problems (in transactions which only read) at the point where it would
> otherwise return data not consistent with any serialisation order ?
> (As it does in the `Rollover' example.)

Yes, as explained above.

> If so presumably it always throws a serialisation failure at that
> point.  I think that is then sufficient.  There is no need to tell the
> application programmer they have to commit even transactions which
> only read.

Well, if they don't explicitly start a transaction there is no need to
explicitly commit it, period.  An implicit transaction is created if a
statement needing execution context (such as a SELECT) is run outside of
any explicit transaction, but such a transaction is always implicitly
committed or rolled back upon completion of the statement.  There is
always a transaction, but there is not always a need to explicitly
manage it.

> If my supposition is right then I will try to develop this argument
> more formally.  I think that would be worthwhile because the converse
> property is very surprising to non-database programmers, and would
> require very explicit documentation by pgsql, and careful attention by
> application programmers.  It would be nice to be able to document a
> stronger promise.

If you can put together a patch to improve the documentation, that is
always welcome!
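As a hedged illustration of that last point, here is what the
implicit-transaction case looks like from Python with psycopg2 (again,
the connection string and the "flights" table are made-up placeholders).
With autocommit enabled the driver sends no BEGIN, so each statement
runs in its own implicit transaction on the server:

    import psycopg2

    conn = psycopg2.connect("dbname=test")  # placeholder connection string
    conn.autocommit = True   # no BEGIN is sent; the server starts and
                             # ends a transaction around each statement

    with conn.cursor() as cur:
        # Make implicit transactions run at the serializable level.
        cur.execute("SET default_transaction_isolation TO 'serializable'")

        # A single non-writing statement acquires its snapshot when it
        # starts and is implicitly committed when it completes; per the
        # argument above it can neither return badly serialized data nor
        # be chosen for a serialization failure, so no retry is needed.
        cur.execute("SELECT count(*) FROM flights")
        print(cur.fetchone()[0])

    conn.close()

The same holds for the first or only non-writing statement of an
explicit transaction; anything beyond that needs the retry handling
shown earlier.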
In case you're not already aware of how we do things, patches are
developed against the master branch, and then there is a determination
of how far back to back-patch them in the stable branches.  While the
rule is that stable branches are only modified to correct serious bugs
or security vulnerabilities, in order to make it as safe as possible for
people to apply minor releases without fear of breaking something that
works, I think we could consider an argument for back-patching a doc
change that clarifies things or fills omissions which make it difficult
to use a feature correctly.

--
Kevin Grittner
EDB:  http://www.enterprisedb.com
The Enterprise PostgreSQL Company