From: | Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Kevin Grittner <kgrittn(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql(at)j-davis(dot)com, joe(dot)conway(at)crunchydata(dot)com |
Subject: | In-place updates and serializable transactions |
Date: | 2018-11-14 04:45:43 |
Message-ID: | CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello hackers,
Currently, we're working on the serializable implementations for
zheap. As mentioned in README-SSI documentation[1], there is one
difference in SSI implementation of PostgreSQL that can differentiate
the conflict detection behaviour with other storage engines that
supports updates in-place.
The heap storage in PostgreSQL does not use "update in place" with a
rollback log for its MVCC implementation. Where possible it uses
"HOT" updates on the same page (if there is room and no indexed value
is changed). For non-HOT updates the old tuple is expired in place and
a new tuple is inserted at a new location. Because of this
difference, a tuple lock in PostgreSQL doesn't automatically lock any
other versions of a row. We can take the following example from the
doc to understand the situation in more detail:
T1 ---rw---> T2 ---ww--->T3
If transaction T1 reads a row version (thus acquiring a predicate lock
on it) and a second transaction T2 updates that row version (thus
creating a rw-conflict graph edge from T1 to T2), must a third
transaction T3 which re-updates the new version of the row also have a
rw-conflict in from T1 to prevent anomalies? In other words, does it
matter whether we recognize the edge T1 --rw--> T3? The document also
includes a nice proof for why we don't try to copy or expand a tuple
lock to any other versions of the row or why we don't have to
explicitly recognize the edge T1 --rw--> T3.
In PostgreSQL, the predicate locking is implemented using the tuple
id. In zheap, since we perform updates in-place, we don't change the
tuple id. So, in the above example, we easily recognize the edge
T1--rw--> T3. This may increase the number of false positives for
certain cases. In the above example, if we introduce another
transaction T4 such that T3 --rw--> T4 and T4 gets committed first,
for zheap, T3 will be rolled back because of the dangerous structure
T1 --rw--> T3 --rw--> T4. But, for heap, T3 can be committed(isolation
test case [2]). IMHO, this seems to be an acceptable behavior.
In brief, due to in-place updates, in some cases, the false positives
may increase for serializable transactions. Any thoughts?
[1] src/backend/storage/lmgr/README-SSI
[2] src/test/isolation/specs/multiple-row-versions.spec
--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2018-11-14 04:50:26 | Re: DSM segment handle generation in background workers |
Previous Message | Amit Langote | 2018-11-14 04:30:27 | Re: Speeding up INSERTs and UPDATEs to partitioned tables |