Re: Remove HeapTuple and Buffer dependency for predicate locking functions

From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: Remove HeapTuple and Buffer dependency for predicate locking functions
Date: 2019-08-06 08:21:10
Message-ID: CAGz5QCL7DrJph9YXezZO=vYB8Yk3Whj=yYQmq-kgX+G+rTRa4w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Thomas,

On Tue, Aug 6, 2019 at 9:50 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
> > > 1. Commit dafaa3efb75 "Implement genuine serializable isolation
> > > level." (2011) locked the root tuple, because it set t_self to *tid.
> > > Possibly due to confusion about the effect of the preceding line
> > > ItemPointerSetOffsetNumber(tid, offnum).
> > >
> > > 2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
> > > fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
> > > offnum).
> >
> > Hm. It's not at all sure that it's ok to report the non-root tuple tid
> > here. I.e. I'm fairly sure there was a reason to not just set it to the
> > actual tid. I think I might have written that up on the list at some
> > point. Let me dig in memory and list. Obviously possible that that was
> > also obsoleted by parallel changes.
>
> Adding Heikki and Kevin.
>
> I haven't found your earlier discussion about that yet, but would be
> keen to read it if you can find it. I wondered if your argument
> might have had something to do with heap pruning, but I can't find a
> problem there. It's not as though the TID of any visible tuples
> change, it's just that some dead stuff goes away and the root line
> pointer is changed to LP_REDIRECT if it wasn't already.
>
> As for the argument for locking the tuple we emit rather than the HOT
> root, I think the questions are: (1) How exactly do we get away with
> locking only one version in a regular non-HOT update chain? (2) Is it
> OK to do that with a HOT root?
>
> The answer to the first question is given in README-SSI[1].
> Unfortunately it doesn't discuss HOT directly, but I suspect the
> answer is no, HOT is not special here. By my reading, it relies on
> the version you lock being the version visible to your snapshot, which
> is important because later updates have to touch that tuple to write
> the next version. That doesn't apply to some arbitrarily older tuple
> that happens to be a HOT root. Concretely, heap_update() does
> CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
> only going to produce a rw conflict if T1 took an SIREAD on precisely
> the version T2 locks in that path, not some arbitrarily older version
> that happens to be a HOT root. A HOT root might never be considered
> again by concurrent writers, no?
>
If I understand the problem, this is the same serialization issue as
with in-place updates for zheap. I had a discussion with Kevin
regarding the same in this thread [1]. It seems if we're locking the
hot root id, we may report some false positive serializable errors.

> > > This must be in want of an isolation test, but I haven't yet tried to
> > > get my head around how to write a test that would show the difference.
> >
> > Indeed.
>
> One practical problem is that the only way to reach
> PredicateLockTuple() is from an index scan, and the index scan locks
> the index page (or the whole index, depending on
> rd_indam->ampredlocks). So I think if you want to see a serialization
> anomaly you'll need multiple indexes (so that index page locks don't
> hide the problem), a long enough HOT chain and then probably several
> transactions to be able to miss a cycle that should be picked up by
> the logic in [1]. I'm out of steam for this problem today though.
>
> The simple test from the report[3] that resulted in commit 81fbbfe3352
> doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2"
> twice in a row). The best I've come up with so far is an assertion
> that we predicate-lock the same row version that we emitted to the
> user, when reached via an index lookup that visits a HOT row. The
> test outputs 'f' for master, but 't' with the change to heapam.c.
>
Here is an example from the multiple-row-versions isolation test which
fails if we perform in-place updates for zheap. I think the same will
be relevant if we lock root tuple id instead of the tuple itself.
Step 1: T1-> BEGIN; Read FROM t where id=1000000;
Step 2: T2-> BEGIN; UPDATE t where id=1000000; COMMIT; (creates T1->T2)
Step 3: T3-> BEGIN; UPDATE t where id=1000000; Read FROM t where id=500000;
Step 4: T4-> BEGIN; UPDATE t where id= 500000; Read FROM t where id=1;
COMMIT; (creates T3->T4)
Step 5: T3-> COMMIT;
Step 6: T1-> UPDATE t where id=1; COMMIT; (creates T4->T1,)

At step 6, when the update statement is executed, T1 is rolled back
because of T3->T4->T1.

But for zheap, step 3 also creates a dependency T1->T3 because of
in-place update. When T4 commits in step 4, it marks T3 as doomed
because of T1 --> T3 --> T4. Hence, in step 5, T3 is rolled back.

[1] Re: In-place updates and serializable transactions:
https://www.postgresql.org/message-id/CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg%40mail.gmail.com

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2019-08-06 08:31:58 Re: Global temporary tables
Previous Message Kyotaro Horiguchi 2019-08-06 08:11:40 Re: BUG #15938: Corrupted WAL segment after crash recovery