Re: [HACKERS] logical decoding of two-phase transactions

From: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2018-04-05 06:50:42
Message-ID: CAMGcDxfAnJtJ17hejKFDa3xXS2OWpys+p-PvGS+MZj4h73QqPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tomas,

> 1) There's a race condition in LogicalLockTransaction. The code does
> roughly this:
>
> if (!BecomeDecodeGroupMember(...))
> ... bail out ...
>
> Assert(MyProc->decodeGroupLeader);
> lwlock = LockHashPartitionLockByProc(MyProc->decodeGroupLeader);
> ...
>
> but AFAICS there is no guarantee that the transaction does not commit
> (or even abort) right after the become decode group member. In which
> case LogicalDecodeRemoveTransaction might have already reset our pointer
> to a leader to NULL. In which case the Assert() and lock will fail.
>
> I've initially thought this can be fixed by setting decodeLocked=true in
> BecomeDecodeGroupMember, but that's not really true - that would fix the
> race for aborts, but not commits. LogicalDecodeRemoveTransaction skips
> the wait for commits entirely, and just resets the flags anyway.
>
> So this needs a different fix, I think. BecomeDecodeGroupMember also
> needs the leader PGPROC pointer, but it does not have the issue because
> it gets it as a parameter. I think the same thing would work for here
> too - that is, use the AssignDecodeGroupLeader() result instead.
>

That's a good catch. One of the earlier patches had a check for this
(it also had an ill-placed assert above though) which we removed as
part of the ongoing review.

Instead of doing the above, we can just re-check if the
decodeGroupLeader pointer has become NULL and if so, re-assert that
the leader has indeed gone away before returning false. I propose a
diff like below.

/*

* If we were able to add ourself, then Abort processing will

- * interlock with us.

+ * interlock with us. If the leader was done in the meanwhile

+ * it could have removed us and gone away as well.

*/

- Assert(MyProc->decodeGroupLeader);

+ if (MyProc->decodeGroupLeader == NULL)

+ {

+ Assert(BackendXidGetProc(txn->xid) == NULL);

+ return false

+ }

>
> 2) BecomeDecodeGroupMember sets the decodeGroupLeader=NULL when the
> leader does not match the parameters, despite enforcing it by Assert()
> at the beginning. Let's remove that assignment.
>

Ok, done.

>
> 3) I don't quite understand why BecomeDecodeGroupMember does the
> cross-check using PID. In which case would it help?
>

When I wrote this support, I had written it with the intention of
supporting both 2PC (in which case pid is 0) and in-progress regular
transactions. That's why the presence of PID in these functions. The
current use case is just for 2PC, so we could remove it.

>
> 4) AssignDecodeGroupLeader still sets pid, which is never read. Remove.
>

Ok, will do.

>
> 5) ReorderBufferCommitInternal does elog(LOG) about interrupting the
> decoding of aborted transaction only in one place. There are about three
> other places where we check LogicalLockTransaction. Seems inconsistent.
>

Note that I have added it for the OPTIONAL test_decoding test cases
(which AFAIK we don't plan to commit in that state) which demonstrate
concurrent rollback interlocking with the lock/unlock APIs. The first
ELOG was enough to catch the interaction. If we think these elogs
should be present in the code, then yes, I can add it elsewhere as
well as part of an earlier patch.

>
> 6) The comment before LogicalLockTransaction is somewhat inaccurate,
> because it talks about adding/removing the backend to the group, but
> that's not what's happening. We join the group on the first call and
> then we only tweak the decodeLocked flag.
>

True.

>
> 7) I propose minor changes to a couple of comments.
>

Ok, I am looking at your provided patch and incorporating relevant
changes from it. WIll submit a patch set soon.

Regards,
Nikhils

--
Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavan Deolasee 2018-04-05 06:51:26 Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
Previous Message Masahiko Sawada 2018-04-05 06:43:55 Re: [HACKERS] GUC for cleanup indexes threshold.