Re: [HACKERS] Issues with logical replication

From: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: [HACKERS] Issues with logical replication
Date: 2017-11-29 19:11:25
Message-ID: DAA36A22-324E-46BA-9C39-2F135A08A956@postgrespro.ru
Lists: pgsql-hackers


> On 29 Nov 2017, at 18:46, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>
> What I don't understand is how it leads to crash (and I could not
> reproduce it using the pgbench file attached in this thread either) and
> moreover how it leads to 0 xid being logged. The only explanation I can
> come up is that some kind of similar race has to be in
> LogStandbySnapshot() but we explicitly check for 0 xid value there.
>

A zero xid isn’t logged. The loop in XactLockTableWait() does the following:

    for (;;)
    {
        Assert(TransactionIdIsValid(xid));
        Assert(!TransactionIdEquals(xid, GetTopTransactionIdIfAny()));

        <...>

        xid = SubTransGetParent(xid);
    }

So if the last statement is reached for a top-level transaction,
SubTransGetParent() returns an invalid xid and the next iteration fails
the first assert. That statement is reached if this whole loop runs
before the transaction has acquired its heavyweight lock.

The probability of that crash can be increased significantly by adding a
sleep between xid generation and lock insertion in AssignTransactionId().
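
To make the sequence concrete, here is a minimal standalone toy model of
the race. It is not PostgreSQL code: the toy_ names, the arrays standing
in for pg_subtrans, the lock table and the proc array, and the way the
elided part of the loop is modelled (return if the lock is held, stop if
the xid is no longer in progress) are all assumptions made for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId      ((TransactionId) 0)
#define TransactionIdIsValid(xid) ((xid) != InvalidTransactionId)

#define TOY_MAX_XID 16

/* Hypothetical stand-ins for shared state, indexed by xid. */
static TransactionId toy_parent[TOY_MAX_XID];      /* 0 = no parent (top-level) */
static bool          toy_lock_held[TOY_MAX_XID];   /* owner has taken its xid lock */
static bool          toy_in_progress[TOY_MAX_XID]; /* xid is visible as running */

typedef enum
{
    WAIT_WOULD_BLOCK,       /* waiter sleeps on the owner's lock: normal case */
    WAIT_XACT_FINISHED,     /* xid no longer running: loop exits normally */
    WAIT_HITS_INVALID_XID   /* first Assert in the real loop would fail here */
} ToyWaitResult;

/* Toy version of the parent lookup: a top-level xid has no parent. */
static TransactionId
toy_SubTransGetParent(TransactionId xid)
{
    return toy_parent[xid];
}

/* Toy version of the wait loop; reports what would happen, never blocks. */
static ToyWaitResult
toy_XactLockTableWait(TransactionId xid)
{
    for (;;)
    {
        if (!TransactionIdIsValid(xid))
            return WAIT_HITS_INVALID_XID;
        assert(xid < TOY_MAX_XID);

        if (toy_lock_held[xid])
            return WAIT_WOULD_BLOCK;

        if (!toy_in_progress[xid])
            return WAIT_XACT_FINISHED;

        /*
         * The xid is running but its lock is not in the lock table. For a
         * subtransaction this is where the parent is tried next; for a
         * top-level xid caught between xid assignment and lock insertion,
         * the lookup yields InvalidTransactionId.
         */
        xid = toy_SubTransGetParent(xid);
    }
}
```

With a top-level xid marked in progress but its lock not yet held,
toy_XactLockTableWait() reports WAIT_HITS_INVALID_XID; once the lock is
marked as held, the same call reports WAIT_WOULD_BLOCK. The wider the
window between those two states, the more likely a waiter lands in it.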

Attachment Content-Type Size
AssignTransactionId.patch application/octet-stream 605 bytes
unknown_filename text/plain 97 bytes
