Re: [HACKERS] Issues with logical replication

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: [HACKERS] Issues with logical replication
Date: 2017-11-16 15:36:40
Message-ID: CA+TgmoanD-sMvKCBi_8tqptfUnLn-4O61XQredLwFRCWGZCadg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 15, 2017 at 8:20 PM, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru> wrote:
> I did a sketch of the first approach just to confirm that it solves the
> problem. But there I hold ProcArrayLock during the update of the flag. Since
> the only reader is GetRunningTransactionData, it is possible to have a custom
> lock there. In this case GetRunningTransactionData will hold three locks
> simultaneously, since it already holds ProcArrayLock and XidGenLock =)

To me, it seems like SnapBuildWaitSnapshot() is fundamentally
misdesigned, and ideally Petr (who wrote the patch) or Andres (who
committed it) ought to get involved here and help fix this problem.
My own first inclination would be to rewrite this as a loop: if the
transaction ID precedes the oldest running XID, then continue; else if
TransactionIdDidCommit() || TransactionIdDidAbort() then conclude that
we don't need to wait; else XactLockTableWait() then loop. That way,
if you hit the race condition, you'll just busy-wait instead of doing
the wrong thing. Maybe insert a sleep(1) if we retry more than once.
That sucks, of course, but it seems like a better idea than trying to
redesign XactLockTableWait() or the procarray, which could affect an
awful lot of other things.
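
A minimal sketch of the loop described above, for illustration only. It is
not a patch against snapbuild.c: every mock_* function and the Xid type are
hypothetical stand-ins for TransactionIdPrecedes(), TransactionIdDidCommit(),
TransactionIdDidAbort(), XactLockTableWait() and the sleep, and the
comparison ignores XID wraparound. The mock lock wait can return "too early"
a configurable number of times to imitate the race condition.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;           /* hypothetical stand-in for TransactionId */

typedef enum { XS_IN_PROGRESS, XS_COMMITTED, XS_ABORTED } XidState;

#define MAX_XID 16
static XidState mock_state[MAX_XID];        /* status of each mock xact */
static int      mock_early_returns[MAX_XID];/* times the lock wait returns early */
static int      mock_sleep_calls;           /* counts back-off sleeps */

/* Stand-in for TransactionIdPrecedes(); ignores wraparound. */
static bool mock_xid_precedes(Xid a, Xid b) { return a < b; }

/* Stand-ins for TransactionIdDidCommit() / TransactionIdDidAbort(). */
static bool mock_did_commit(Xid x) { return mock_state[x] == XS_COMMITTED; }
static bool mock_did_abort(Xid x)  { return mock_state[x] == XS_ABORTED; }

/*
 * Stand-in for XactLockTableWait(). While early returns remain, it comes
 * back with the xact still in progress (the race); otherwise it pretends
 * the xact committed while we were waiting.
 */
static void mock_xact_lock_wait(Xid x)
{
    if (mock_early_returns[x] > 0)
    {
        mock_early_returns[x]--;
        return;
    }
    mock_state[x] = XS_COMMITTED;
}

/* Stand-in for sleep(1); only counts calls so the sketch runs instantly. */
static void mock_sleep(void) { mock_sleep_calls++; }

/*
 * Wait for every listed xid to finish. Returns the number of lock waits
 * performed, so the retry behaviour can be observed.
 */
static int wait_for_xids(const Xid *xids, int nxids, Xid oldest_running)
{
    int total_waits = 0;

    for (int i = 0; i < nxids; i++)
    {
        Xid xid = xids[i];

        /* Older than the oldest running xid: already finished, skip it. */
        if (mock_xid_precedes(xid, oldest_running))
            continue;

        for (int retries = 0;; retries++)
        {
            /* Known to be finished: no need to wait (any longer). */
            if (mock_did_commit(xid) || mock_did_abort(xid))
                break;

            /* Retried more than once already: back off before waiting. */
            if (retries > 1)
                mock_sleep();

            mock_xact_lock_wait(xid);
            total_waits++;
        }
    }
    return total_waits;
}
```

Because the status check is repeated after every wait, a wait that returns
before the transaction has really finished only costs another iteration; the
loop never concludes that the transaction is done without seeing a commit or
abort.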

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
