Re: [HACKERS] Issues with logical replication

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: [HACKERS] Issues with logical replication
Date: 2017-12-01 17:31:23
Message-ID: 87261b51-0fb8-17c1-972d-863836ff009d@2ndquadrant.com
Lists: pgsql-hackers

On 30/11/17 11:48, Simon Riggs wrote:
> On 30 November 2017 at 11:30, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>> On 30/11/17 00:47, Andres Freund wrote:
>>> On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote:
>>>> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live
>>>> with it, but the problematic logic happens inside the
>>>> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of
>>>> detecting the situation short of reimplementing the
>>>> XactLockTableInsert() instead of calling it.
>>>
>>> Right. But we fairly trivially can change that. I'm remarking on it
>>> because other people's, not yours, suggestions aimed at making this
>>> bulletproof. I just wanted to make clear that I don't think that's
>>> necessary at all.
>>>
>>
>> Okay, then I guess we are in agreement. I can confirm that the attached
>> fixes the issue in my tests. Using SubTransGetTopmostTransaction()
>> instead of SubTransGetParent() means 3 more ifs in terms of extra CPU
>> cost for other callers. I don't think it's worth worrying about given we
>> are waiting for heavyweight lock, but if we did we can just inline the
>> code directly into SnapBuildWaitSnapshot().
>
> This will still fail an Assert in TransactionIdIsInProgress() when
> snapshots are overflowed.
>

Hmm, which one, why?

I see 2 Asserts there, one is:
> Assert(nxids == 0);
Which is inside the RecoveryInProgress() branch. Surely on standbys there will
still be no PGXACTs with assigned xids, so that should be fine.

The other one is:
> Assert(TransactionIdIsValid(topxid));
Which should again be fine: the toplevel xid of a toplevel xid is the same
xid, which is a valid one.

So I think we should be fine there.
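
To illustrate the second point, here is a toy model of the parent lookup. It
is only a sketch and not the pg_subtrans code: the toy_* names, the fixed-size
array and TOY_MAX_XID are made up for the example, and it assumes (as I
understand the real functions to behave) that the parent lookup returns
InvalidTransactionId for a toplevel xid and that the topmost lookup follows
parents until it hits that.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; not the PostgreSQL definitions. */
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define TOY_MAX_XID 16

/*
 * toy_parent[xid] holds the direct parent of xid, or InvalidTransactionId
 * when xid is a toplevel transaction (hypothetical in-memory map).
 */
static TransactionId toy_parent[TOY_MAX_XID];

/* Record that xid is a subtransaction of parent. */
static void
toy_set_parent(TransactionId xid, TransactionId parent)
{
    if (xid < TOY_MAX_XID)
        toy_parent[xid] = parent;
}

/* Direct parent only; InvalidTransactionId for a toplevel xid. */
static TransactionId
toy_get_parent(TransactionId xid)
{
    if (xid >= TOY_MAX_XID)
        return InvalidTransactionId;
    return toy_parent[xid];
}

/*
 * Follow parent links until there is no parent left.  For a toplevel xid
 * the loop body never runs, so the input xid is returned unchanged.
 */
static TransactionId
toy_get_topmost(TransactionId xid)
{
    TransactionId cur = xid;
    TransactionId parent;

    while ((parent = toy_get_parent(cur)) != InvalidTransactionId)
        cur = parent;

    return cur;
}
```

With 12 registered as a child of 11 and 11 as a child of 10, toy_get_topmost()
gives 10 for 12, 11 and 10 alike, while toy_get_parent(10) gives the invalid
xid. That difference is why the topmost variant hands back a valid xid even
when it is called on a toplevel one.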

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
