Re: [HACKERS] Replication status in logical replication

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Replication status in logical replication
Date: 2018-01-09 04:36:11
Message-ID: CAD21AoBgZikbVB0mUzL+PsJkSbnbWw+XYU1aQD7yV9nwdWY10w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jan 7, 2018 at 7:50 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 26 December 2017 at 00:26, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> On Tue, Dec 26, 2017 at 1:10 AM, Petr Jelinek
>> <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>>> On 21/11/17 22:06, Masahiko Sawada wrote:
>>>>
>>>> After investigation, I found out that my previous patch was wrong
>>>> direction. I should have changed XLogSendLogical() so that we can
>>>> check the read LSN and set WalSndCaughtUp = true even after read a
>>>> record without wait. Attached updated patch passed 'make check-world'.
>>>> Please review it.
>>>>
>>>
>>> Hi,
>>>
>>> This version looks good to me and seems to be in line with what we do in
>>> physical replication.
>>>
>>> Marking as ready for committer.
>>>
>>
>> Thank you for reviewing this patch!
>
> The patch calls this AFTER processing the record
> if (sentPtr >= GetFlushRecPtr())
> but it seems better to call GetFlushRecPtr() before we process the
> record, otherwise the flush pointer might have moved forwards while we
> process the record and it wouldn't catch up. (Physical replication
> works like that).

Agreed.

> New patch version attached for discussion before commit. (v4)

v4 patch looks good to me.

>
> I'd rather not call it at all at that point though, so if we made
> RecentFlushPtr static at the module level rather than within
> WalSndWaitForWal we could use it here also. That's a bit more invasive
> for backpatching, so not implemented that here.
>
> Overall, I find setting WalSndCaughtUp = false at the top of
> XLogSendLogical() to be incredibly ugly and I would like to remove it.
> It can't be correct to have a static status variable that oscillates
> between false and true with every record. This seems to be done
> because of the lack of a logical initialization call. Petr? Peter?
> Version with this removed (v4alt2)
>
> I've removed the edit that fusses over English grammar: both ways are correct.
>
>> I think this patch can be
>> back-patched to 9.4 as Simon mentioned.
>
> This patch appears to cause this DEBUG1 message
>
> "standby \"%s\" has now caught up with primary"
>
> which probably isn't the right message, but might be OK to backpatch.
>
> Thoughts on better wording?
>

I think that this DEBUG1 message appears when sent any WAL after
caught up even without this patch. This patch makes this message
appear at a properly timing. Or am I missing something?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-01-09 04:47:53 Re: refactor subscription tests to use PostgresNode's wait_for_catchup
Previous Message David Rowley 2018-01-09 04:36:10 Re: [HACKERS] Proposal: Local indexes for partitioned table