Re: Replication & recovery_min_apply_delay

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication & recovery_min_apply_delay
Date: 2019-09-10 06:23:25
Message-ID: 20190910062325.GD11737@paquier.xyz
Lists: pgsql-hackers

On Tue, Sep 10, 2019 at 12:46:49AM +0300, Alexander Korotkov wrote:
> On Wed, Sep 4, 2019 at 4:37 PM Konstantin Knizhnik
> <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>> receivedUpto is just a static variable in xlog.c, maintained by the WAL
>> receiver.  But as I mentioned above, the WAL receiver is not started at
>> the moment when we need to know the LSN of the last record.
>>
>> Certainly it should be possible to somehow persist receivedUpto, so
>> that we do not need to scan WAL to determine the last LSN at the next
>> start.  But persisting the last LSN introduces a lot of questions and
>> problems.  For example, when does it need to be flushed to disk?  If
>> that is done after each received transaction, performance can suffer
>> significantly.
>> If it is done more or less asynchronously, there is a risk that we
>> request streaming from the wrong position.
>> In any case it would significantly complicate the patch and make it
>> more sensitive to various errors.
>
> I don't think we necessarily need the exact value of receivedUpto, but
> it could be a place to start scanning WAL from.  We currently call
> UpdateControlFile() in a lot of places; in particular we call it at
> each checkpoint.  Even if we started scanning WAL from the value
> receivedUpto had one checkpoint back, we could still save a lot of
> resources.

A minimum to set would be the minimum consistency LSN, but there are a
lot of gotchas to take into account when it comes to crash recovery.

> As I understand it, this patch fixes a problem with a very large
> recovery apply delay.  In this case, the amount of accumulated WAL
> corresponding to that delay could also be huge.  Scanning all of this
> WAL could be costly, and it would be nice to avoid that.

Yes, I suspect that it could be very costly in some configurations if
there is a large gap between the last replayed LSN and the last LSN
the WAL receiver has flushed.

There is an extra thing, which has not been mentioned yet on this
thread, that we need to be very careful about:
<para>
When the standby is started and <varname>primary_conninfo</varname> is set
correctly, the standby will connect to the primary after replaying all
WAL files available in the archive. If the connection is established
successfully, you will see a walreceiver process in the standby, and
a corresponding walsender process in the primary.
</para>
This is a long-standing behavior, and based on the first patch
proposed we would start a WAL receiver once consistency has been
reached if any delay is defined, even if restore_command is
enabled.  We also cannot assume that everybody will want to start a
WAL receiver in this configuration if there is archiving behind with a
lot of segments, which allows for a larger catchup window.
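For reference, the kind of standby setup being discussed combines all
three of these settings (values below are purely illustrative):

```
# postgresql.conf on the standby (illustrative values)
primary_conninfo = 'host=primary port=5432 user=replicator'
restore_command = 'cp /mnt/archive/%f %p'
recovery_min_apply_delay = '4h'
```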
--
Michael
