Re: Replication & recovery_min_apply_delay

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication & recovery_min_apply_delay
Date: 2019-09-09 21:46:49
Message-ID: CAPpHfdt9VS-Ftry9T388q1J00D4nzw88dtXq-FPFPm7VM_fi_Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 4, 2019 at 4:37 PM Konstantin Knizhnik
<k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
> receivedUpto is just a static variable in xlog.c, maintained by the WAL receiver.
> But as I mentioned above, the WAL receiver is not started at the moment when
> we need to know the LSN of the last record.
>
> Certainly it should be possible to somehow persist receivedUpto, so that we
> do not need to scan WAL to determine the last LSN at the next start.
> But persisting the last LSN introduces a lot of questions and problems.
> For example, when does it need to be flushed to disk? If that is done
> after each received transaction, performance can suffer significantly.
> If it is done more or less asynchronously, then there is a risk that we
> request streaming from the wrong position.
> In any case it will significantly complicate the patch and make it more
> susceptible to various errors.

I think we don't necessarily need the exact value of receivedUpto. It
could just be a place to start scanning WAL from. We currently call
UpdateControlFile() in a lot of places; in particular, we call it at each
checkpoint. Even if we started scanning WAL from a value of receivedUpto
that is one checkpoint old, we could still save a lot of resources.
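
To illustrate the idea, here is a toy sketch in C. It is not PostgreSQL
code: ToyLsn, toy_append() and toy_scan_last_lsn() are made-up names, and
the length-prefixed record layout is an assumption chosen for brevity, not
the real WAL format. It only shows that a scan started from a persisted
record boundary, such as a checkpoint position, reaches the same end of WAL
while visiting fewer records.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t ToyLsn;        /* hypothetical stand-in for XLogRecPtr */

#define TOY_HDR_SIZE sizeof(uint32_t)

/*
 * Walk length-prefixed records in buf, starting at start_lsn, which must be
 * a record boundary (for example the position of a checkpoint record).
 * Returns the LSN just past the last complete record and stores the number
 * of records visited in *nscanned.
 */
ToyLsn
toy_scan_last_lsn(const unsigned char *buf, size_t buf_len,
                  ToyLsn start_lsn, size_t *nscanned)
{
    ToyLsn      pos = start_lsn;

    assert(start_lsn <= buf_len);
    *nscanned = 0;

    while (buf_len - pos >= TOY_HDR_SIZE)
    {
        uint32_t    total_len;

        memcpy(&total_len, buf + pos, TOY_HDR_SIZE);

        /* A zero or too-short header, or a record overrunning the buffer,
         * marks the end of valid data in this toy format. */
        if (total_len < TOY_HDR_SIZE || total_len > buf_len - pos)
            break;

        pos += total_len;
        (*nscanned)++;
    }
    return pos;
}

/* Append one record with the given payload size; returns the new end LSN. */
ToyLsn
toy_append(unsigned char *buf, size_t buf_len, ToyLsn end_lsn,
           uint32_t payload)
{
    uint32_t    total_len = (uint32_t) TOY_HDR_SIZE + payload;

    assert(end_lsn + total_len <= buf_len);
    memcpy(buf + end_lsn, &total_len, TOY_HDR_SIZE);
    memset(buf + end_lsn + TOY_HDR_SIZE, 0xAB, payload);
    return end_lsn + total_len;
}
```

For example, in a zero-initialized buffer, after appending ten records with
8-byte payloads and remembering the position just before the eighth one, a
scan from position 0 visits ten records while a scan from the remembered
position visits three, and both return the same end LSN.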

> I wonder what is wrong with determining the LSN of the last record by just
> scanning WAL?
> Certainly it is not the most efficient way. But I do not expect that
> somebody will have hundreds or thousands of megabytes of WAL.
> Michael, do you see any other problems with the GetLastLSN() function
> besides its execution time?

As I understand it, this patch fixes a problem with a very large recovery
apply delay. In this case, the amount of accumulated WAL corresponding to
that delay could also be huge. Scanning all of that WAL could be
costly, and it would be nice to avoid it.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
