Re: Replication & recovery_min_apply_delay

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication & recovery_min_apply_delay
Date: 2019-01-30 14:32:28
Message-ID: 201901301432.p3utg64hum27@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 2019-Jan-30, Konstantin Knizhnik wrote:

> One of our customers was faced with the following problem: he has
> setup  physical primary-slave replication but for some reasons
> specified very large (~12 hours) recovery_min_apply_delay.

We also came across this exact same problem some time ago. It's pretty
nasty. I wrote a quick TAP reproducer, attached (needed a quick patch
for PostgresNode itself too.)

I tried several failed strategies:
1. setting lastSourceFailed just before sleeping for apply delay, with
the idea that for the next fetch we would try stream. But this
doesn't work because WaitForWalToBecomeAvailable is not executed.

2. split WaitForWalToBecomeAvailable in two pieces, so that we can call
the first half in the restore loop. But this causes 1s of wait
between segments (error recovery) and we never actually catch up.

What back then I thought was the *real* solution but I didn't get around
to implementing is the idea you describe to start a walreceiver at an
earlier point.

> I wonder if it can be considered as acceptable solution of the problem or
> there can be some better approach?

I didn't find one.

Álvaro Herrera
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
0001-Support-pg_basebackup-S-in-PostgresNode-backup.patch text/x-diff 1.6 KB text/x-perl 1.8 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-01-30 14:41:04 Re: Unused parameters & co in code
Previous Message Michail Nikolaev 2019-01-30 14:09:48 Re: Synchronous replay take III