Re: Unnecessary delay in streaming replication due to replay lag

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "lchch1990(at)sina(dot)cn" <lchch1990(at)sina(dot)cn>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "Hao Wu (Pivotal)" <hawu(at)pivotal(dot)io>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Ashwin Agrawal <ashwinstar(at)gmail(dot)com>
Subject: Re: Unnecessary delay in streaming replication due to replay lag
Date: 2021-11-28 02:35:52
Message-ID: CALj2ACXr0J-3Nqc5Y_SK8SDC8jLgUtwqODUwHnhAdWrCBL-sKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 23, 2021 at 1:39 AM Soumyadeep Chakraborty
<soumyadeep2007(at)gmail(dot)com> wrote:
>
> Hi Bharath,
>
> Yes, that thread has been discussed here. Asim had x-posted the patch to [1]. This thread
> was more recent when Ashwin and I picked up the patch in Aug 2021, so we continued here.
> The patch has been significantly updated by us, addressing Michael's long outstanding feedback.

Thanks for the patch. I reviewed it a bit, here are some comments:

1) A memory leak: add FreeDir(dir); before returning.
+ ereport(LOG,
+ (errmsg("Could not start streaming WAL eagerly"),
+ errdetail("There are timeline changes in the locally available WAL files."),
+ errhint("WAL streaming will begin once all local WAL and archives
are exhausted.")));
+ return;
+ }

2) Is there a guarantee that while we traverse the pg_wal directory to
find startsegno and endsegno, the new wal files arrive from the
primary or archive location or old wal files get removed/recycled by
the standby? Especially when wal_receiver_start_condition=consistency?
+ startsegno = (startsegno == -1) ? logSegNo : Min(startsegno, logSegNo);
+ endsegno = (endsegno == -1) ? logSegNo : Max(endsegno, logSegNo);
+ }

3) I think the errmsg text format isn't correct. Note that the errmsg
text starts with lowercase and doesn't end with "." whereas errdetail
or errhint starts with uppercase and ends with ".". Please check other
messages for reference.
The following should be changed.
+ errmsg("Requesting stream from beginning of: %s",
+ errmsg("Invalid WAL segment found while calculating stream start:
%s. Skipping.",
+ (errmsg("Could not start streaming WAL eagerly"),

4) I think you also need to have wal files names in double quotes,
something like below:
errmsg("could not close file \"%s\": %m", xlogfname)));

5) It is ".....stream start: \"%s\", skipping..",
+ errmsg("Invalid WAL segment found while calculating stream start:
%s. Skipping.",

4) I think the patch can make the startup process significantly slow,
especially when there are lots of wal files that exist in the standby
pg_wal directory. This is because of the overhead
StartWALReceiverEagerlyIfPossible adds i.e. doing two while loops to
figure out the start position of the
streaming in advance. This might end up the startup process doing the
loop over in the directory rather than the important thing of doing
crash recovery or standby recovery.

5) What happens if this new GUC is enabled in case of a synchronous standby?
What happens if this new GUC is enabled in case of a crash recovery?
What happens if this new GUC is enabled in case a restore command is
set i.e. standby performing archive recovery?

6) How about bgwriter/checkpointer which gets started even before the
startup process (or a new bg worker? of course it's going to be an
overkill) finding out the new start pos for the startup process and
then we could get rid of <literal>startup</literal> behaviour of the
patch? This avoids an extra burden on the startup process. Many times,
users will be complaining about why recovery is taking more time now,
after the GUC wal_receiver_start_condition=startup.

7) I think we can just have 'consistency' and 'exhaust' behaviours and
let the bgwrite or checkpointer find out the start position for the
startup process, so the startup process whenever reaches a consistent
point, it sees if the other process has calculated
start pos for it or not, if yes it starts wal receiver other wise it
goes with its usual recovery. I'm not sure if this will be a good
idea.

8) Can we have a better GUC name than wal_receiver_start_condition?
Something like wal_receiver_start_at or wal_receiver_start or some
other?

Regards,
Bharath Rupireddy.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-11-28 04:19:55 Re: pg_waldump stucks with options --follow or -f and --stats or -z
Previous Message Bharath Rupireddy 2021-11-28 02:23:13 Re: Deduplicate code updating ControleFile's DBState.