Re: Unnecessary delay in streaming replication due to replay lag

From: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "lchch1990(at)sina(dot)cn" <lchch1990(at)sina(dot)cn>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "Hao Wu (Pivotal)" <hawu(at)pivotal(dot)io>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Ashwin Agrawal <ashwinstar(at)gmail(dot)com>
Subject: Re: Unnecessary delay in streaming replication due to replay lag
Date: 2021-12-16 01:01:24
Message-ID: CAE-ML+_aH-pU9ZE2Cogc=OcO8cQx6cFgb2PF1xm6wyN3fPSoCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Bharath,

Thanks for the review!

On Sat, Nov 27, 2021 at 6:36 PM Bharath Rupireddy <
bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:

> 1) A memory leak: add FreeDir(dir); before returning.
> + ereport(LOG,
> + (errmsg("Could not start streaming WAL eagerly"),
> + errdetail("There are timeline changes in the locally available WAL
files."),
> + errhint("WAL streaming will begin once all local WAL and archives
> are exhausted.")));
> + return;
> + }
>

Thanks for catching that. Fixed.

>
>
> 2) Is there a guarantee that while we traverse the pg_wal directory to
> find startsegno and endsegno, the new wal files arrive from the
> primary or archive location or old wal files get removed/recycled by
> the standby? Especially when wal_receiver_start_condition=consistency?
> + startsegno = (startsegno == -1) ? logSegNo : Min(startsegno, logSegNo);
> + endsegno = (endsegno == -1) ? logSegNo : Max(endsegno, logSegNo);
> + }
>

Even if newer wal files arrive after the snapshot of the dir listing
taken by AllocateDir()/ReadDir(), we will in effect start from a
slightly older location, which should be fine. It shouldn't matter if
an older file is recycled. If the last valid WAL segment is recycled,
we will ERROR out in StartWALReceiverEagerlyIfPossible() and the eager
start can be retried by the startup process when
CheckRecoveryConsistency() is called again.

>
>
> 3) I think the errmsg text format isn't correct. Note that the errmsg
> text starts with lowercase and doesn't end with "." whereas errdetail
> or errhint starts with uppercase and ends with ".". Please check other
> messages for reference.
> The following should be changed.
> + errmsg("Requesting stream from beginning of: %s",
> + errmsg("Invalid WAL segment found while calculating stream start:
> %s. Skipping.",
> + (errmsg("Could not start streaming WAL eagerly"),

Fixed.

> 4) I think you also need to have wal files names in double quotes,
> something like below:
> errmsg("could not close file \"%s\": %m", xlogfname)));

Fixed.

>
> 5) It is ".....stream start: \"%s\", skipping..",
> + errmsg("Invalid WAL segment found while calculating stream start:
> %s. Skipping.",

Fixed.

> 4) I think the patch can make the startup process significantly slow,
> especially when there are lots of wal files that exist in the standby
> pg_wal directory. This is because of the overhead
> StartWALReceiverEagerlyIfPossible adds i.e. doing two while loops to
> figure out the start position of the
> streaming in advance. This might end up the startup process doing the
> loop over in the directory rather than the important thing of doing
> crash recovery or standby recovery.

Well, 99% of the time we can expect that the second loop finishes after
1 or 2 iterations, as the last valid WAL segment would most likely be
the highest numbered WAL file or thereabouts. I don't think that the
overhead will be significant as we are just looking up a directory
listing and not reading any files.

> 5) What happens if this new GUC is enabled in case of a synchronous
standby?
> What happens if this new GUC is enabled in case of a crash recovery?
> What happens if this new GUC is enabled in case a restore command is
> set i.e. standby performing archive recovery?

The GUC would behave the same way for all of these cases. If we have
chosen 'startup'/'consistency', we would be starting the WAL receiver
eagerly. There might be certain race conditions when one combines this
GUC with archive recovery, which was discussed upthread. [1]

> 6) How about bgwriter/checkpointer which gets started even before the
> startup process (or a new bg worker? of course it's going to be an
> overkill) finding out the new start pos for the startup process and
> then we could get rid of <literal>startup</literal> behaviour of the
> patch? This avoids an extra burden on the startup process. Many times,
> users will be complaining about why recovery is taking more time now,
> after the GUC wal_receiver_start_condition=startup.

Hmm, then we would be needing additional synchronization. There will
also be an added dependency on checkpoint_timeout. I don't think that
the performance hit is significant enough to warrant this change.

> 8) Can we have a better GUC name than wal_receiver_start_condition?
> Something like wal_receiver_start_at or wal_receiver_start or some
> other?

Sure, that makes more sense. Fixed.

Regards,
Soumyadeep (VMware)

[1]
https://www.postgresql.org/message-id/CAE-ML%2B-8KnuJqXKHz0mrC7-qFMQJ3ArDC78X3-AjGKos7Ceocw%40mail.gmail.com

Attachment Content-Type Size
v5-0001-Introduce-feature-to-start-WAL-receiver-eagerly.patch text/x-patch 17.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2021-12-16 01:13:57 [PATCH] Accept IP addresses in server certificate SANs
Previous Message Michael Paquier 2021-12-16 00:22:40 Re: A test for replay of regression tests