Re: Switching XLog source from archive to streaming when primary available

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, cary(dot)huang(at)highgo(dot)ca, pgsql-hackers(at)lists(dot)postgresql(dot)org, satyanarlapuram(at)gmail(dot)com
Subject: Re: Switching XLog source from archive to streaming when primary available
Date: 2022-09-09 22:05:23
Message-ID: 20220909220523.GA2258997@nathanxps13
Lists: pgsql-hackers

On Fri, Sep 09, 2022 at 11:07:00PM +0530, Bharath Rupireddy wrote:
> On Fri, Sep 9, 2022 at 10:29 PM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>> IMO the timeout approach would be more intuitive for users. When it comes
>> to archive recovery, "WAL segment" isn't a standard unit of measure. WAL
>> segment size can differ between clusters, and WAL files can have different
>> amounts of data or take different amounts of time to replay.
>
> How about the amount of WAL bytes fetched from the archive after which
> a standby attempts to connect to primary or enter streaming mode? Of
> late, we've changed some GUCs to represent bytes instead of WAL
> files/segments, see [1].

Well, for wal_keep_size, using bytes makes sense. Given you know how much
disk space you have, you can set this parameter accordingly to avoid
retaining too much WAL for standby servers. For your proposed parameter,
it's not so simple. The same setting could have wildly different timing
behavior depending on the server. I still think that a timeout is the most
intuitive.
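
To put rough numbers on the timing concern, here is a small sketch. The
1 GB threshold and the three replay rates are made-up values for
illustration only, not measurements, and seconds_until_switch is a
hypothetical helper, not anything in PostgreSQL:

```python
# Hypothetical illustration: how long a standby would replay archived WAL
# before reaching a byte-based switch threshold, at different replay rates.

MB = 1024 * 1024


def seconds_until_switch(threshold_bytes: int, replay_bytes_per_sec: float) -> float:
    """Time spent replaying archived WAL before the byte threshold is reached."""
    if replay_bytes_per_sec <= 0:
        raise ValueError("replay rate must be positive")
    return threshold_bytes / replay_bytes_per_sec


# Assumed setting: attempt streaming after 1 GB of WAL fetched from the archive.
threshold = 1024 * MB

# Assumed replay rates (bytes per second) for three imaginary standbys.
rates = {"fast": 200 * MB, "medium": 20 * MB, "slow": 2 * MB}

intervals = {name: seconds_until_switch(threshold, rate) for name, rate in rates.items()}

for name, secs in intervals.items():
    print(f"{name}: about {secs:.1f} s between streaming attempts")
```

Under these assumed rates, the same byte setting gives intervals that
differ by a factor of 100 between the fastest and slowest standby.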

>> So I think it
>> would be difficult for the end user to decide on a value. However, even
>> the timeout approach has this sort of problem. If your parameter is set to
>> 1 minute, but the current archive takes 5 minutes to recover, you won't
>> really be testing streaming replication once a minute. That would likely
>> need to be documented.
>
> If we have configurable WAL bytes instead of timeout for standby WAL
> source switch from archive to primary, we don't have the above problem
> right?

If you are going to stop replaying in the middle of an archived WAL file,
then maybe. But I don't think I'd recommend that.
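
For comparison, the timeout case discussed above can be sketched the same
way. This assumes the timeout is only checked after a whole archived file
has been replayed; that is an assumption for illustration, not a
description of the actual patch, and attempt_times is a hypothetical
helper:

```python
# Hypothetical illustration: when streaming attempts would happen if the
# timeout is only evaluated at archived-file boundaries.

def attempt_times(file_replay_secs, timeout_secs):
    """Return the times (in seconds) at which a streaming attempt would occur,
    assuming the timeout is checked only after each file finishes replaying."""
    attempts = []
    now = 0.0
    last_attempt = 0.0
    for duration in file_replay_secs:
        now += duration  # replay of this file completes
        if now - last_attempt >= timeout_secs:
            attempts.append(now)
            last_attempt = now
    return attempts


# Files that take 5 minutes each, with a 1 minute timeout.
slow_files = attempt_times([300, 300, 300], 60)

# Files that take 20 seconds each, with the same timeout.
quick_files = attempt_times([20] * 6, 60)

print(slow_files)
print(quick_files)
```

With the 5 minute files, attempts land once per file rather than once per
minute, which is the behavior that would need to be documented.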

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
