Re: Switching XLog source from archive to streaming when primary available

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, cary(dot)huang(at)highgo(dot)ca, pgsql-hackers(at)lists(dot)postgresql(dot)org, satyanarlapuram(at)gmail(dot)com
Subject: Re: Switching XLog source from archive to streaming when primary available
Date: 2022-09-12 03:33:02
Message-ID: CALj2ACV0ZkXggzMrfx6pi_MGTR-rGOY3AOUwvUP1pg=iE6+LwA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 10, 2022 at 3:35 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> Well, for wal_keep_size, using bytes makes sense. Given you know how much
> disk space you have, you can set this parameter accordingly to avoid
> retaining too much of it for standby servers. For your proposed parameter,
> it's not so simple. The same setting could have wildly different timing
> behavior depending on the server. I still think that a timeout is the most
> intuitive.

Hm. In v3 patch, I've used the timeout approach, but tracking the
duration server spends in XLOG_FROM_ARCHIVE as opposed to tracking
last failed time in streaming from primary.

> So I think it
> would be difficult for the end user to decide on a value. However, even
> the timeout approach has this sort of problem. If your parameter is set to
> 1 minute, but the current archive takes 5 minutes to recover, you won't
> really be testing streaming replication once a minute. That would likely
> need to be documented.

Added a note in the docs.

On Fri, Sep 9, 2022 at 10:46 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> Being late for the party.

Thanks for reviewing this.

> It seems to me that the function is getting too long. I think we
> might want to move the core part of the patch into another function.

Yeah, the WaitForWALToBecomeAvailable() (without this patch) has
around 460 LOC out of which WAL fetching from the chosen source is of
240 LOC, IMO, this code will be a candidate for a new function. I
think that part can be discussed separately.

Having said that, I moved the new code to a new function.

> I think it might be better if intentionalSourceSwitch doesn't need
> lastSourceFailed set. It would look like this:
>
> > if (lastSourceFailed || switchSource)
> > {
> > if (nonblocking && lastSourceFailed)
> > return XLREAD_WOULDBLOCK;

I think the above looks good, done that way in the latest patch.

> I don't think the flag first_time is needed.

Addressed this in the v4 patch.

Please review the attached v4 patch addressing above review comments.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v4-0001-Allow-standby-to-switch-WAL-source-from-archive-t.patch application/octet-stream 17.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2022-09-12 04:03:24 Re: list of acknowledgments for PG15
Previous Message Michael Paquier 2022-09-12 02:53:14 Re: pg15b4: FailedAssertion("TransactionIdIsValid(xmax)