Re: pg_receivewal - couple of improvements

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_receivewal - couple of improvements
Date: 2022-02-03 13:10:55
Message-ID: CALj2ACVpsdjKADxabq4+PDJroVETGD8d6w3OS0d74sZGN0ZvxA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 2, 2022 at 9:28 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Wed, Feb 02, 2022 at 09:14:03PM +0530, Bharath Rupireddy wrote:
> >
> > FYI that thread is closed, it committed the change (f61e1dd [1]) that
> > pg_receivewal can read from its replication slot restart lsn.
> >
> > I know that providing the start pos as an option came up there [2],
> > but I wanted to start the discussion fresh as that thread got closed.
>
> Ah sorry I misunderstood your email.
>
> I'm not sure it's a good idea. If you have missing WALs in your target
> directory but have an alternative backup location, you will have to restore the
> WAL from that alternative location anyway, so I'm not sure how accepting a
> different start position is going to help in that scenario. On the other hand
> allowing a position at the command line can also lead to accepting a bogus
> position, which could possibly make things worse.

Isn't it complex for anyone to go to the archive location? In
production environments that involves extra steps: getting
authentication tokens, searching for the required WAL file,
downloading it, unzipping it, copying it back to the pg_receivewal
node, etc. This is problematic and adds time to bringing pg_receivewal
back up. Instead, if I know the latest checkpoint LSN and the archived
WAL file from the primary, I can just provide the startpos (probably
the last checkpoint LSN) to pg_receivewal so that it continues
receiving WAL records from the primary, avoiding all of the manual
work that I would otherwise have to do.
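
On the concern about accepting a bogus position: the value could at
least be validated before it is used. Below is a minimal sketch, not
taken from the pg_receivewal source. The helper names parse_lsn and
segment_start are hypothetical. It parses the textual hi/lo hex form
of an LSN and rounds it down to the start of its WAL segment, on the
assumption that streaming begins at a segment boundary and that the
segment size is a power of two (16MB by default).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse an LSN written as two hex numbers separated by a slash,
 * e.g. "0/3000028". Returns false on malformed input or trailing junk.
 * Hypothetical helper for illustration only.
 */
static bool
parse_lsn(const char *str, uint64_t *lsn)
{
	unsigned int hi;
	unsigned int lo;
	char		junk;

	/* Exactly two conversions must succeed; a third means trailing junk. */
	if (sscanf(str, "%X/%X%c", &hi, &lo, &junk) != 2)
		return false;

	*lsn = ((uint64_t) hi << 32) | lo;
	return true;
}

/*
 * Round an LSN down to the first byte of its WAL segment.
 * seg_size is assumed to be a power of two.
 */
static uint64_t
segment_start(uint64_t lsn, uint64_t seg_size)
{
	assert(seg_size > 0 && (seg_size & (seg_size - 1)) == 0);
	return lsn - (lsn & (seg_size - 1));
}
```

A check like this only rejects malformed input. It cannot tell whether
the WAL at that position is still available on the primary.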

> > 2) Currently, RECONNECT_SLEEP_TIME is 5sec - but I may want to have
> > more reconnect time as I know that the primary can go down at any time
> > for whatever reasons in production environments which can take some
> > time till I bring up primary and I don't want to waste compute cycles
> > in the node on which pg_receivewal is running
>
> I don't think that attempting a connection is really costly. Also, increasing
> this retry time also increases the amount of time you're not streaming WALs,
> and thus the amount of data you can lose so I'm not sure that's actually a good
> idea. But you might also want to make it more aggressive, so no objection to
> make it configurable.

Yeah, making it configurable helps tune the reconnect time as per the
requirements.
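
A rough sketch of how the value of such an option might be parsed,
assuming it is given in whole seconds and falls back to today's 5
seconds when not specified. The function name
parse_reconnect_interval and the macro DEFAULT_RECONNECT_SLEEP_TIME
are hypothetical; no such option exists in pg_receivewal today.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Current hard-coded behaviour, kept as the default (seconds). */
#define DEFAULT_RECONNECT_SLEEP_TIME 5

/*
 * Parse a reconnect interval in whole seconds.
 * Accepts only a positive decimal integer with no trailing characters;
 * on failure *seconds is left untouched so the caller keeps its default.
 */
static bool
parse_reconnect_interval(const char *arg, int *seconds)
{
	char	   *end;
	long		val;

	assert(arg != NULL && seconds != NULL);

	errno = 0;
	val = strtol(arg, &end, 10);

	/* Reject overflow, empty input, and trailing junk such as "5s". */
	if (errno != 0 || end == arg || *end != '\0')
		return false;

	/* Zero or negative would turn the retry loop into a busy loop. */
	if (val < 1 || val > INT_MAX)
		return false;

	*seconds = (int) val;
	return true;
}
```

The caller would initialise its variable to
DEFAULT_RECONNECT_SLEEP_TIME and overwrite it only when parsing
succeeds.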

Regards,
Bharath Rupireddy.
