Re: Standby trying "restore_command" before local WAL

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>
Cc: cyberdemn(at)gmail(dot)com, sk(at)zsrv(dot)org, emre(at)hasegeli(dot)com, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, berge(at)trivini(dot)no, ben(at)gurkan(dot)in, raimund(dot)schlichtiger(at)innogames(dot)com, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, bernhard(dot)schrader(at)innogames(dot)com, Simon Riggs <simon(at)2ndquadrant(dot)com>, vik(at)2ndquadrant(dot)fr
Subject: Re: Standby trying "restore_command" before local WAL
Date: 2018-08-06 17:12:46
Message-ID: 20180806171246.GO27724@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Jaime Casanova (jaime(dot)casanova(at)2ndquadrant(dot)com) wrote:
> On Mon, 6 Aug 2018 at 11:01, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > > What about the following cases?
> > > 1. replica host crashed, and in pg_wal we have a few thousand WAL files.
> >
> > If this is the case then the replica was very far behind on replay,
> > presumably, and in some of those cases rebuilding the replica might
> > very well be faster than replaying all of that WAL. This case does
> > sound like it should be alright though.
>
> it could also be a delayed standby, and in that case we will have in
> the replica lots of valid WALs (delayed apply on purpose, not on master
> anymore); restarting from archive in that case is a poor
> solution...

That's true, but it doesn't change the basic question, which your
response didn't address: if we're going to use the pg_wal directory, we
have to *know* that it is completely valid and can be used (or have some
way of knowing which parts can be used and which can't). I realize
that's not how things work today, and that strikes me as an issue we
need to fix. If we can fix it by deciding that our current checks are
sufficient regardless of what's in pg_wal, then fine, let's always use
pg_wal first and go to restore_command only after we've used as much of
pg_wal as we're able to, but I haven't seen anyone demonstrate that to
be the case so far. If we can come up with some way to know whether
pg_wal is safe, great, then once we've got that implemented we can use
it. If there's no way to do that, then let's push back on the user to
tell us whether it's safe or not.
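
To make the three options concrete, here is a rough sketch of the
decision, written in Python purely for illustration. Nothing in it is
PostgreSQL code: wal_source_order, checks_sufficient, pg_wal_verified
and user_says_safe are hypothetical names, and falling back to
restore_command first when nothing is known is an assumption that
mirrors the order described in this thread's subject.

```python
from enum import Enum
from typing import List, Optional


class WalSource(Enum):
    PG_WAL = "pg_wal"
    ARCHIVE = "restore_command"


def wal_source_order(
    checks_sufficient: bool,
    pg_wal_verified: Optional[bool] = None,
    user_says_safe: Optional[bool] = None,
) -> List[WalSource]:
    """Return the order in which a standby would try WAL sources.

    All parameters are hypothetical inputs for illustration only:
      checks_sufficient: existing checks are proven good enough
                         regardless of what is in pg_wal.
      pg_wal_verified:   result of some (not yet existing) validity
                         check of pg_wal, or None if there is none.
      user_says_safe:    explicit statement from the user, or None.
    """
    local_first = [WalSource.PG_WAL, WalSource.ARCHIVE]
    archive_first = [WalSource.ARCHIVE, WalSource.PG_WAL]

    # Option 1: current checks suffice, so always use pg_wal first.
    if checks_sufficient:
        return local_first

    # Option 2: we have a way to know whether pg_wal is safe.
    if pg_wal_verified is not None:
        return local_first if pg_wal_verified else archive_first

    # Option 3: push the question back to the user.
    if user_says_safe is not None:
        return local_first if user_says_safe else archive_first

    # Assumption: with no information, keep archive-first ordering.
    return archive_first
```

A delayed standby would then be a case where either the verification or
the user's answer says pg_wal is safe, so local WAL is replayed before
the archive is consulted.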

Thanks!

Stephen
