Re: [bug fix] pg_rewind creates corrupt WAL files, and the standby cannot catch up the primary

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] pg_rewind creates corrupt WAL files, and the standby cannot catch up the primary
Date: 2018-03-01 08:06:00
Message-ID: 20180301080600.GE1178@paquier.xyz
Lists: pgsql-hackers

On Thu, Mar 01, 2018 at 07:49:06AM +0000, Tsunakawa, Takayuki wrote:
> It's a regret that Chen's patch, which limits the WAL to be copied, is
> not committed yet. It looks good to be ready for committer.

The message I sent explains why it should not be
integrated. In particular, since the prior checkpoint has been
removed in v11, there is always going to be a hole in the WAL segments, as
you need to create a checkpoint on the ex-standby after it has been
promoted so that its control file data is correctly reflected on disk.

> > > Related to this, shouldn't pg_rewind avoid copying more files and
> > > directories like pg_basebackup? Currently, pg_rewind doesn't copy
> > > postmaster.pid, postmaster.opts, and temporary files/directories
> > > (pgsql_tmp/).
> >
> > Yes, it should not copy those files. I have a patch in the current CF to
> > do that:
> > https://commitfest.postgresql.org/17/1507/
>
> Wow, what a great patch. I think I should look at it. But I'm afraid
> it won't be backpatched because it's big...

That's a new feature, so it won't be back-patched anyway.

> Even with your patch and Chen's one, my small patch is probably
> necessary to avoid leaving 0-byte or half-baked files. I'm not sure
> whether those strangely sized files would cause actual trouble, but
> maybe it would be healthy to try to clean things up as much as
> possible. (files in pg_twophase/ might emit WARNING messages, garbage
> server log files might make the DBA worried, etc.; yes, these may be
> just FUD.)

Yeah, I'd like to double-check what you are proposing here anyway.
Sorry, but I do not have an opinion about what you have sent yet :(
The only thing I am sure of, though, is that for HEAD, not copying files
from pg_wal from the origin is the way to go. For back-branches,
that's another story.
--
Michael
