Re: reorder pg_rewind control file sync

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: reorder pg_rewind control file sync
Date: 2019-03-25 07:14:21
Message-ID: 20190325071421.GF2558@paquier.xyz
Lists: pgsql-hackers

On Sat, Mar 23, 2019 at 06:18:27AM +0100, Fabien COELHO wrote:
> Here it is.

Thanks.

> The attached patch reorders the cluster fsyncing and control file changes in
> "pg_rewind" so that the latter is done after all data are committed to disk,
> so as to reflect the actual cluster status, similarly to what is done by
> "pg_checksums", per discussion in the thread about offline enabling of
> checksums:

It would be an interesting property to verify that a rewind can be
retried on a node which has already been partially rewound, after the
operation failed in the middle. Because that's the real deal here: as
long as we know that the control file is still in its previous state,
we can rely on it for retrying the operation. Logically, I think that
it should work, because we would still try to fetch the same blocks
from the source server: those are found by scanning the target's WAL
records from the last checkpoint before the point where WAL forked up
to the last shutdown checkpoint, and the operation is lossy by design
when it comes to dealing with file differences.
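
As a toy illustration of the ordering argument (this is not pg_rewind
code; the file layout and the names write_and_sync(), toy_rewind() and
file_equals() are all hypothetical), here is a minimal sketch where the
data is flushed first and the control file is only touched last, with
a switch to simulate a failure between the two steps:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write "content" to "path" and fsync it. Returns 0 on success, -1 on
 * error. A fully durable version would also fsync the parent directory;
 * that is left out to keep the sketch short.
 */
static int
write_and_sync(const char *path, const char *content)
{
    size_t      len;
    int         fd;

    assert(path != NULL && content != NULL);
    len = strlen(content);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, content, len) != (ssize_t) len || fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    return close(fd);
}

/*
 * Toy "rewind": rewrite the data file and flush it, then update the
 * control file as the very last step. If fail_before_control is nonzero,
 * return 1 right before the control file update to simulate a crash,
 * leaving the control file in its previous state.
 */
static int
toy_rewind(const char *data_path, const char *control_path,
           int fail_before_control)
{
    if (write_and_sync(data_path, "blocks copied from source\n") != 0)
        return -1;
    if (fail_before_control)
        return 1;               /* simulated crash, control file untouched */
    return write_and_sync(control_path, "rewound\n");
}

/* Return 1 if the file at "path" contains exactly "expected", else 0. */
static int
file_equals(const char *path, const char *expected)
{
    char        buf[128];
    size_t      n;
    FILE       *f = fopen(path, "r");

    if (f == NULL)
        return 0;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return strcmp(buf, expected) == 0;
}
```

Calling toy_rewind() with fail_before_control set leaves the control
file as it was, so a second call without the failure can simply redo
the whole thing. With the opposite order, a crash could leave a control
file claiming a state that the data on disk does not have.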

Have you tried to see if pg_rewind is able to repeat its operation for
specific scenarios? One is for example a database created on the
promoted standby, used as a source, and a second, different database
created on the primary after the standby has been promoted. You could
make the tool exit() before the rewind finishes, just before updating
the control file, and see if the operation is repeatable.
Interrupting the tool would work as well, though that is less controllable.

It would be good to mention in the patch why the order matters.
--
Michael
