Re: run pg_rewind on an uncleanly shut down cluster.

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Oleksii Kliukin <alexk(at)hintbits(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: run pg_rewind on an uncleanly shut down cluster.
Date: 2015-10-06 09:32:15
Message-ID: CAB7nPqTfgJmRREeHWJ1e9+YG9F2SR-VwGfSrspzY053bm1kvHQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 6, 2015 at 6:04 PM, Oleksii Kliukin <alexk(at)hintbits(dot)com> wrote:
> Does pg_rewind actually rely on the cluster being rewound to finish
> recovery?

That's not mandatory AFAIK. I think that Heikki has just implemented
it in the safest way possible for a first shot. That's something we
could relax in the future.

> If not, then it would be a good idea to add a --force flag to make
> pg_rewind ignore the state check, as you suggested in this thread:
> http://www.postgresql.org/message-id/flat/CAF8Q-Gw1HBKzpSEVtotLg=DR+Ee-6q59qQfhY5tor3FYAenyrA(at)mail(dot)gmail(dot)com#CAF8Q-Gw1HBKzpSEVtotLg=DR+Ee-6q59qQfhY5tor3FYAenyrA@mail.gmail.com

Another option would be to replace this check of pg_control with
something closer to what pg_ctl status does with postmaster.pid, for
example, and perhaps to add a safeguard that prevents a concurrent user
from starting the target node once the pg_rewind run has begun.
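
To illustrate the kind of check meant here, a rough Python sketch follows. It
is not pg_rewind or pg_ctl code: the function name postmaster_running is
hypothetical, and it only assumes that the first line of postmaster.pid holds
the PID (negative for a standalone backend) and that a signal-0 probe is
available, so it is POSIX-only.

```python
import os
from pathlib import Path


def postmaster_running(data_dir):
    """Return True if postmaster.pid in data_dir names a live process.

    Hypothetical helper: a sketch of a pg_ctl-status-like check, POSIX only.
    """
    pid_file = Path(data_dir) / "postmaster.pid"
    try:
        first_line = pid_file.read_text().splitlines()[0]
        # A standalone backend records its PID as a negative number.
        pid = abs(int(first_line))
    except (OSError, IndexError, ValueError):
        # Missing, unreadable, empty, or malformed file: treat as not running.
        return False
    if pid == 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only probes for existence, sends nothing
    except ProcessLookupError:
        return False     # stale pid file left behind by a crash
    except PermissionError:
        return True      # the process exists but belongs to another user
    return True
```

A check like this still leaves a window in which someone starts the target
node after the probe, which is why a separate safeguard would be needed.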

> Well, checking the source node looks like an option that does not require
> the DBA to provide any additional information, as the connection string or the
> path to the data dir is already there. It would be nice if pg_rewind could
> fetch WAL from the given restore_command though, or even use the command
> already there in recovery.conf (if the node being recovered is a replica,
> which I guess is a pretty common case).

Kind of, except that we would want the user to be able to pass a custom
restore_command, used by pg_rewind itself, for more flexibility.
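
As a rough sketch of what using a restore_command from a tool would involve
(Python, not pg_rewind code): the placeholders %f (WAL file name), %p
(destination path) and %% (literal percent) are the ones restore_command
documents; the function names and the example paths are hypothetical.

```python
import re
import subprocess


def expand_restore_command(template, wal_file, dest_path):
    """Substitute %f, %p and %% in a restore_command template in one pass."""
    mapping = {"%f": wal_file, "%p": dest_path, "%%": "%"}
    return re.sub(r"%[fp%]", lambda m: mapping[m.group(0)], template)


def fetch_segment(template, wal_file, dest_path):
    """Run the expanded command through the shell; True on exit status 0."""
    cmd = expand_restore_command(template, wal_file, dest_path)
    return subprocess.run(cmd, shell=True).returncode == 0


if __name__ == "__main__":
    # Hypothetical archive location and segment name, for illustration only.
    print(expand_restore_command(
        "cp /mnt/archive/%f %p",
        "000000010000000000000003",
        "pg_xlog/RECOVERYXLOG",
    ))
```

Doing the substitution in a single pass keeps an escaped "%%f" from being
mistaken for the %f placeholder.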

> Anyway, thank you for describing the issue. In my case, it seems I solved it
> by removing the files from the archive_status directory of the former master
> (the node being rewound). This makes PostgreSQL forget that it has to remove
> an already archived (but still required for pg_rewind) segment (I guess it
> does it during stop when the checkpoint is issued). Afterwards, postgres
> is started in single-user mode with archive_command=false and
> archive_mode=on, to make sure no segments are archived/removed, and stopped
> right afterwards with:

Interesting. That's one way to go.

> Afterwards, pg_rewind runs on the cluster without any noticeable issues.
> Since the node is not going to continue as a master and the contents of
> pg_xlog/archive_status are changed by pg_rewind anyway, I don’t think any
> data is lost by the initial removal of the archive_status files.

Yep. Its content is replaced by everything from the source node.
--
Michael
