Re: Use pg_rewind when target timeline was switched

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use pg_rewind when target timeline was switched
Date: 2015-09-09 10:13:11
Message-ID: CAPpHfdtLQXKVudoe3EkyL5E0adb1TUVqayz5ba+J8JNNBU7EiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 9, 2015 at 9:01 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

>
>
> On Wed, Sep 9, 2015 at 3:27 AM, Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>
>> On Tue, Sep 8, 2015 at 10:28 AM, Michael Paquier
>> <michael(dot)paquier(at)gmail(dot)com>wrote:
>>
>>> I am planning to do as well a detailed code review rather soon.
>>>
>>
>> Good, thanks.
>>
>
> When testing a bit more complex structures, it occurred to me that we may
> want as well to use as a source node a standby. For example based on the
> case of my cluster above:
> master (5432)
> / \
> 1 (5433) 2 (5434)
> |
> 3 (5435)
> If I try for example to rebuild the cluster as follows there will be
> failures:
> 1) Rewind with source = 3, target = 2
> 2) Start 3 and 2
> 3) Shutdown 2
> 3) Rewind source = 2, target = 1, failure with:
> source data directory must be shut down cleanly
>
> It seems to me that we should allow a node that has been shutdowned in
> recovery to be used as a source for rewind as well, as follows:
> - if (datadir_source && ControlFile_source.state != DB_SHUTDOWNED)
> + if (datadir_source &&
> + ControlFile_source.state != DB_SHUTDOWNED &&
> + ControlFile_source.state != DB_SHUTDOWNED_IN_RECOVERY)
> pg_fatal("source data directory must be shut down
> cleanly\n");
> At least your patch justifies in my eyes such a change.
>

It's source is not required to be √ in recovery if we specify it by
connection string. This is why I think DB_SHUTDOWNED_IN_RECOVERY is OK when
we specify source by data directory. Included in attached patch.

/*
> + * Find minimum from two xlog pointers assuming invalid pointer is
> greatest
> + * possible pointer.
> + */
> +static XLogRecPtr
> +xlPtrMin(XLogRecPtr a, XLogRecPtr b)
> +{
> + if (XLogRecPtrIsInvalid(a))
> + return b;
> + else if (XLogRecPtrIsInvalid(b))
> + return a;
> + else
> + return Min(a, b);
> +}
> This is true as timeline.h tells so, still I think that it would be better
> to mention that this is this assumption is held in this header file, or
> simply that timeline history entries at the top have their end position set
> as InvalidXLogRecPtr which is a synonym of infinity.
>

Agree. Comment is adjusted.

> The code building the target history file is a duplicate of what is done
> with the source. Perhaps we could refactor that as a single routine in
> pg_rewind.c?
>

Do you mean duplication between rewind_parseTimeLineHistory()
and readTimeLineHistory(). We could put readTimeLineHistory() into common
with some refactoring. This is the subject of separate patch, though.

Except that, the patch looks pretty neat to me. I was wondering as well:
> what tests did you run up to now with this patch? I am attaching an updated
> version of my test script I used for some more complex scenarios. Feel free
> to play with it.
>

I was running manual tests like a noob :) When you attached you bash
script, I've switched to it.

A also added additional check in findCommonAncestorTimeline(). Two standbys
could be independently promoted and get the same new timeline ID. Now, it's
checked that timelines, that we assume to be the same, have equal begins.
Begins could still match by coincidence. But the same risk exists in
original pg_rewind, though.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
pg-rewind-target-switch-4.patch application/octet-stream 17.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2015-09-09 11:21:51 Re: Waits monitoring
Previous Message Etsuro Fujita 2015-09-09 10:08:30 Re: Minor code improvements to create_foreignscan_plan/ExecInitForeignScan