Re: Use pg_rewind when target timeline was switched

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use pg_rewind when target timeline was switched
Date: 2015-09-09 06:01:36
Message-ID: CAB7nPqTRJ1Gt=x_=L8LXYXaokuDYpukqBqxSn8eXNwJ+ezfD8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 9, 2015 at 3:27 AM, Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:

> On Tue, Sep 8, 2015 at 10:28 AM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com>wrote:
>
>> I am planning to do as well a detailed code review rather soon.
>>
>
> Good, thanks.
>

When testing a bit more complex structures, it occurred to me that we may
want as well to use as a source node a standby. For example based on the
case of my cluster above:
master (5432)
/ \
1 (5433) 2 (5434)
|
3 (5435)
If I try for example to rebuild the cluster as follows there will be
failures:
1) Rewind with source = 3, target = 2
2) Start 3 and 2
3) Shutdown 2
3) Rewind source = 2, target = 1, failure with:
source data directory must be shut down cleanly

It seems to me that we should allow a node that has been shutdowned in
recovery to be used as a source for rewind as well, as follows:
- if (datadir_source && ControlFile_source.state != DB_SHUTDOWNED)
+ if (datadir_source &&
+ ControlFile_source.state != DB_SHUTDOWNED &&
+ ControlFile_source.state != DB_SHUTDOWNED_IN_RECOVERY)
pg_fatal("source data directory must be shut down
cleanly\n");
At least your patch justifies in my eyes such a change.

/*
+ * Find minimum from two xlog pointers assuming invalid pointer is greatest
+ * possible pointer.
+ */
+static XLogRecPtr
+xlPtrMin(XLogRecPtr a, XLogRecPtr b)
+{
+ if (XLogRecPtrIsInvalid(a))
+ return b;
+ else if (XLogRecPtrIsInvalid(b))
+ return a;
+ else
+ return Min(a, b);
+}
This is true as timeline.h tells so, still I think that it would be better
to mention that this is this assumption is held in this header file, or
simply that timeline history entries at the top have their end position set
as InvalidXLogRecPtr which is a synonym of infinity.

The code building the target history file is a duplicate of what is done
with the source. Perhaps we could refactor that as a single routine in
pg_rewind.c?

Except that, the patch looks pretty neat to me. I was wondering as well:
what tests did you run up to now with this patch? I am attaching an updated
version of my test script I used for some more complex scenarios. Feel free
to play with it.
--
Michael

Attachment Content-Type Size
rewind_test.bash application/octet-stream 4.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Haribabu Kommi 2015-09-09 06:17:54 Re: Parallel Seq Scan
Previous Message Andrew Dunstan 2015-09-09 04:23:07 Re: Counting lines correctly in psql help displays