Re: Use pg_rewind when target timeline was switched

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use pg_rewind when target timeline was switched
Date: 2015-09-16 16:47:41
Message-ID: CAPpHfdtuXTmSNrB2q+MyhB2G5BaCUYB0QtihNNtRV3cj1gMn9A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 10, 2015 at 8:33 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Wed, Sep 9, 2015 at 7:13 PM, Alexander Korotkov wrote:
> > On Wed, Sep 9, 2015 at 9:01 AM, Michael Paquier wrote:
> >> The code building the target history file is a duplicate of what is done
> >> with the source. Perhaps we could refactor that as a single routine in
> >> pg_rewind.c?
> >
> >
> > Do you mean duplication between rewind_parseTimeLineHistory() and
> > readTimeLineHistory(). We could put readTimeLineHistory() into common
> with
> > some refactoring. This is the subject of separate patch, though.
>
> Well, there is that :) But I was referring to this part (beware of
> useless newlines in your code btw):
> + if (targettli == 1)
> {
> - TimeLineHistoryEntry *entry = &sourceHistory[i];
> + targetHistory = (TimeLineHistoryEntry *)
> pg_malloc(sizeof(TimeLineHistoryEntry));
> + targetHistory->tli = targettli;
> + targetHistory->begin = targetHistory->end =
> InvalidXLogRecPtr;
> + targetNentries = 1;
> +
> + }
> Your patch generates a timeline history array for the target node, and
> HEAD does exactly the same for the source node, so IMO your patch is
> duplicating code, the only difference being that fetchFile is used for
> the source history file and slurpFile is used for the target history
> file. Up to now the code duplication caused by the difference that the
> target is always a local instance, and the source may be either local
> or remote has created the interface that we have now, but I think that
> we should refactor fetch.c such as the routines it contains do not
> rely anymore on datadir_source, and use instead a datadir argument. If
> datadir is NULL, the remote code path is used. If it is not NULL,
> local code path is used. This way we can make the target node use
> fetchFile only, and remove the duplication your patch is adding, as
> well as future ones like that. Thoughts?
>

OK, I get it. I tried to get rid of code duplication in the attached patch.
There is getTimelineHistory() function which gets control file as argument
and fetches history from appropriate path.

> >> Except that, the patch looks pretty neat to me. I was wondering as well:
> >> what tests did you run up to now with this patch? I am attaching an
> updated
> >> version of my test script I used for some more complex scenarios. Feel
> free
> >> to play with it.
> >
> > I was running manual tests like a noob :) When you attached you bash
> script,
> > I've switched to it.
>
> [product_placement]The patch to improve the recovery test suite
> submitted in this CF would have eased the need of bash-ing test cases
> here, and we could have test cases directly included in your
> patch.[/product_placement]
>
> > A also added additional check in findCommonAncestorTimeline(). Two
> standbys
> > could be independently promoted and get the same new timeline ID. Now,
> it's
> > checked that timelines, that we assume to be the same, have equal begins.
> > Begins could still match by coincidence. But the same risk exists in
> > original pg_rewind, though.
>
> Really? pg_rewind blocks attempts to rewind two nodes that are already
> on the same timeline before checking if their timeline history map at
> some point or not:
> /*
> * If both clusters are already on the same timeline, there's
> nothing to
> * do.
> */
> if (ControlFile_target.checkPointCopy.ThisTimeLineID ==
> ControlFile_source.checkPointCopy.ThisTimeLineID)
> pg_fatal("source and target cluster are on the same
> timeline\n");
> And this seems really justified to me. Imagine that you have one
> master, with two standbys linked to it. If both standbys are promoted
> to the same timeline, you could easily replug them to the master, but
> I fail to see the point to be able to replug one promoted standby with
> the other in this case: those nodes have segment and history files
> that map with each other, an operator could easily mess up things in
> such a configuration.
>

Imagine following configuration of server.
1
/ \
2 3
|
4

Initially, node 1 is master, nodes 2 and 3 are standbys for node 1. node 4
is cascaded standby for node 3.
Then node 2 and node 3 are promoted. They both got timeline number 2. Then
node 3 is promoted and gets timeline number 3.
Then we could try to rewind node 4 with node 2 as source. How pg_rewind
could know that timeline number 2 for those nodes is not the same?
We can only check that those timelines are forked from timeline 1 at the
same place. But coincidence is still possible.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
pg-rewind-target-switch-5.patch application/octet-stream 17.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2015-09-16 17:01:41 Re: statistics for array types
Previous Message Peter Geoghegan 2015-09-16 16:45:40 Re: BUFFER_LOCK_* synonyms