Re: Use pg_rewind when target timeline was switched

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use pg_rewind when target timeline was switched
Date: 2015-09-08 07:28:02
Message-ID: CAB7nPqStazE2SPt0jkXcByP_FVdykMNVf7A7B_uFtaXOM4SfqQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 8, 2015 at 1:14 AM, Alexander Korotkov wrote:
> On Thu, Aug 20, 2015 at 9:57 AM, Michael Paquier wrote:
>> The code above looks correct to me when scanning the WAL history onwards
>> though, which is what is done when extracting the page map, but not
>> backwards when we try to find the last common checkpoint record. This code
>> actually fails trying to open 2/0/2 that does not exist in the promoted
>> standby's pg_xlog in my test case.
>>
>> Attached is a small script I have used to reproduce the failure.
>
>
> Right, thanks! It should be fixed in the attached version of patch.

So, instead of a code review, I have been torturing your patch and ran
more advanced tests on it with the attached script, which creates a
cluster as follows:
        master (5432)
        /          \
   1 (5433)     2 (5434)
      |
   3 (5435)
Once the cluster is created, the nodes are promoted in a certain order,
giving them different timeline jump properties:
- master, stays on tli 1
- standby 1, tli 1->2
- standby 2, tli 1->3
- standby 3, tli 1->2->4
Data is then inserted on each of them so that the WAL forks properly.
Finally, the script tries to rewind one node using another node as the
source, and then tries to link this target node back to the source
node via streaming replication.
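
As a rough illustration of the layout above (this is not part of the
attached script), the timeline histories can be modeled as plain lists
and every ordered target/source pair enumerated together with the last
timeline the two nodes share. The node names and the helper functions
`last_common_timeline` and `rewind_pairs` are hypothetical, and the
sketch ignores the switch-point LSNs that a real timeline history also
records:

```python
from itertools import permutations

# Timeline histories as listed above (node name -> timelines traversed).
# The node names are illustrative labels, not identifiers from the script.
HISTORIES = {
    "master": [1],
    "standby1": [1, 2],
    "standby2": [1, 3],
    "standby3": [1, 2, 4],
}


def last_common_timeline(target, source):
    """Return the last timeline present at the same position in both
    histories, or None if they share nothing."""
    common = None
    for a, b in zip(target, source):
        if a != b:
            break
        common = a
    return common


def rewind_pairs(histories):
    """Yield (target, source, last shared timeline) for every ordered pair."""
    for target, source in permutations(histories, 2):
        yield target, source, last_common_timeline(
            histories[target], histories[source]
        )


if __name__ == "__main__":
    for target, source, tli in rewind_pairs(HISTORIES):
        print(f"rewind {target} using {source}: last shared timeline {tli}")
```

With four nodes this gives twelve ordered pairs. Only standby 1 and
standby 3 share anything beyond timeline 1.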

I have tested all the possible one-to-one permutations in the structure
above (see the commented portions at the bottom of my script), and all of
them worked. I have to say that from the testing perspective this
patch looks to be in great shape, and it will greatly improve the use
cases of pg_rewind!

I am also planning to do a detailed code review rather soon.
Regards,
--
Michael

Attachment Content-Type Size
rewind_test.bash application/octet-stream 3.9 KB
