Re: Strange issues with 9.2 pg_basebackup & replication

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Thom Brown <thom(at)linux(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Strange issues with 9.2 pg_basebackup & replication
Date:	2012-05-16 23:29:00
Message-ID:	CAHGQGwF86Mbp7z9Heuc93hWsrJ46JMvcLikDXc6O0adXG0V=+w@mail.gmail.com
Lists:	pgsql-hackers

On Thu, May 17, 2012 at 1:07 AM, Thom Brown <thom(at)linux(dot)com> wrote:
> On 16 May 2012 11:36, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> On Wed, May 16, 2012 at 2:29 AM, Thom Brown <thom(at)linux(dot)com> wrote:
>>> On 15 May 2012 13:15, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>>> On Wed, May 16, 2012 at 1:36 AM, Thom Brown <thom(at)linux(dot)com> wrote:
>>>>> However, this isn't true when I restart the standby. I've been
>>>>> informed that this should work fine if a WAL archive has been
>>>>> configured (which should be used anyway).
>>>>
>>>> The WAL archive should be shared by master-replica and replica-replica,
>>>> and recovery_target_timeline should be set to latest in replica-replica.
>>>> If you configure that way, replica-replica would successfully reconnect to
>>>> master-replica with no need to restart it.
>>>
>>> I had set the archive_command on the primary, then produced a base
>>> backup which would have copied the archive settings, but I also added
>>> a corresponding restore_command setting, so everything was pointing
>>> at the same archive.
>>
>> Hmm.. when doing the same, the replica-replica successfully reconnected
>> to the master-replica after I shutdown the master-master and promoted the
>> master-replica. archive_command is the same in three servers,
>> restore_command is the same in two standby servers (i.e., master-replica
>> and replica-replica), and recovery_target_timeline is set to 'latest' in two
>> standby servers.
>
> I didn't shut down the master-master, but I didn't expect to need to.
>
> I also had recovery_target_timeline set to latest. I also tried
> explicitly setting it to the new timeline, and got an error saying
> there was no such timeline.

What did the replica-replica do after you got that error? Did it keep
repeating the error, emit a PANIC and exit, get stuck, or successfully
reconnect to the master-replica?

In theory, the timeline gap should be resolved as follows:

1. promote master-replica, which terminates cascade replication.
2. while replica-replica keeps retrying its connection to master-replica,
   it checks the archive for a new timeline history file and, if it finds
   one, adjusts its timeline to the new one.
3. as a result of the promotion, master-replica increments its timeline,
   creates the timeline history file and archives it.
4. finally, replica-replica finds the new timeline history file in the
   archive, adjusts its timeline to the new one, and successfully
   reconnects to master-replica.

Note that, because of timing, you might see the timeline mismatch error
several times before replication restarts successfully: replica-replica
may try to reconnect before the new timeline history file has reached
the archive.
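
To illustrate the timing described above, here is a toy model in Python.
It is only a sketch and not PostgreSQL's actual code: the function names
(latest_timeline, try_reconnect, demo) and the temporary archive directory
are hypothetical. The only PostgreSQL detail it relies on is that timeline
history files are named with eight hex digits plus ".history". The first
reconnect attempt fails with a timeline mismatch because the history file
is not archived yet; the second succeeds once it is.

```python
import os
import re
import tempfile

# Timeline history files are named like "00000002.history" (8 hex digits).
HISTORY_RE = re.compile(r"^([0-9A-F]{8})\.history$")


def latest_timeline(archive_dir, current_tli):
    """Return the highest timeline ID that has a history file in
    archive_dir, or current_tli if nothing newer is archived."""
    newest = current_tli
    for name in os.listdir(archive_dir):
        match = HISTORY_RE.match(name)
        if match:
            newest = max(newest, int(match.group(1), 16))
    return newest


def try_reconnect(archive_dir, standby_tli, upstream_tli):
    """Model one reconnect attempt by the cascaded standby.

    The standby first looks in the archive for a newer timeline and
    adopts it, then "connects" only if its timeline matches upstream.
    Returns (connected, standby_tli_after_attempt).
    """
    standby_tli = latest_timeline(archive_dir, standby_tli)
    return standby_tli == upstream_tli, standby_tli


def demo():
    with tempfile.TemporaryDirectory() as archive:
        # The upstream standby was just promoted: timeline 1 -> 2.
        standby_tli, upstream_tli = 1, 2

        # Attempt 1: history file not archived yet, so a mismatch.
        first = try_reconnect(archive, standby_tli, upstream_tli)

        # The promoted server archives its timeline history file.
        open(os.path.join(archive, "00000002.history"), "w").close()

        # Attempt 2: the standby finds the file and switches timelines.
        second = try_reconnect(archive, first[1], upstream_tli)
        return first, second
```

Calling demo() returns the result of both attempts, so the mismatch on
the first try and the successful switch on the second can be inspected.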

>
>>> But in any case, shouldn't the replication connection be
>>> terminated when pg_basebackup is terminated?
>>
>> +1 To do this, we would need to define SIGINT signal handler and make it
>> send QueryCancel packet when Ctrl-C is typed.
>
> Also could we provide some feedback when using the -c spread option,
> when there isn't progress within a short period of time? Something
> like "Waiting for checkpoint. This can take up to
> %checkpoint_timeout%", or something similar, rather than seeing
> nothing happening and wondering if something has gone wrong.

+1, at least for the case where the -P option is specified in pg_basebackup.

Regards,

--
Fujii Masao

In response to

Re: Strange issues with 9.2 pg_basebackup & replication at 2012-05-16 16:07:16 from Thom Brown
