Re: pg_rewind, a tool for resynchronizing an old master after failover

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_rewind, a tool for resynchronizing an old master after failover
Date: 2013-05-23 17:51:56
Message-ID: CABOikdMbJwdPsJhbS+1cTDqtC2+S_zedrh5FHzHWr5EUVD-SNg@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 23, 2013 at 11:10 PM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

> On 23.05.2013 07:55, Robert Haas wrote:
>
>> On Thu, May 23, 2013 at 7:10 AM, Heikki Linnakangas
>> <hlinnakangas(at)vmware(dot)com> wrote:
>>
>>> 1. Scan the WAL log of the old cluster, starting from the point where
>>> the new cluster's timeline history forked off from the old cluster. For
>>> each WAL record, make a note of the data blocks that are touched. This
>>> yields a list of all the data blocks that were changed in the old
>>> cluster, after the new cluster forked off.
>>>
>>
>> Suppose that a transaction is open and has written tuples at the point
>> where WAL forks. After WAL forks, the transaction commits. Then, it
>> hints some of the tuples that it wrote. There is no record in WAL
>> that those blocks are changed, but failing to revert them leads to
>> data corruption.
>>
>
> Bummer, you're right. Hmm, if you have checksums enabled, however, we'll
> WAL log a full-page every time a page is dirtied for setting a hint bit,
> which fixes the problem. So, there's a caveat with pg_rewind; you must have
> checksums enabled.
>
>
I was quite impressed with the idea, but hint bits are indeed a problem. I
realised the same issue also applies to the other idea that Fujii-san and
others have suggested, about holding back writes of dirty buffers until
the corresponding WAL has been received at the standby. But since that idea
would need to be implemented in core anyway, we could teach SetHintBits()
to return false unless the corresponding commit WAL records have been
written to the standby first.
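
To make the SetHintBits() suggestion concrete, here is a minimal standalone
sketch of the check I have in mind. It is a toy model, not backend code:
ToyTuple, TOY_HINT_XMIN_COMMITTED, toy_set_hint_bits() and the two WAL
position arguments are all hypothetical names, and I am assuming the caller
can cheaply learn how much WAL the standby has received.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position, modelled as a 64-bit offset. */
typedef uint64_t WalPosition;

/* Hypothetical hint flag; not the real infomask bit value. */
#define TOY_HINT_XMIN_COMMITTED 0x0001

/* Toy tuple header carrying only the hint-bit field. */
typedef struct ToyTuple
{
    uint16_t infomask;
} ToyTuple;

/*
 * Set a hint bit only if the standby already holds the commit record.
 *
 * commit_lsn is the end of the transaction's commit record;
 * standby_received_lsn is how far the standby is known to have received
 * WAL. Both are assumptions of this sketch. Returns false, leaving the
 * tuple untouched, when the commit record has not reached the standby yet.
 */
static bool
toy_set_hint_bits(ToyTuple *tuple, uint16_t hint,
                  WalPosition commit_lsn, WalPosition standby_received_lsn)
{
    if (commit_lsn > standby_received_lsn)
        return false;       /* not safe yet: caller simply skips the hint */

    tuple->infomask |= hint;
    return true;
}
```

Since hint bits are only an optimisation, a false return costs nothing in
correctness; the tuple just gets hinted on a later visit, once the standby
has caught up.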

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
