Re: Proposal for 9.1: WAL streaming from WAL buffers

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for 9.1: WAL streaming from WAL buffers
Date: 2010-06-21 09:08:57
Message-ID: AANLkTikXyvTatNV4-6gTP7VDsaH1Icege-SsYr051FNl@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 16, 2010 at 5:06 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jun 15, 2010 at 3:57 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> I wonder if it would be possible to jigger things so that we send the
>>> WAL to the standby as soon as it is generated, but somehow arrange
>>> things so that the standby knows the last location that the master has
>>> fsync'd and never applies beyond that point.
>>
>> I can't think of any way which would not require major engineering.  And
>> you'd be slowing down replication *in general* to deal with a fairly
>> unlikely corner case.
>>
>> I think the panic is the way to go.
>
> I have yet to convince myself of how likely this is to occur.  I tried
> to reproduce this issue by crashing the database, but I think in 9.0
> you need an actual operating system crash to cause this problem, and I
> haven't yet set up an environment in which I can repeatedly crash the
> OS.  I believe, though, that in 9.1, we're going to want to stream
> from WAL buffers as proposed in the patch that started out this
> thread, and then I think this issue can be triggered with just a
> database crash.
>
> In 9.0, I think we can fix this problem by (1) only streaming WAL that
> has been fsync'd and (2) PANIC-ing if the problem occurs anyway.  But
> in 9.1, with sync rep and the performance demands that entails, I
> think that we're going to need to rethink it.

The problem is not that the master streams non-fsync'd WAL, but that the
standby can replay it. So I think we can safely send non-fsync'd WAL if
the standby makes recovery wait until the master has fsync'd that WAL.
That is, walsender sends walreceiver not only the non-fsync'd WAL but also
the master's WAL flush location, and the standby applies only the WAL that
the master has already fsync'd. Thoughts?
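
To make the idea concrete, here is a minimal sketch of the standby-side
rule, not actual PostgreSQL code. It assumes WAL locations can be modelled
as plain integers, and the names (Standby, on_message, replay) are
hypothetical, chosen only for illustration. Each message from walsender
carries the end of the WAL it sent plus the master's current flush
location, and replay never advances past the smaller of the two.

```python
from dataclasses import dataclass


@dataclass
class Standby:
    """Toy model of a standby; WAL locations are plain integers."""
    received_upto: int = 0        # end of WAL received from the master
    master_flushed_upto: int = 0  # last flush location reported by the master
    replayed_upto: int = 0        # end of WAL already applied

    def on_message(self, wal_end: int, master_flush: int) -> None:
        # walsender ships WAL (possibly not yet fsync'd on the master)
        # together with the master's flush location.
        self.received_upto = max(self.received_upto, wal_end)
        self.master_flushed_upto = max(self.master_flushed_upto, master_flush)

    def replay(self) -> int:
        # Apply only WAL that is both received and fsync'd on the master.
        limit = min(self.received_upto, self.master_flushed_upto)
        self.replayed_upto = max(self.replayed_upto, limit)
        return self.replayed_upto


standby = Standby()
standby.on_message(wal_end=300, master_flush=100)
first = standby.replay()    # recovery waits at the master's flush location
standby.on_message(wal_end=300, master_flush=250)
second = standby.replay()   # advances once the master reports a later flush
```

In this model the standby holds WAL beyond the master's flush location in
hand but leaves it unapplied, so if the master crashes and loses that WAL,
the standby has not replayed anything the master does not have.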

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
