Re: Proposal for 9.1: WAL streaming from WAL buffers

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for 9.1: WAL streaming from WAL buffers
Date: 2010-06-30 02:26:50
Message-ID: AANLkTikfE9Th80v-g3nYEs-JM9c2TSkkkk6iNXhfMZZj@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 29, 2010 at 10:06 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Simon Riggs wrote:
>> On Mon, 2010-06-21 at 18:08 +0900, Fujii Masao wrote:
>>
>> > The problem is not that the master streams non-fsync'd WAL, but that the
>> > standby can replay that. So I'm thinking that we can send non-fsync'd WAL
>> > safely if the standby makes the recovery wait until the master has fsync'd
>> > WAL. That is, walsender sends not only non-fsync'd WAL but also WAL flush
>> > location to walreceiver, and the standby applies only the WAL which the
>> > master has already fsync'd. Thought?
>>
>> Yes, good thought. The patch just applied seems too much.
>>
>> I had the same thought, though it would mean you'd need to send two xlog
>> end locations, one for write, one for fsync. Though not really clear why
>> we send the "current end of WAL on the server" anyway, so maybe we can
>> just alter that.
>
> Is this a TODO?

Maybe. As Heikki pointed out upthread, the standby can't even write
the WAL back to the OS until it's been fsync'd on the master
without risking the problem under discussion. So we can stream the
WAL from master to standby as long as the standby just buffers it in
memory (or somewhere other than the usual location in pg_xlog).
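
To make the buffering idea concrete, here is a minimal sketch in Python.
It is purely illustrative and not PostgreSQL code: the class, its methods,
and the use of plain integers for WAL locations are all hypothetical. It
assumes each streamed message carries the master's current flush location
alongside the WAL data, as Fujii suggested, and the standby hands off only
what falls at or below that location.

```python
class StandbyWalBuffer:
    """Hypothetical in-memory holding area for streamed WAL on the standby."""

    def __init__(self):
        # (end_location, record) pairs in arrival order; nothing here has
        # been written to the standby's pg_xlog yet.
        self._pending = []
        # Highest flush location the master has reported so far.
        self.master_flush_location = 0

    def receive(self, end_location, record, master_flush_location):
        """Buffer a streamed record and note the master's flush location."""
        self._pending.append((end_location, record))
        self.master_flush_location = max(
            self.master_flush_location, master_flush_location
        )

    def release_safe(self):
        """Return records the master has fsync'd; keep the rest buffered."""
        limit = self.master_flush_location
        safe = [rec for loc, rec in self._pending if loc <= limit]
        self._pending = [(loc, rec) for loc, rec in self._pending if loc > limit]
        return safe
```

Only what release_safe() returns would be eligible to be written out or
replayed; everything else stays in memory until a later message raises
the reported flush location.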

Before we get too busy frobnicating this gonkulator, I'd like to see a
little more discussion of what kind of performance people are
expecting from sync rep. Sounds to me like the best we can expect
here is, on every commit: (a) wait for master fsync to complete, (b)
send message to standby, (c) wait for reply from standby
indicating that fsync is complete on standby. Even assuming that the
network overhead is minimal, that halves the commit rate. Are the
people who want sync rep OK with that? Is there any way to do better?
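
A back-of-the-envelope version of that arithmetic, as a small Python
sketch. The function name and the 5 ms fsync figure are assumptions
chosen only for illustration, and the model is deliberately crude: one
backend committing serially, no group commit, and the three steps above
strictly one after another.

```python
def commits_per_second(master_fsync_ms, standby_fsync_ms=0.0, network_rtt_ms=0.0):
    """Commit rate for one serial committer when the steps do not overlap."""
    # Steps (a), (b) and (c) happen in sequence, so their latencies add up.
    per_commit_ms = master_fsync_ms + network_rtt_ms + standby_fsync_ms
    return 1000.0 / per_commit_ms


# Assumed figure: a 5 ms fsync on both machines, negligible network time.
local_only = commits_per_second(5.0)
with_sync_rep = commits_per_second(5.0, standby_fsync_ms=5.0)
print(local_only, with_sync_rep)
```

Under those assumptions the per-commit wait doubles, which is where the
halving comes from; any real network round trip only makes it worse.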

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
