Re: Proposal for 9.1: WAL streaming from WAL buffers

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for 9.1: WAL streaming from WAL buffers
Date: 2010-06-11 14:31:30
Message-ID: 7262.1276266690@sss.pgh.pa.us
Lists: pgsql-hackers

Fujii Masao <masao(dot)fujii(at)gmail(dot)com> writes:
> In 9.0, walsender reads WAL always from the disk and sends it to the standby.
> That is, we cannot send WAL until it has been written (and flushed) to the disk.

I believe the above statement to be incorrect: walsender does *not* wait
for an fsync to occur.

I agree with the idea of trying to read from WAL buffers instead of the
file system, but the main reason why is that the current behavior makes
FADVISE_DONTNEED for WAL pretty dubious. It'd be a good idea to still
(artificially) limit replication to not read ahead of the written-out
data.
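
A minimal sketch of that limit, assuming WAL positions are plain integers.
The function and parameter names are hypothetical and are not taken from
the walsender code: the sender may read from WAL buffers, but the position
it streams up to is capped at what has already been written out.

```python
def send_limit(sent_upto: int, written_upto: int, buffered_upto: int) -> int:
    """Return the highest WAL position the sender may stream up to.

    All three arguments are hypothetical byte positions in the WAL stream:
      sent_upto     - what has already been sent to the standby
      written_upto  - what the primary has written out to the file system
      buffered_upto - what has been inserted into WAL buffers so far
    """
    # WAL buffers can hold data that has not been written out yet,
    # so cap the readable range at the written-out position.
    limit = min(buffered_upto, written_upto)
    # Never report a limit behind what was already sent.
    return max(limit, sent_upto)
```

With buffers ahead of the written-out position, `send_limit(100, 200, 300)`
stops at the written-out position rather than at the end of the buffered data.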

> ... Since we can write and send WAL simultaneously, in synchronous
> replication, a transaction commit has only to wait for either of them. So the
> performance would significantly increase.

That performance claim, frankly, is ludicrous. There is no way that
round trip network delay plus write+fsync on the slave is faster than
local write+fsync. Furthermore, I would say that you are thinking
exactly backwards about the requirements for synchronous replication:
what that would mean is that transaction commit waits for *both*,
not whichever one finishes first.
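
The arithmetic behind both points can be sketched as follows. This is an
illustration only: the function name, its parameters, and any timings fed
to it are assumptions, not measurements or PostgreSQL code.

```python
def commit_wait_ms(local_flush_ms: float,
                   network_rtt_ms: float,
                   standby_flush_ms: float,
                   wait_for: str = "both") -> float:
    """Estimate how long a commit waits, given hypothetical latencies."""
    # The remote path always pays the network round trip on top of the
    # standby's own write+fsync.
    remote_ms = network_rtt_ms + standby_flush_ms
    if wait_for == "both":
        # Synchronous replication: the commit returns only once the WAL
        # is durable locally and on the standby.
        return max(local_flush_ms, remote_ms)
    if wait_for == "either":
        # The quoted proposal: return as soon as one of the two finishes.
        return min(local_flush_ms, remote_ms)
    raise ValueError("wait_for must be 'both' or 'either'")
```

If the standby's write+fsync costs about the same as the local one, the
remote path is never the faster of the two, so waiting for "either" just
returns the local time and gains nothing, while waiting for "both" costs
the full remote time.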

regards, tom lane
