Re: write ahead logging in standby (streaming replication)

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: write ahead logging in standby (streaming replication)
Date: 2009-11-12 12:45:35
Message-ID: 3f0b79eb0911120445h6bf69c4dlbf31e3b39ca2c36a@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 12, 2009 at 6:27 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> I agree with you, though it has taken some time to understand what you
> said and at first my reaction was to disagree. I think the responses you
> got on this are because you dived straight in with a question before
> explaining other things around this.

Thanks for clarifying this topic ;)

> If recovery starts reading WAL records that have not been fsynced then
> we may need to flush a shared buffer to disk that depends upon a
> non-fsynced(yet) WAL record. Fsyncing WAL after *every* WAL record is
> going to make performance suck even worse and is completely out of the
> question. So implementing the fsync-WAL-before-buffer-flush rule during
> recovery makes much more sense. It's also only small change during
> XlogFlush().

Agreed. This approach has less impact on performance.
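
To make the rule concrete, here is a minimal sketch in Python. It is a toy model, not PostgreSQL code: the class and function names (StandbyWalModel, PageBuffer, replay, flush_buffer) are hypothetical, and flush_up_to() only stands in for the role XLogFlush() plays. It assumes one fsync makes every record received so far durable.

```python
class StandbyWalModel:
    """Toy model of standby WAL: records are received first, fsynced lazily."""

    def __init__(self):
        self.received_lsn = 0  # highest LSN received (not necessarily durable)
        self.flushed_lsn = 0   # highest LSN known to be fsynced
        self.fsync_count = 0   # how many (simulated) fsyncs were issued

    def receive(self, lsn):
        # Receiving a record does NOT fsync it.
        self.received_lsn = max(self.received_lsn, lsn)

    def flush_up_to(self, lsn):
        """Stand-in for XLogFlush(): fsync only if lsn is not yet durable."""
        if lsn <= self.flushed_lsn:
            return  # already durable, nothing to do
        # Assumption: a single fsync covers everything received so far.
        self.flushed_lsn = self.received_lsn
        self.fsync_count += 1


class PageBuffer:
    """Toy shared buffer that remembers the LSN of the last record applied."""

    def __init__(self):
        self.page_lsn = 0
        self.dirty = False


def replay(buf, lsn):
    # Applying a WAL record dirties the page and stamps it with that LSN.
    buf.page_lsn = lsn
    buf.dirty = True


def flush_buffer(wal, buf):
    # The rule: WAL up to the page's LSN must be durable before the page is.
    if not buf.dirty:
        return
    wal.flush_up_to(buf.page_lsn)
    assert wal.flushed_lsn >= buf.page_lsn
    buf.dirty = False  # the page itself would be written out here
```

In this model, replaying many records costs no fsync at all; the fsync happens only when a dirty page whose LSN is not yet durable has to be written out.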

But, as I said in my first post on this thread, even such an infrequent
fsync-WAL-before-buffer-flush might cause a response time spike on the
primary, because the walreceiver must sleep during that fsync. I think
that leaving the WAL-writing work to another process such as the
walwriter is a good idea for further reducing the impact on the
walreceiver. In the typical case:

* The walreceiver receives WAL records, returns the ACK to the primary,
saves them in the wal_buffers, and notifies the startup process of
their arrival.

* The walwriter writes and fsyncs the WAL records in the wal_buffers.

* The startup process applies the WAL records in the wal_buffers
when it is notified of their arrival.

* The startup process and bgwriter fsync the WAL before flushing a
buffer.
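
The division of labor in the list above can be sketched as a small, single-threaded simulation in Python. This is only an illustration under stated assumptions, not PostgreSQL code: StandbyPipeline and all of its methods are hypothetical names, LSNs are simply positions in a list, and the bgwriter is folded into the same "startup" path for brevity. The point it shows is that the walreceiver never waits on an fsync.

```python
from collections import deque


class StandbyPipeline:
    """Toy model of the proposed split between standby processes."""

    def __init__(self):
        self.wal_buffers = []   # received records; LSN = position in list
        self.notices = deque()  # arrival notices for the startup process
        self.flushed_lsn = 0    # highest LSN that has been fsynced
        self.applied_lsn = 0    # highest LSN that has been replayed
        self.fsyncs_by = {"walreceiver": 0, "walwriter": 0, "startup": 0}

    def _fsync(self, who):
        # One simulated fsync makes all buffered records durable.
        if self.flushed_lsn < len(self.wal_buffers):
            self.flushed_lsn = len(self.wal_buffers)
            self.fsyncs_by[who] += 1

    def walreceiver_receive(self, record):
        # Save the record, notify the startup process, and ACK right away.
        self.wal_buffers.append(record)
        lsn = len(self.wal_buffers)
        self.notices.append(lsn)
        return lsn  # the ACK sent back to the primary, without any fsync

    def walwriter_cycle(self):
        # Background write + fsync of whatever is in wal_buffers.
        self._fsync("walwriter")

    def startup_apply(self):
        # Replay every record whose arrival has been announced.
        while self.notices:
            self.applied_lsn = self.notices.popleft()

    def flush_dirty_buffer(self, page_lsn):
        # WAL-before-data: force an fsync only if the walwriter is behind.
        if page_lsn > self.flushed_lsn:
            self._fsync("startup")
```

If the walwriter keeps up, flush_dirty_buffer() finds the WAL already durable and the forced fsync becomes the rare case; either way the cost is kept off the walreceiver.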

Of course, since this approach is too complicated, it is out of scope
for v8.5 development.

> But I also agree with Heikki. Let's plan to do this later in this
> release.

Okay. I will not implement anything around this topic until the core
part of asynchronous replication has been committed.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
