Re: Synchronous replication, reading WAL for sending

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Pavan Deolasee" <pavan(dot)deolasee(at)enterprisedb(dot)com>
Subject: Re: Synchronous replication, reading WAL for sending
Date: 2008-12-24 05:34:49
Message-ID: 2e78013d0812232134j1e4d2474n28b2c294fe2c41f8@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 23, 2008 at 9:12 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> As the patch stands, whenever XLOG segment is switched in XLogInsert, we
> wait for the segment to be sent to the standby server. That's not good.
> Particularly in asynchronous mode, you'd expect the standby to not have any
> significant ill effect on the master. But in case of a flaky network
> connection, or a busy or dead standby, it can take a long time for the
> standby to respond, or the primary to give up. During that time, all WAL
> insertions on the primary are blocked. (How long is the default TCP timeout
> again?)
>
> Another point is that in the future, we really shouldn't require setting up
> archiving and file-based log shipping using external scripts, when all you
> want is replication. It should be enough to restore a base backup on the
> standby, and point it to the IP address of the primary, and have it catch
> up. This is very important, IMHO. It's quite a lot of work to set up
> archiving and log-file shipping, for no obvious reason. It's really only
> needed at the moment because we're building this feature from spare parts.
>

I had similar suggestions when I first wrote the high level design doc.
From the wiki page:

- WALSender reads from WAL buffers and/or WAL files and sends the
buffers to WALReceiver. In phase one, we may assume that WALSender can
only read from WAL buffers and WAL files in pg_xlog directory. Later
on, this can be improved so that WALSender can temporarily restore
archived files and read from that too.

I am not sure whether we must support archived files, but I agree
that supporting at least the files in pg_xlog will be necessary if we
want seamless catch-up after a restart.

> For those reasons, we need a way to send arbitrary ranges of WAL from
> primary to standby. The current method where the WAL is read from
> wal_buffers obviously only works for very recent WAL pages that are still in
> wal_buffers. The design should be changed so that instead of reading from
> wal_buffers, the WAL is read from filesystem.
>
> Sending directly from wal_buffers can be provided as a fastpath when sending
> recent enough WAL range, but I wouldn't bother complicating the code for
> now.
>

How would that work for sync replication? Or are you suggesting that
the WAL is first written to disk and then read back again to be sent
to the standby? I think reading from files is additional work in the
sync path when we already have access to the WAL buffers.
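
To make the trade-off concrete, here is a minimal sketch of the fastpath
idea: serve a requested WAL range from memory when it is still buffered,
and fall back to the flushed file otherwise. It is a toy model in Python,
not PostgreSQL code and not taken from the patch. WalStore, insert, flush
and read_range are hypothetical names, and WAL positions are plain byte
offsets.

```python
class WalStore:
    """Toy model of a WAL stream (hypothetical, for illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity   # bytes kept in memory (stands in for wal_buffers)
        self.disk = bytearray()    # flushed WAL (stands in for files in pg_xlog)
        self.buf = bytearray()     # most recent WAL bytes
        self.buf_start = 0         # WAL position of buf[0]

    def flush(self):
        """Append any buffered bytes that are not yet on disk."""
        flushed = len(self.disk)
        self.disk += self.buf[flushed - self.buf_start:]

    def insert(self, data):
        """Append WAL bytes and return the new end position."""
        self.buf += data
        end = self.buf_start + len(self.buf)
        excess = len(self.buf) - self.capacity
        if excess > 0:
            self.flush()           # never evict bytes that are not on disk
            del self.buf[:excess]
            self.buf_start += excess
        return end

    def read_range(self, start, end):
        """Return (source, bytes) for the WAL range [start, end)."""
        buf_end = self.buf_start + len(self.buf)
        if start >= self.buf_start and end <= buf_end:
            # Fastpath: the whole range is still in memory.
            off = start - self.buf_start
            return "buffers", bytes(self.buf[off:off + (end - start)])
        if end <= len(self.disk):
            # Slow path: the range was evicted but has been flushed.
            return "file", bytes(self.disk[start:end])
        raise ValueError("range not available in buffers or flushed files")


store = WalStore(capacity=8)
store.insert(b"AAAA")
store.insert(b"BBBB")
print(store.read_range(4, 8))   # recent range, still buffered
store.insert(b"CCCC")           # evicts the oldest four bytes
print(store.read_range(0, 4))   # old range, only on disk now
```

In this model the sync path never touches the file as long as the sender
keeps up with the buffer, while a standby that is catching up reads older
ranges from disk. A range that straddles evicted and unflushed bytes is
rejected here for brevity; a real sender would have to stitch the two
sources together or wait for the flush.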

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
