Re: Re: Synch Rep: direct transfer of WAL file from the primary to the standby

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Synch Rep: direct transfer of WAL file from the primary to the standby
Date: 2009-07-08 06:04:15
Message-ID: 3f0b79eb0907072304h70a969d7m3f150a775414c934@mail.gmail.com
Lists: pgsql-hackers

Hi,

Thanks for the brilliant comments!

On Wed, Jul 8, 2009 at 4:00 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> There are still some interesting questions in this about exactly how you
>> switch over from "catchup mode" to following the live WAL broadcast.
>> With the above design it would be the master's responsibility to manage
>> that, since presumably the requested start position will almost always
>> be somewhat behind the live end of WAL.  It might be nicer to push that
>> complexity to the slave side, but then you do need two data paths
>> somehow (ie, retrieving the slightly-stale WAL is separated from
>> tracking live events).  Which is what you're saying we should avoid,
>> and I do see the point there.
>
> Yeah, that logic belongs to the master.
>
> We'll want to send message from the master to the slave when the catchup
> is done, so that the slave knows it's up-to-date. For logging, if for no
> other reason.

This seems to be the main difference between us. You and Tom think
that the catchup (transferring the old XLOG files) and WAL streaming
(shipping the latest XLOG records continuously) should be performed
serially over the same connection. I think they should be performed
in parallel over more than one connection. I'd like to build consensus
on which design should be chosen. If my design is worse, I'll change
the patch to follow the other design.
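
For illustration only, the serial, single-connection design could be
sketched roughly as below. This is not code from any patch: the names
serve_standby, wal, start and live_end are hypothetical, WAL positions
are simplified to list indexes, and the "caught_up" message models the
notification Heikki mentions above.

```python
def serve_standby(wal, start, live_end):
    """Sketch of the primary's side of one connection (hypothetical names).

    wal      -- list of WAL records, indexed by a simplified position
    start    -- position requested by the standby
    live_end -- live end of WAL at the time the standby connected
    """
    # Catchup: ship the slightly stale records the standby asked for.
    for pos in range(start, live_end):
        yield ("wal", wal[pos])
    # Tell the standby it has reached the live end of WAL.
    yield ("caught_up", live_end)
    # Streaming: continue with records written after the connection began.
    for pos in range(live_end, len(wal)):
        yield ("wal", wal[pos])


messages = list(serve_standby(["r0", "r1", "r2", "r3"], start=1, live_end=3))
```

In this model the primary alone decides when the switch from catchup to
streaming happens, which is the responsibility discussed in the quoted
text.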

In my design, WAL streaming is performed between walsender and
walreceiver. In parallel with that, the startup process requests an
old XLOG file from a normal backend if the file is not found during
recovery. Once the startup process has reached the WAL streaming start
position, it's guaranteed that all the XLOG files required for recovery
exist on the standby, which means that it's up-to-date. After that, the
startup process replays only the records shipped by WAL streaming.
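
As a rough sketch of the flow described above (not code from the
patch), the standby side might look like this. All names here (Primary,
Standby, fetch_old_segment, recover) are hypothetical, positions are
simplified to whole segment numbers, and the streamed records are
passed in as a ready-made list, so the real concurrency between
walreceiver and the startup process is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Hypothetical stand-in for the primary: old XLOG files by segment number."""
    segments: dict

    def fetch_old_segment(self, segno):
        # Models a normal backend answering a request for one old XLOG file.
        return list(self.segments[segno])


@dataclass
class Standby:
    primary: Primary
    local_segments: dict = field(default_factory=dict)
    fetched: list = field(default_factory=list)
    replayed: list = field(default_factory=list)

    def recover(self, recovery_start, streaming_start, streamed_records):
        # Catchup: replay every segment before the streaming start position,
        # requesting a segment from the primary only if it is missing locally.
        for segno in range(recovery_start, streaming_start):
            if segno not in self.local_segments:
                self.local_segments[segno] = self.primary.fetch_old_segment(segno)
                self.fetched.append(segno)
            self.replayed.extend(self.local_segments[segno])
        # The streaming start position has been reached, so the standby is
        # up-to-date; from here on only streamed records are replayed.
        self.replayed.extend(streamed_records)
        return self.replayed


primary = Primary({1: ["a", "b"], 2: ["c"], 3: ["d"]})
standby = Standby(primary, local_segments={1: ["a", "b"]})
result = standby.recover(recovery_start=1, streaming_start=4,
                         streamed_records=["e", "f"])
```

Note that nothing in this sketch compares LSNs on the primary; reaching
the streaming start position is what marks the standby as caught up.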

The advantages of my design are:

- It's guaranteed that the standby can catch up with the primary
within a reasonable period.
- We can keep walsender simple. It only has to take care of the
latest XLOG records (i.e., it doesn't need to handle the old records
and history files). And it doesn't need to calculate whether the
standby is already up-to-date by comparing LSNs.
- In the future, in order to make the standby catch up more quickly,
we can easily extend the mechanism so that two or more old
XLOG files can be transferred concurrently over multiple
connections.
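
The last point could be sketched as follows, purely as an illustration
of the idea. The helper fetch_segment is hypothetical and just returns
a placeholder string where a real implementation would request one old
XLOG file over its own connection; max_connections is likewise an
assumed parameter, not an existing setting.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_segment(segno):
    # Hypothetical: stands in for requesting one old XLOG file from the
    # primary over a dedicated connection.
    return segno, f"contents of segment {segno}"


def fetch_missing_segments(missing, max_connections=2):
    # Transfer up to max_connections old segments at the same time.
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        results = dict(pool.map(fetch_segment, missing))
    # Transfers may finish in any order, but replay must follow WAL order.
    return [results[segno] for segno in sorted(results)]


ordered = fetch_missing_segments([5, 3, 4], max_connections=2)
```

The segments are sorted before being handed back because the startup
process still has to replay them in order, however they were fetched.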

What is your opinion?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
