Re: time-delayed standbys

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: time-delayed standbys
Date: 2011-06-29 17:50:22
Message-ID: BANLkTi=ufQk9i2Ria1U-c0VDUMUNxKjPgg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 29, 2011 at 1:24 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Jun 29, 2011 at 4:00 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On Thu, Jun 16, 2011 at 7:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Wed, Jun 15, 2011 at 1:58 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>>> When the replication connection is terminated, the standby tries to read
>>>> WAL files from the archive. In this case, there is no walreceiver process,
>>>> so how does the standby calculate the clock difference?
>>>
>>> Good question.  Also, just because we have streaming replication
>>> available doesn't mean that we should force people to use it.  It's
>>> still perfectly legit to set up a standby that only uses
>>> archive_command and restore_command, and it would be nice if this
>>> feature could still work in such an environment.  I anticipate that
>>> most people want to use streaming replication, but a time-delayed
>>> standby is a good example of a case where you might decide you don't
>>> need it.  It could be useful to have all the WAL present (but not yet
>>> applied) if you're thinking you might want to promote that standby -
>>> but my guess is that in many cases, the time-delayed standby will be
>>> *in addition* to one or more regular standbys that would be the
>>> primary promotion candidates.  So I can see someone deciding that
>>> they'd rather not have the load of another walsender on the master,
>>> and just let the time-delayed standby read from the archive.
>>>
>>> Even if that were not an issue, I'm still more or less of the opinion
>>> that trying to solve the time synchronization problem is a rathole
>>> anyway.  To really solve this problem well, you're going to need the
>>> standby to send a message containing a timestamp, get a reply back
>>> from the master that contains that timestamp and a master timestamp,
>>> and then compute based on those two timestamps plus the reply
>>> timestamp the maximum and minimum possible lag between the two
>>> machines.  Then you're going to need to guess, based on several cycles
>>> of this activity, what the actual lag is, and adjust it over time (but
>>> not too quickly, unless of course a large manual step has occurred) as
>>> the clocks potentially drift apart from each other.  This is basically
>>> what ntpd does, except that it can be virtually guaranteed that our
>>> implementation will suck by comparison.  Time synchronization is
>>> neither easy nor our core competency, and I think trying to include it
>>> in this feature is going to result in a net loss of reliability.
>>
>>
>> This begs the question of why we need this feature at all, in the way proposed.
>>
>> Streaming replication is designed for immediate transfer of WAL. File
>> based is more about storing them for some later use.
>>
>> It seems strange to pollute the *immediate* transfer route with a
>> delay, when that is easily possible with a small patch to pg_standby
>> that can wait until the filetime delay is > X before returning.
>>
>> The main practical problem with this is that most people's WAL
>> partitions aren't big enough to store the delayed WAL files, which is
>> why we provide the file archiving route anyway. So in practical terms
>> this will be unusable, or at least dangerous to use.
>>
>> +1 for the feature concept, but -1 for adding this to streaming replication.
>
> As implemented, the feature will work with either streaming
> replication or with file-based replication.

That sounds like the exact opposite of your and Fujii's comments
above. Please explain.

> I don't see any value in
> restricting it to work ONLY with file-based replication.

As explained above, it won't work in practice because of the amount of
file space required.
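
To make the file-based alternative mentioned above concrete, here is a
minimal Python sketch of a restore_command-style wrapper that hands over
an archived WAL file only once its modification time is at least X
seconds old. The names wal_file_ready and wait_for_wal and the polling
interval are hypothetical; this is not pg_standby's actual code, and it
assumes the archive's file mtimes can be trusted.

```python
import os
import time


def wal_file_ready(path, delay_seconds, now=None):
    """Return True once the file's mtime is at least delay_seconds in the past."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path) >= delay_seconds


def wait_for_wal(path, delay_seconds, poll_seconds=1.0, sleep=time.sleep):
    """Block until the archived file exists and is old enough, then return its path.

    Hypothetical helper: a real restore_command would go on to copy the
    file to the location the server asked for.
    """
    while not (os.path.exists(path) and wal_file_ready(path, delay_seconds)):
        sleep(poll_seconds)
    return path
```

Because the delay is taken from the file's age in the archive, the
unapplied WAL stays in the archive rather than on the standby's WAL
partition.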

Or, an alternative question: what will you do when it waits so long
that the standby runs out of disk space?

If you hard-enforce the time delay specified then you just make
replication fail under heavy loads.
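
A rough worked example of the disk space concern, as a Python sketch.
The 16 MB segment size is the PostgreSQL default; the segment rate and
the delay are made-up illustrative numbers, not measurements, and
delayed_wal_bytes is a hypothetical helper name.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default WAL segment size


def delayed_wal_bytes(segments_per_minute, delay_minutes):
    """Approximate bytes of unapplied WAL a standby must hold for a given delay."""
    return segments_per_minute * delay_minutes * WAL_SEGMENT_BYTES


# Illustrative only: 4 segments per minute under heavy load, 6-hour delay.
needed = delayed_wal_bytes(4, 6 * 60)
print(needed / 1024**3, "GiB")  # prints "22.5 GiB"
```

The requirement grows linearly with both the WAL rate and the delay, so
a burst of load can exhaust a partition that is adequate at normal
rates.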

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
