Re: time-delayed standbys

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: time-delayed standbys
Date: 2011-06-29 12:24:31
Lists: pgsql-hackers
On Wed, Jun 29, 2011 at 4:00 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Thu, Jun 16, 2011 at 7:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Wed, Jun 15, 2011 at 1:58 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> When the replication connection is terminated, the standby tries to read
>>> WAL files from the archive. In this case, there is no walreceiver process,
>>> so how does the standby calculate the clock difference?
>> Good question.  Also, just because we have streaming replication
>> available doesn't mean that we should force people to use it.  It's
>> still perfectly legit to set up a standby that only uses
>> archive_command and restore_command, and it would be nice if this
>> feature could still work in such an environment.  I anticipate that
>> most people want to use streaming replication, but a time-delayed
>> standby is a good example of a case where you might decide you don't
>> need it.  It could be useful to have all the WAL present (but not yet
>> applied) if you're thinking you might want to promote that standby -
>> but my guess is that in many cases, the time-delayed standby will be
>> *in addition* to one or more regular standbys that would be the
>> primary promotion candidates.  So I can see someone deciding that
>> they'd rather not have the load of another walsender on the master,
>> and just let the time-delayed standby read from the archive.
>> Even if that were not an issue, I'm still more or less of the opinion
>> that trying to solve the time synchronization problem is a rathole
>> anyway.  To really solve this problem well, you're going to need the
>> standby to send a message containing a timestamp, get a reply back
>> from the master that contains that timestamp and a master timestamp,
>> and then compute based on those two timestamps plus the reply
>> timestamp the maximum and minimum possible lag between the two
>> machines.  Then you're going to need to guess, based on several cycles
>> of this activity, what the actual lag is, and adjust it over time (but
>> not too quickly, unless of course a large manual step has occurred) as
>> the clocks potentially drift apart from each other.  This is basically
>> what ntpd does, except that it can be virtually guaranteed that our
>> implementation will suck by comparison.  Time synchronization is
>> neither easy nor our core competency, and I think trying to include it
>> in this feature is going to result in a net loss of reliability.
> This begs the question of why we need this feature at all, in the way proposed.
> Streaming replication is designed for immediate transfer of WAL. File
> based is more about storing them for some later use.
> It seems strange to pollute the *immediate* transfer route with a
> delay, when that is easily possible with a small patch to pg_standby
> that can wait until the filetime delay is > X before returning.
> The main practical problem with this is that most people's WAL
> partitions aren't big enough to store the delayed WAL files, which is
> why we provide the file archiving route anyway. So in practical terms
> this will be unusable, or at least dangerous to use.
> +1 for the feature concept, but -1 for adding this to streaming replication.
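
For illustration, the timestamp exchange described in the quoted text above
can be sketched roughly as follows. This is only a sketch under stated
assumptions, not anything from the patch: the function names are
hypothetical, timestamps are plain integers in milliseconds, and the clocks
are assumed not to step or drift between samples (which is exactly the hard
part that ntpd handles and this does not).

```python
def offset_bounds(t_send, t_master, t_recv):
    """Bound the clock offset (master clock minus standby clock).

    t_send   -- standby clock when the request was sent
    t_master -- master clock when the master stamped its reply
    t_recv   -- standby clock when the reply arrived

    The master stamped its reply at some standby-clock instant between
    t_send and t_recv, so the true offset lies in the returned interval.
    """
    if t_recv < t_send:
        raise ValueError("reply received before request was sent")
    return (t_master - t_recv, t_master - t_send)


def combine_samples(samples):
    """Intersect the bounds from several round trips (hypothetical helper).

    Returns (low, high, midpoint). Raises if the intervals do not overlap,
    which would suggest a clock step or drift between samples.
    """
    bounds = [offset_bounds(*s) for s in samples]
    low = max(b[0] for b in bounds)
    high = min(b[1] for b in bounds)
    if low > high:
        raise ValueError("inconsistent samples: clocks stepped or drifted")
    return (low, high, (low + high) / 2)
```

A single round trip only narrows the offset to the width of the round-trip
time; repeated samples tighten it, but only while the clocks stay put.
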

As implemented, the feature will work with either streaming
replication or with file-based replication.  I don't see any value in
restricting it to work ONLY with file-based replication.

Also, if we were to do it by making pg_standby wait, then the whole
thing would be much less accurate, and the delay would become much
harder to predict, because you'd be operating on the level of entire
WAL segments, rather than individual commit records.
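
To make the granularity point concrete, here is a rough sketch comparing the
two approaches. It is a toy model, not PostgreSQL code: the function names
are hypothetical, times are in seconds, and it assumes the segment-level
variant releases a whole segment once its file time plus the delay has
passed, so every commit in that segment becomes visible at the same moment.

```python
def record_level_apply_times(commit_times, delay):
    """Each commit record is applied exactly `delay` after it was committed."""
    return [t + delay for t in commit_times]


def segment_level_apply_times(segments, delay):
    """Each segment is released `delay` after its file time.

    segments -- list of (file_time, [commit_times]) pairs, where file_time
                is assumed to be when the segment was completed/archived.
    All commits in a segment share that segment's release time.
    """
    apply_times = []
    for file_time, commit_times in segments:
        release = file_time + delay
        apply_times.extend(release for _ in commit_times)
    return apply_times


def effective_delays(commit_times, apply_times):
    """How long each commit actually waited before being applied."""
    return [a - c for c, a in zip(commit_times, apply_times)]
```

For example, with a one-hour delay and a segment completed at t=600 holding
commits at t=0, 300 and 590, the record-level delays are all 3600 seconds,
while the segment-level delays come out as 4200, 3900 and 3610 seconds,
depending on where each commit happened to fall within the segment.
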

Robert Haas
The Enterprise PostgreSQL Company
