Re: Keepalive for max_standby_delay

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Keepalive for max_standby_delay
Date: 2010-07-02 20:20:46
Message-ID: AANLkTinlTL_dQogSzNWaf72-ovl3d8PeWE8ZRwrErxTw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 2, 2010 at 4:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> [ Apologies for the very slow turnaround on this --- I got hit with
> another batch of non-postgres security issues this week. ]
>
> Attached is a draft patch for revising the max_standby_delay behavior into
> something I think is a bit saner.  There is some unfinished business:
>
> * I haven't touched the documentation yet
>
> * The code in xlog.c needs to be reverted to its former behavior so that
> recoveryLastXTime means what it's supposed to mean, ie just the last
> commit/abort timestamp.
>
> However neither of these affects the testability of the patch.
>
> Basically the way that it works is that the standby delay is computed with
> reference to XLogReceiptTime rather than any timestamp obtained from WAL.
> XLogReceiptTime is set to current time whenever we obtain a WAL segment
> from the archives or when we begin processing a fresh batch of WAL from
> walreceiver.  There's a subtlety in the streaming case: we don't advance
> XLogReceiptTime if we are not caught up, that is if the startup process
> is more than one flush cycle behind walreceiver.  In the normal case
> we'll advance XLogReceiptTime on each flush cycle, but once we start
> falling behind, it doesn't move so the grace time alloted to conflicting
> queries begins to decrease.
>
> I split max_standby_delay into two GUC variables, as previously
> threatened: max_standby_archive_delay and max_standby_streaming_delay.
> The former applies when processing data read from archive, the latter
> when processing data received from walreceiver.  I think this is really
> quite important given the new behavior, because max_standby_archive_delay
> ought to be set with reference to the expected time to process one WAL
> segment, whereas max_standby_streaming_delay doesn't depend on that value
> at all.  I'm not sure what good defaults are for these values, so I left
> them at 30 seconds for the moment.  I'm inclined to think
> max_standby_archive_delay ought to be quite a bit less though.
>
> It might be worth adding a minimum-grace-time limit as we previously
> discussed, but this patch doesn't attempt to do that.
>
> Comments?

I haven't been able to wrap my head around why the delay should be
LESS in the archive case than in the streaming case. Can you attempt
to hit me with the clue-by-four?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-07-02 20:36:22 Re: Keepalive for max_standby_delay
Previous Message Robert Haas 2010-07-02 20:18:54 Re: [COMMITTERS] pgsql: Allow copydir() to be interrupted.