Re: Keepalive for max_standby_delay

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Keepalive for max_standby_delay
Date: 2010-07-02 20:11:35
Message-ID: 26561.1278101495@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

[ Apologies for the very slow turnaround on this --- I got hit with
another batch of non-postgres security issues this week. ]

Attached is a draft patch for revising the max_standby_delay behavior into
something I think is a bit saner. There is some unfinished business:

* I haven't touched the documentation yet

* The code in xlog.c needs to be reverted to its former behavior so that
recoveryLastXTime means what it's supposed to mean, ie just the last
commit/abort timestamp.

However neither of these affects the testability of the patch.

Basically the way that it works is that the standby delay is computed with
reference to XLogReceiptTime rather than any timestamp obtained from WAL.
XLogReceiptTime is set to current time whenever we obtain a WAL segment
from the archives or when we begin processing a fresh batch of WAL from
walreceiver. There's a subtlety in the streaming case: we don't advance
XLogReceiptTime if we are not caught up, that is if the startup process
is more than one flush cycle behind walreceiver. In the normal case
we'll advance XLogReceiptTime on each flush cycle, but once we start
falling behind, it doesn't move so the grace time alloted to conflicting
queries begins to decrease.

I split max_standby_delay into two GUC variables, as previously
threatened: max_standby_archive_delay and max_standby_streaming_delay.
The former applies when processing data read from archive, the latter
when processing data received from walreceiver. I think this is really
quite important given the new behavior, because max_standby_archive_delay
ought to be set with reference to the expected time to process one WAL
segment, whereas max_standby_streaming_delay doesn't depend on that value
at all. I'm not sure what good defaults are for these values, so I left
them at 30 seconds for the moment. I'm inclined to think
max_standby_archive_delay ought to be quite a bit less though.

It might be worth adding a minimum-grace-time limit as we previously
discussed, but this patch doesn't attempt to do that.

Comments?

regards, tom lane

Attachment Content-Type Size
maxstandbydelay-1.patch text/x-patch 22.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-07-02 20:18:54 Re: [COMMITTERS] pgsql: Allow copydir() to be interrupted.
Previous Message David E. Wheeler 2010-07-02 18:39:23 Re: hstore ==> and deprecate =>