Re: Keepalive for max_standby_delay

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Keepalive for max_standby_delay
Date: 2010-07-02 21:01:15
Message-ID: AANLkTilNSI5N-3OE2y7IoNL_1QdoAI0Ebix0VM-dUfuM@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 2, 2010 at 4:52 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Jul 2, 2010 at 4:36 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> In the archive case, you're presumably trying to catch up, and so it
>>> makes sense to kill queries faster so you can catch up.
>
>> On the flip side, the timeout for the WAL segment is for 16MB of WAL,
>> whereas the timeout for SR is normally going to be for a much smaller
>> chunk (right?).  So even with the same value for both, it seems like
>> queries will be killed more aggressively during archive recovery.
>
> True, but I suspect useful settings will be significantly larger than
> those times anyway, so it kind of comes out in the wash.  For
> walreceiver the expected time to process each new chunk is less than
> wal_sender_delay (if it's not, you better get a faster slave machine).
> For the archive case, it probably takes more than 200ms to process a
> 16MB segment, but still only a few seconds.  So if you have both the
> max-delay parameters set to 30 seconds, the minimum "normal" grace
> periods are 29.8 sec vs maybe 25 sec, not all that different.

I guess I was thinking more in terms of conflicts-per-WAL-record than
actual replay time. If we assume that ever 100th WAL record produces
a conflict, SR will pause every once in a while, and then catch up
(either because it canceled some queries or because they finished
within the allowable grace period), whereas archive recovery will be
patient for a bit and then start nuking 'em from orbit. Maybe my
mental model is all wrong though.

Anyway, I don't object to having two separate settings, and maybe
we'll know more about how to tune them after this gets out in the
field.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-07-02 21:02:18 Re: hstore ==> and deprecate =>
Previous Message Tom Lane 2010-07-02 21:00:26 Re: hstore ==> and deprecate =>