Re: Allow "snapshot too old" error, to prevent bloat

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Allow "snapshot too old" error, to prevent bloat
Date: 2015-02-18 21:57:38
Message-ID: 393710814.1004435.1424296659004.JavaMail.yahoo@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Feb 17, 2015 12:26 AM, "Andres Freund" <andres(at)2ndquadrant(dot)com> wrote:
>> On 2015-02-16 16:35:46 -0500, Bruce Momjian wrote:

>>> It seems we already have a mechanism in place that allows
>>> tuning of query cancel on standbys vs. preventing standby
>>> queries from seeing old data, specifically
>>> max_standby_streaming_delay/max_standby_archive_delay. We
>>> obsessed about how users were going to react to these odd
>>> variables, but there has been little negative feedback.
>>
>> FWIW, I think that's a somewhat skewed perception. I think it
>> was right to introduce those, because we didn't really have any
>> alternatives.

As far as I can see, the "alternatives" suggested so far are all
about causing heap bloat to progress more slowly, but still without
limit. I suggest, based on a lot of user feedback (from the
customer I've talked about at some length on this thread, as well
as numerous others), that unlimited bloat based on the activity of
one connection is a serious deficiency in the product; and that
there is no real alternative to something like a "snapshot too old"
error if we want to fix that deficiency. Enhancements to associate
a snapshot with a database and using a separate vacuum xmin per
database, not limiting vacuum of a particular object by snapshots
that cannot see that snapshot, etc., are all Very Good Things and I
hope those changes are made, but none of that fixes a very
fundamental flaw.

>> But max_standby_streaming_delay, max_standby_archive_delay and
>> hot_standby_feedback are among the most frequent triggers for
>> questions and complaints that I/we see.
>>
> Agreed.
> And a really bad one used to be vacuum_defer_cleanup_age, because
> of confusing units amongst other things. Which in terms seems
> fairly close to Kevins suggestions, unfortunately.

Particularly my initial suggestion, which was to base snapshot
"age" it on the number of transaction IDs assigned. Does this look
any better to you if it is something that can be set to '20min' or
'1h'? Just to restate, that would not automatically cancel the
snapshots past that age; it would allow vacuum of any tuples which
became "dead" that long ago, and would cause a "snapshot too old"
message for any read of a page modified more than that long ago
using a snapshot which was older than that.

Unless this idea gets some +1 votes Real Soon Now, I will mark the
patch as Rejected and move on.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-02-18 22:10:19 Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
Previous Message Peter Geoghegan 2015-02-18 21:43:55 Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0