Re: snapshot too old, configured by time

From: Kevin Grittner <kgrittn(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Steve Singer <steve(at)ssinger(dot)info>, Kevin Grittner <kgrittn(at)ymail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: snapshot too old, configured by time
Date: 2016-03-30 18:12:55
Message-ID: CACjxUsMN+dUMajXER_tX2wxt6CjCR=h=_krweEefU6K7Z3FKkA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 30, 2016 at 11:37 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> I think a safer proposition would be to replace all current
>> BufferGetPage() calls (there are about 500) by adding the necessary
>> arguments: buffer, snapshot, rel, and an integer "flags". All this
>> without adding the feature. Then a subsequent commit would add the
>> TestForOldSnapshot inside BufferGetPage, *except* when a
>> BUFFER_NO_SNAPSHOT_TEST flag is passed. That way, new code always get
>> the snapshot test by default.
>
> That seems awfully invasive,

That's the argument I made, which Álvaro described as "dismissing"
his suggestion. In this post from October of 2015, I pointed out
that there are 36 calls where we need a snapshot and 450 where we
don't.

http://www.postgresql.org/message-id/56479263.1140984.1444945639606.JavaMail.yahoo@mail.yahoo.com

> not to mention performance-killing if the expectation is that
> most such calls are going to need a snapshot check.

This patch is one which has allowed a customer where we could not
meet their performance requirements to pass them. It is the
opposite of performance-killing.

> (Quite aside from the calls themselves, are they all in
> routines that are being passed the right snapshot today?)

I went over that very carefully when the patch was first proposed
in January of 2015, and have kept an eye on things to try to avoid
bit-rot which might introduce new calls which need to be touched.
The customer for which this was initially developed uses a 30 day
test run with very complex production releases driven by a
simulated user load with large numbers of users. EDB has
back-patched it to 9.4 where an earlier version of it is being used
in production by this (large) customer.

> TBH, I think that shoving in something like this at the end of the last
> commitfest would be a bad idea even if there were widespread consensus
> that we wanted the feature ... which I am not sure there is.

I don't recall anyone opposing the feature itself it except you,
and it has had enthusiastic support from many. Before I was made
aware of a relevant isolation tester feature, there were many
objections to my efforts at regression testing, and Álvaro has
argued for touching 450 locations in the code that otherwise don't
need it, just as "reminders" to people to consider whether newly
added calls might need a snapshot, and he doesn't like the new
header dependencies. Simon seemed to want it in 9.5, but that was
clearly premature, IMO.

Characterizing this as being shoved in at the last moment seems
odd, since the big hang-up from the November CF was the testing
methodology in the patch. It has been feature-complete since the
September CF, and modified based on feedback. Granted, some
additional testing in this CF brought up a couple things that merit
a look, but this patch is hardly unique in that regard.

> I think it might be time to bounce this one to 9.7.

If there is a consensus for that, sure, or if I can't sort out the
latest issues by feature freeze (which is admittedly looming).

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-03-30 18:23:53 Re: Using quicksort for every external sort run
Previous Message Jesper Pedersen 2016-03-30 18:08:05 Re: Speedup twophase transactions