Re: pgsql: Add the "snapshot too old" feature

From: Kevin Grittner <kgrittn(at)gmail(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Kevin Grittner <kgrittn(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robert(dot)haas(at)enterprisedb(dot)com>, "pgsql-committers(at)postgresql(dot)org" <pgsql-committers(at)postgresql(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: Add the "snapshot too old" feature
Date: 2016-04-11 17:31:09
Message-ID: CACjxUsMwh980686LfmqGOGv58XsTpyUqqHPcsD3NGF5CrnXs+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On Mon, Apr 11, 2016 at 8:20 AM, Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:

> 1) We uglify buffer manager interface a lot. This patch adds 3 more
> arguments to BufferGetPage(). It looks weird. Should we try to find less
> invasive way for doing this?

As already pointed out, I originally touched about 450 fewer places
in the code, and did not change the signature of BufferGetPage().
I was torn on the argument that we needed a "forced choice" on
checking the snapshot age built into BufferGetPage() -- it is a bit
annoying, but it might prevent a later bug which could silently
return incorrect data. In the end, I caved to the argument that
the annoyance was worth the improved chances of avoiding such a
bug.

> 2) Did you ever try to examine performance regression?

Yes. Our customer used big machines for extensive performance
testing -- although they used "paced" input to simulate real users,
and saw nothing but gains. On my own i7 box, your test scaled to
100 (so it would fit in memory) yielded this:

unpatched:

number of transactions actually processed: 40534737
latency average = 0.739 ms
latency stddev = 0.359 ms
tps = 134973.430374 (including connections establishing)
tps = 135039.395041 (excluding connections establishing)

with the "snapshot too old" patch:

number of transactions actually processed: 40679654
latency average = 0.735 ms
latency stddev = 0.486 ms
tps = 135463.614244 (including connections establishing)
tps = 135866.160933 (excluding connections establishing)

> I tried simple read
> only test on 4x18 Intel server.
> pgbench -s 1000 -c 140 -j 100 -M prepared -S -T 300 -P 1 postgres (data fits
> to shared_buffers)
>
> master - 193 740.3 TPS
> snapshot too old reverted - 1 065 732.6 TPS
>
> So, for read-only benchmark this patch introduces more than 5 times
> regression on big machine.

I did not schedule a saturation test on a big machine. I guess I
should have done.

I'm confident this can be fixed; your suggestion for a wrapping
"if" is probably sufficient. I am looking at this and the misuse
of "volatile" now.

Are you able to easily test that or should I book time on one (or
more) of our big machines on my end?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Peter Eisentraut 2016-04-11 18:10:15 pgsql: Add directory created during build to gitignore
Previous Message Tom Lane 2016-04-11 15:50:06 pgsql: Fix missing "volatile" in PLy_output().

Browse pgsql-hackers by date

  From Date Subject
Next Message Karl O. Pinc 2016-04-11 17:37:34 Re: PQsendQuery+PQgetResult+PQsetSingleRowMode limitations and support
Previous Message Shulgin, Oleksandr 2016-04-11 17:25:20 Re: PQsendQuery+PQgetResult+PQsetSingleRowMode limitations and support