Re: Proposal for CSN based snapshots

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andres Freund <andres(at)2ndquadrant(dot)com>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Ants Aasma <ants(at)cybertec(dot)at>, Bruce Momjian <bruce(at)momjian(dot)us>, obartunov <obartunov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)postgrespro(dot)ru>
Subject: Re: Proposal for CSN based snapshots
Date: 2016-08-22 17:32:42
Message-ID: ccbbd4e3-2999-cf29-96d0-66935c4ca9aa@iki.fi
Lists: pgsql-hackers

On 08/22/2016 07:49 PM, Robert Haas wrote:
> Nice to see you working on this again.
>
> On Mon, Aug 22, 2016 at 12:35 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>> A sequential scan of a table like that with 10 million rows took about 700
>> ms on my laptop, when the hint bits are set, without this patch. With this
>> patch, if there's a snapshot holding back the xmin horizon, so that we need
>> to check the CSN log for every XID, it took about 30000 ms. So we have some
>> optimization work to do :-). I'm not overly worried about that right now, as
>> I think there's a lot of room for improvement in the SLRU code. But that's
>> the next thing I'm going to work.
>
> So the worst case for this patch is obviously bad right now and, as
> you say, that means that some optimization work is needed.
>
> But what about the best case? If we create a scenario where there are
> no open read-write transactions at all and (somehow) lots and lots of
> ProcArrayLock contention, how much does this help?

I ran some quick pgbench tests on my laptop, but didn't see any
meaningful benefit. I think the best I could see is about 5% speedup,
when running "pgbench -S", with 900 idle connections sitting in the
background. On the positive side, I didn't see much slowdown either.
(Sorry, I didn't record the details of those tests, as I was testing
many different options and I didn't see a clear difference either way.)

It seems that Amit's PGPROC batch clearing patch was very effective. I
remember seeing ProcArrayLock contention very visible earlier, but I
can't hit that now. I suspect you'd still see contention on bigger
hardware, though; my laptop has only 4 cores. I'll have to find a real
server for the next round of testing.

> Because there's only a purpose to trying to minimize the losses if
> there are some gains to which we can look forward.

Aside from the potential performance gains, this slashes a lot of
complicated code:

70 files changed, 2429 insertions(+), 6066 deletions(-)

That removed code is quite mature at this point, and I'm sure we'll add
some code back to this patch as it evolves, but still.

Also, I'm looking forward to a follow-up patch, to track snapshots in
backends at a finer level, so that vacuum could remove tuples more
aggressively, if you have pg_dump running for days. CSN snapshots aren't
a strict requirement for that, but it makes it simpler, when you can
represent a snapshot with a small fixed-size integer.
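
To illustrate what "a snapshot as a small fixed-size integer" could mean,
here is a toy sketch in C. Everything in it (ToyCsnLog, toy_take_snapshot,
toy_xid_visible, the sentinel values) is a hypothetical name made up for
illustration and is not taken from the patch; the real CSN log is
SLRU-backed and has to deal with much more than this flat array does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the patch's definitions. */
typedef uint32_t Xid;
typedef uint64_t Csn;

#define CSN_IN_PROGRESS  ((Csn) 0)  /* transaction has not committed */
#define CSN_ABORTED      ((Csn) 1)  /* transaction rolled back */
#define CSN_FIRST_NORMAL ((Csn) 2)  /* first CSN handed out to a commit */

#define TOY_LOG_SIZE 16

/* Toy stand-in for the CSN log: a flat XID -> commit CSN array. */
typedef struct
{
    Csn commit_csn[TOY_LOG_SIZE];
    Csn next_csn;               /* CSN the next commit will receive */
} ToyCsnLog;

static void
toy_log_init(ToyCsnLog *log)
{
    for (int i = 0; i < TOY_LOG_SIZE; i++)
        log->commit_csn[i] = CSN_IN_PROGRESS;
    log->next_csn = CSN_FIRST_NORMAL;
}

/* Committing stamps the XID with the next sequence number. */
static void
toy_commit(ToyCsnLog *log, Xid xid)
{
    assert(xid < TOY_LOG_SIZE);
    log->commit_csn[xid] = log->next_csn++;
}

static void
toy_abort(ToyCsnLog *log, Xid xid)
{
    assert(xid < TOY_LOG_SIZE);
    log->commit_csn[xid] = CSN_ABORTED;
}

/* The whole snapshot is one integer: the next CSN to be assigned. */
static Csn
toy_take_snapshot(const ToyCsnLog *log)
{
    return log->next_csn;
}

/* An XID is visible iff it committed before the snapshot was taken. */
static bool
toy_xid_visible(const ToyCsnLog *log, Xid xid, Csn snapshot)
{
    assert(xid < TOY_LOG_SIZE);
    Csn csn = log->commit_csn[xid];

    if (csn == CSN_IN_PROGRESS || csn == CSN_ABORTED)
        return false;
    return csn < snapshot;
}
```

In this toy model, taking a snapshot never looks at the list of running
transactions, and comparing two snapshots is a plain integer comparison.
The cost moves to the per-XID lookup in toy_xid_visible, which corresponds
to the CSN log check that showed up as the slow case in the sequential
scan numbers quoted above.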

Yes, seeing some direct performance gains would be nice too.

- Heikki
