POC: Cache data in GetSnapshotData()

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: POC: Cache data in GetSnapshotData()
Date: 2015-02-02 15:27:06
Message-ID: 20150202152706.GD9201@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I've, for a while, pondered whether we couldn't find a easier way than
CSN to make snapshots cheaper as GetSnapshotData() very frequently is
one of the top profile entries. Especially on bigger servers, where the
pretty much guaranteed cachemisses are quite visibile.

My idea is based on the observation that even in very write heavy
environments the frequency of relevant PGXACT changes is noticeably
lower than GetSnapshotData() calls.

My idea is to simply cache the results of a GetSnapshotData() result in
shared memory and invalidate it everytime something happens that affects
the results. Then GetSnapshotData() can do a couple of memcpy() calls to
get the snapshot - which will be significantly faster in a large number
of cases. For one often enough there's many transactions without an xid
assigned (and thus xip/subxip are small), for another, even if that's
not the case it's linear copies instead of unpredicable random accesses

Now, that idea is pretty handwavy. After talking about it with a couple
of people I've decided to write a quick POC to check whether it's
actually beneficial. That POC isn't anything close to being ready or
complete. I just wanted to evaluate whether the idea has some merit or
not. That said, it survives make installcheck-parallel.

Some very preliminary performance results indicate a growth of between
25% (pgbench -cj 796 -m prepared -f 'SELECT 1'), 15% (pgbench -s 300 -S
-cj 796), 2% (pgbench -cj 96 -s 300) on a 4 x E5-4620 system. Even on my
laptop I can measure benefits in a readonly, highly concurrent,
workload; although unsurprisingly much smaller.

Now, these are all somewhat extreme workloads, but still. It's a nice
improvement for a quick POC.

So far the implemented idea is to just completely wipe the cached
snapshot everytime somebody commits. I've afterwards not been able to
see GetSnapshotData() in the profile at all - so that possibly is
actually sufficient?

This implementation probably has major holes. Like it probably ends up
not really increasing the xmin horizon when a longrunning readonly
transaction without an xid commits...

Comments about the idea?


Andres Freund

Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Heavily-WIP-Cache-snapshots-in-GetSnapshotData.patch text/x-patch 6.1 KB


Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-02-02 15:32:53 Re: Proposal : REINDEX xxx VERBOSE
Previous Message Magnus Hagander 2015-02-02 14:48:08 Re: Release note bloat is getting out of hand