Re: POC: Cache data in GetSnapshotData()

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cache data in GetSnapshotData()
Date: 2015-07-24 15:15:13
Message-ID: CABOikdN+1P1WPt15SyGqjXfi9n4LQyqGAmGQdGcF0Xjyvb5T_w@mail.gmail.com
Lists: pgsql-hackers

On Mon, Feb 2, 2015 at 8:57 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
wrote:

> Hi,
>
> I've, for a while, pondered whether we couldn't find an easier way than
> CSN to make snapshots cheaper, as GetSnapshotData() very frequently is
> one of the top profile entries. Especially on bigger servers, where the
> pretty much guaranteed cache misses are quite visible.
>
> My idea is based on the observation that even in very write heavy
> environments the frequency of relevant PGXACT changes is noticeably
> lower than GetSnapshotData() calls.
>
> My idea is to simply cache the result of a GetSnapshotData() call in
> shared memory and invalidate it every time something happens that
> affects the result. Then GetSnapshotData() can do a couple of memcpy()
> calls to get the snapshot, which will be significantly faster in a large
> number of cases. For one, often enough there are many transactions
> without an xid assigned (and thus xip/subxip are small); for another,
> even if that's not the case, it's linear copies instead of unpredictable
> random accesses through PGXACT/PGPROC.
>
> Now, that idea is pretty handwavy. After talking about it with a couple
> of people I've decided to write a quick POC to check whether it's
> actually beneficial. That POC isn't anything close to being ready or
> complete. I just wanted to evaluate whether the idea has some merit or
> not. That said, it survives make installcheck-parallel.
>
> Some very preliminary performance results indicate gains of 25%
> (pgbench -cj 796 -m prepared -f 'SELECT 1'), 15% (pgbench -s 300 -S
> -cj 796), and 2% (pgbench -cj 96 -s 300) on a 4 x E5-4620 system. Even
> on my laptop I can measure benefits in a read-only, highly concurrent
> workload, although unsurprisingly much smaller.
>
> Now, these are all somewhat extreme workloads, but still. It's a nice
> improvement for a quick POC.
>
> So far the implemented idea is to just completely wipe the cached
> snapshot every time somebody commits. Afterwards I've not been able to
> see GetSnapshotData() in the profile at all, so possibly that is
> actually sufficient?
>
> This implementation probably has major holes. For example, it probably
> ends up not really increasing the xmin horizon when a long-running
> read-only transaction without an xid commits...
>
> Comments about the idea?
>
>
FWIW I'd presented a somewhat similar idea, along with a patch, a few years
back, and from what I remember, I'd seen good results with the patch. So +1
for the idea.

http://www.postgresql.org/message-id/CABOikdMsJ4OsxtA7XBV2quhKYUo_4105fJF4N+uyRoyBAzSuuQ@mail.gmail.com
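
For readers following the thread, the scheme described in the quoted message
(cache the computed snapshot, wipe it on every commit, serve callers with a
couple of memcpy() calls) might look roughly like the toy C sketch below. It
is neither the POC patch nor PostgreSQL code: SnapState, build_snapshot(),
get_snapshot(), begin_xact(), commit_xact() and MAX_XIDS are hypothetical
names made up for illustration. The sketch is single-threaded (a real
implementation would need locking around the shared cache), and it ignores
xid wraparound and subtransactions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_XIDS 64               /* hypothetical capacity of the toy arrays */

typedef uint32_t Xid;

typedef struct Snapshot {
    Xid      xmin;                /* lowest xid still running in the snapshot */
    Xid      xmax;                /* first xid treated as "not yet started" */
    uint32_t xcnt;                /* number of valid entries in xip[] */
    Xid      xip[MAX_XIDS];       /* running xids below xmax */
} Snapshot;

typedef struct SnapState {
    Xid      next_xid;            /* next xid to hand out */
    Xid      latest_completed;    /* highest xid that has committed */
    uint32_t nrunning;            /* stand-in for the PGXACT/PGPROC arrays */
    Xid      running[MAX_XIDS];
    bool     cache_valid;         /* false after any commit */
    Snapshot cached;              /* the shared cached snapshot */
    unsigned slow_builds;         /* how often the slow path ran */
} SnapState;

static void state_init(SnapState *s)
{
    memset(s, 0, sizeof(*s));
    s->next_xid = 1;
}

/* Slow path: scan every running transaction, as an uncached call would. */
static void build_snapshot(const SnapState *s, Snapshot *out)
{
    out->xmax = s->latest_completed + 1;
    out->xmin = out->xmax;
    out->xcnt = 0;
    for (uint32_t i = 0; i < s->nrunning; i++) {
        Xid x = s->running[i];
        if (x >= out->xmax)
            continue;             /* already covered by the xmax bound */
        out->xip[out->xcnt++] = x;
        if (x < out->xmin)
            out->xmin = x;
    }
}

/* Fast path: rebuild only if the cache was wiped, then copy linearly. */
static void get_snapshot(SnapState *s, Snapshot *out)
{
    if (!s->cache_valid) {
        build_snapshot(s, &s->cached);
        s->cache_valid = true;
        s->slow_builds++;
    }
    /* Two memcpy() calls: the fixed header, then only the used xip slots. */
    memcpy(out, &s->cached, offsetof(Snapshot, xip));
    memcpy(out->xip, s->cached.xip, s->cached.xcnt * sizeof(Xid));
}

/* Assigning a new xid leaves the cache alone: the xid is >= cached xmax. */
static Xid begin_xact(SnapState *s)
{
    assert(s->nrunning < MAX_XIDS);
    Xid x = s->next_xid++;
    s->running[s->nrunning++] = x;
    return x;
}

/* A commit changes what later snapshots must see, so wipe the cache. */
static void commit_xact(SnapState *s, Xid x)
{
    for (uint32_t i = 0; i < s->nrunning; i++) {
        if (s->running[i] == x) {
            s->running[i] = s->running[--s->nrunning];
            break;
        }
    }
    if (x > s->latest_completed)
        s->latest_completed = x;
    s->cache_valid = false;
}
```

In this toy model, any number of get_snapshot() calls between two commits
cost one scan plus cheap copies, which is the effect the quoted message is
after. Whether wiping on every commit is sufficient in the real system is
exactly the open question raised there.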

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
