Re: [RFC, POC] Don't require a NBuffer sized PrivateRefCount array of local buffer pins

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC, POC] Don't require a NBuffer sized PrivateRefCount array of local buffer pins
Date: 2014-08-26 23:52:29
Lists: pgsql-hackers


On 2014-03-21 19:22:31 +0100, Andres Freund wrote:
> Hi,
> I've been annoyed at the amount of memory used by the backend local
> PrivateRefCount array for a couple of reasons:
> a) The performance impact of AtEOXact_Buffers() on Assert() enabled
> builds is really, really annoying.
> b) On larger nodes, the L1/2/3 cache impact of randomly accessing
> several megabyte big array at a high frequency is noticeable. I've
> seen the access to that to be the primary (yes, really) source of
> pipeline stalls.
> c) On nodes with significant shared_memory the sum of the per-backend
> arrays is a significant amount of memory, that could very well be
> used more beneficially.
> So what I have done in the attached proof of concept is to have a small
> (8 currently) array of (buffer, pincount) that's searched linearly when
> the refcount of a buffer is needed. When more than 8 buffers are pinned
> a hashtable is used to lookup the values.
> That seems to work fairly well. On the few tests I could run on my
> laptop - I've done this during a flight - it's a small performance win
> in all cases I could test. While saving a fair amount of memory.

Here's the next version of this patch. The major change is that newly
pinned/looked up buffers always go into the array, even when we're
already spilling into the hash table. To get a free slot, a preexisting
entry (chosen via PrivateRefCountArray[PrivateRefCountClock++ %
REFCOUNT_ARRAY_ENTRIES]) is displaced into the hash table. That way the
concern that frequently used buffers get 'stuck' in the hash table while
infrequently used ones stay in the array is ameliorated.
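
To make the displacement scheme above concrete, here is a minimal,
self-contained sketch. It is not the code from the attached patch: only
PrivateRefCountArray, PrivateRefCountClock and REFCOUNT_ARRAY_ENTRIES are
names taken from this mail. The chained overflow table, INVALID_BUFFER,
OVERFLOW_BUCKETS and all function names are hypothetical stand-ins, and
the real patch may differ in how it handles lookups and memory.

```c
/* Sketch only; everything except PrivateRefCountArray, PrivateRefCountClock
 * and REFCOUNT_ARRAY_ENTRIES is a hypothetical name chosen for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define REFCOUNT_ARRAY_ENTRIES 8   /* array size mentioned in the mail */
#define OVERFLOW_BUCKETS 64        /* assumed size of the stand-in hash table */
#define INVALID_BUFFER 0           /* buffer ids start at 1 in this sketch */

typedef struct { int buffer; int32_t refcount; } RefCountEntry;
typedef struct OverflowNode {
    RefCountEntry entry;
    struct OverflowNode *next;
} OverflowNode;

static RefCountEntry PrivateRefCountArray[REFCOUNT_ARRAY_ENTRIES];
static OverflowNode *overflow[OVERFLOW_BUCKETS];
static unsigned PrivateRefCountClock = 0;

/* Push an entry onto its bucket chain in the overflow table. */
static void overflow_put(RefCountEntry e)
{
    OverflowNode *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->entry = e;
    n->next = overflow[(unsigned) e.buffer % OVERFLOW_BUCKETS];
    overflow[(unsigned) e.buffer % OVERFLOW_BUCKETS] = n;
}

/* Remove the entry for 'buffer' from the overflow table, if present. */
static bool overflow_take(int buffer, RefCountEntry *out)
{
    OverflowNode **p = &overflow[(unsigned) buffer % OVERFLOW_BUCKETS];
    for (; *p != NULL; p = &(*p)->next) {
        if ((*p)->entry.buffer == buffer) {
            OverflowNode *n = *p;
            *out = n->entry;
            *p = n->next;
            free(n);
            return true;
        }
    }
    return false;
}

/* Return the array slot for 'buffer', moving it into the array if needed. */
static RefCountEntry *get_entry(int buffer, bool create)
{
    RefCountEntry *free_slot = NULL;
    RefCountEntry found = { buffer, 0 };

    assert(buffer != INVALID_BUFFER);
    for (int i = 0; i < REFCOUNT_ARRAY_ENTRIES; i++) {
        RefCountEntry *e = &PrivateRefCountArray[i];
        if (e->buffer == buffer)
            return e;                       /* fast path: already in the array */
        if (e->buffer == INVALID_BUFFER && free_slot == NULL)
            free_slot = e;
    }
    if (!overflow_take(buffer, &found) && !create)
        return NULL;                        /* not tracked anywhere */
    if (free_slot == NULL) {
        /* Array full: displace the victim picked by the clock. */
        free_slot = &PrivateRefCountArray[PrivateRefCountClock++ %
                                          REFCOUNT_ARRAY_ENTRIES];
        overflow_put(*free_slot);
    }
    *free_slot = found;
    return free_slot;
}

static void pin_buffer(int buffer) { get_entry(buffer, true)->refcount++; }

static void unpin_buffer(int buffer)
{
    RefCountEntry *e = get_entry(buffer, false);
    assert(e != NULL && e->refcount > 0);
    if (--e->refcount == 0)
        e->buffer = INVALID_BUFFER;         /* slot becomes reusable */
}

static int32_t private_refcount(int buffer)
{
    RefCountEntry *e = get_entry(buffer, false);
    return e ? e->refcount : 0;
}
```

In this sketch, pinning buffers 1 through 20 leaves the twelve oldest
entries in the overflow table; pinning buffer 3 again pulls it back into
the array and pushes whichever entry the clock points at out instead.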

The biggest previous concern was benchmarks. I'm not entirely
sure where to get a good testcase for this that's not completely
artificial - most simpler testcases don't pin many buffers. I've played
around a bit and it's a slight performance win in pgbench read-only and
mixed workloads, but not enough to get excited about on its own.

When asserts are enabled, the story is different. The admittedly extreme
case of read-only pgbench at scale 350, with 6GB shared_buffers and 128
clients, goes from 3204.489825 to 39277.077448 TPS. So a) above is
definitely improved :)

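The memory savings are clearly visible. During a read-only pgbench run
at scale 350 with -c 128 -j 128, the following shell loop and awk script
print the average and total VmData (in kB) across all backends, plus the
number of processes: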
for pid in $(pgrep -U andres postgres); do
    grep VmData /proc/$pid/status;
done | \
    awk 'BEGIN { sum = 0 } {sum += $2;} END { if (NR > 0) print sum/NR; else print 0;print sum;print NR}'


AVG: 4626.06
TOT: 619892
NR: 134

AVG: 1610.37
TOT: 217400
NR: 135

So, the patch is succeeding on c): the average VmData per backend drops
from 4626.06 to 1610.37 kB.
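
As a rough cross-check of those figures, the size of a per-backend array
with one counter per shared buffer can be estimated. The sketch below
assumes the default 8 kB block size and a 4-byte counter per buffer;
neither value is stated in this mail, and refcount_array_kb is a
hypothetical helper, not a PostgreSQL function.

```c
#include <assert.h>
#include <stdint.h>

/* Size in kB of an array holding one counter per shared buffer.
 * block_size and counter_size are assumptions supplied by the caller. */
static uint64_t refcount_array_kb(uint64_t shared_buffers_bytes,
                                  uint64_t block_size,
                                  uint64_t counter_size)
{
    assert(block_size > 0);
    uint64_t nbuffers = shared_buffers_bytes / block_size;
    return nbuffers * counter_size / 1024;
}
```

With 6GB of shared_buffers, refcount_array_kb(6ULL << 30, 8192, 4) gives
3072 kB, which is close to the 4626.06 - 1610.37 = 3015.69 kB difference
between the two averages above.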

On its own, in pgbench scale 350 -cj 128 -S -T10 the numbers are:
166171.039778, 165488.531353, 165045.182215, 161492.094693 (excluding connections establishing)
175812.388869, 171600.928377, 168317.370893, 169860.008865 (excluding connections establishing)

so, a bit of a performance win.

-j 16, -c 16 -S -T10:
159757.637878 161287.658276 164003.676018 160687.951017 162941.627683
160628.774342 163981.064787 151239.151102 164763.851903 165219.220209

I'm too tired to continue with write tests now, but I don't see a
reason why they should be more meaningful... We really need a test with
more complex queries, I'm afraid.

Anyway, I think at this stage this needs somebody to closely look at the
code. I don't think there's going to be any really surprising
performance revelations here.


Andres Freund

PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Make-backend-local-tracking-of-buffer-pins-memory-ef.patch text/x-patch 19.6 KB
