Re: Speed up Clog Access by increasing CLOG buffers

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-10-31 13:32:19
Message-ID: 5960ada5-98f5-dacf-903f-6e153aed76ce@2ndquadrant.com
Lists: pgsql-hackers

On 10/30/2016 07:32 PM, Tomas Vondra wrote:
> Hi,
>
> On 10/27/2016 01:44 PM, Amit Kapila wrote:
>> On Thu, Oct 27, 2016 at 4:15 AM, Tomas Vondra
>> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>>
>>> FWIW I plan to run the same test with logged tables - if it shows
>>> similar
>>> regression, I'll be much more worried, because that's a fairly typical
>>> scenario (logged tables, data set > shared buffers), and we surely can't
>>> just go and break that.
>>>
>>
>> Sure, please do those tests.
>>
>
> OK, so I do have results for those tests - that is, scale 3000 with
> shared_buffers=16GB (so continuously writing out dirty buffers). The
> following reports show the results slightly differently - all three "tps
> charts" next to each other, then the speedup charts and tables.
>
> Overall, the results are surprisingly positive - look at these results
> (all ending with "-retest"):
>
> [1] http://tvondra.bitbucket.org/index2.html#dilip-3000-logged-sync-retest
>
> [2] http://tvondra.bitbucket.org/index2.html#pgbench-3000-logged-sync-noskip-retest
>
> [3] http://tvondra.bitbucket.org/index2.html#pgbench-3000-logged-sync-skip-retest
>
> All three show significant improvement, even with fairly low client
> counts. For example with 72 clients, the tps improves by ~20%, without
> significantly affecting the variability of the results (measured as
> stddev, more on this later).
>
> It's however interesting that "no_content_lock" is almost exactly the
> same as master, while the other two cases improve significantly.
>
> The other interesting thing is that "pgbench -N" [3] (which skips the
> updates to the tellers and branches tables) shows no such improvement,
> unlike regular pgbench and Dilip's workload. Not sure why, though - I'd
> expect to see significant improvement in this case.
>
> I have also repeated those tests with clog buffers increased to 512 (so
> 4x the current maximum of 128). I only have results for Dilip's workload
> and "pgbench -N":
>
> [4] http://tvondra.bitbucket.org/index2.html#dilip-3000-logged-sync-retest-512
>
> [5] http://tvondra.bitbucket.org/index2.html#pgbench-3000-logged-sync-skip-retest-512
>
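
For context: the number of clog buffers is not a GUC - it's derived from
shared_buffers in clog.c and capped at 128 pages, so the "512" builds
simply raise that cap. A rough sketch of the sizing rule from memory
(treat the names and details as approximate, this is not a verbatim copy
of clog.c):

/* NBuffers = shared_buffers expressed in 8kB pages */
#define CLOG_MAX_BUFFERS 128        /* stock cap; 512 in the "-512" builds */

static int
clog_shmem_buffers(int NBuffers)
{
    int         n = NBuffers / 512; /* scale with shared_buffers */

    if (n < 4)
        n = 4;                      /* lower bound */
    if (n > CLOG_MAX_BUFFERS)
        n = CLOG_MAX_BUFFERS;       /* the cap these tests raise */

    return n;
}

With shared_buffers=16GB that's ~2M 8kB pages, so the formula hits the
cap either way - the only thing that differs between those runs is the
cap itself.
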
> The results are somewhat surprising, I guess, because the effect is
> wildly different for each workload.
>
> For Dilip's workload, increasing clog buffers to 512 pretty much
> eliminates all benefits of the patches. For example with 288 clients,
> the group_update patch gives ~60k tps on 128 buffers [1] but only 42k
> tps on 512 buffers [4].
>
> With "pgbench -N", the effect is exactly the opposite - while with
> 128 buffers there was pretty much no benefit from any of the patches
> [3], with 512 buffers we suddenly get almost 2x the throughput, but
> only for group_update and master (while the other two patches show no
> improvement at all).
>

The remaining benchmark with 512 clog buffers completed, and the impact
roughly matches Dilip's benchmark - that is, increasing the number of
clog buffers eliminates all the positive impact of the patches observed
with 128 buffers. Compare these two reports:

[a] http://tvondra.bitbucket.org/#pgbench-3000-logged-sync-noskip-retest

[b] http://tvondra.bitbucket.org/#pgbench-3000-logged-sync-noskip-retest-512

With 128 buffers, the group_update and granular_locking patches achieve
up to 50k tps, while master and no_content_lock do ~30k tps. After
increasing the number of clog buffers, we get only ~30k tps in all
cases.
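
For anyone skimming the thread, my mental model of the group_update
approach (a toy sketch with my own naming and simplifications, not the
actual patch) is that backends queue their clog status updates on a
lock-free list, and a single leader then applies the whole batch under
one acquisition of the control lock:

#include <stdatomic.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Proc
{
    int             xid;        /* transaction whose status to set */
    struct Proc    *next;       /* next member of the pending group */
    atomic_bool     done;       /* set by the leader when our update is done */
} Proc;

static _Atomic(Proc *) group_head;
static pthread_mutex_t clog_control_lock = PTHREAD_MUTEX_INITIALIZER;

static void
set_status_in_clog(int xid)
{
    (void) xid;                 /* stand-in for writing the clog page */
}

void
group_set_status(Proc *me)
{
    Proc       *head = atomic_load(&group_head);

    /* add ourselves to the pending group */
    do
        me->next = head;
    while (!atomic_compare_exchange_weak(&group_head, &head, me));

    if (head != NULL)
    {
        /* someone else is (or becomes) the leader - wait for them;
         * real code would sleep on a semaphore instead of spinning */
        while (!atomic_load(&me->done))
            ;
        return;
    }

    /* we saw an empty list, so we are the leader: one lock acquisition
     * covers the status updates of everyone who queued behind us */
    pthread_mutex_lock(&clog_control_lock);

    Proc       *p = atomic_exchange(&group_head, NULL);

    while (p != NULL)
    {
        Proc       *next = p->next;     /* read before 'done' - p may
                                         * be reused afterwards */

        set_status_in_clog(p->xid);
        atomic_store(&p->done, true);
        p = next;
    }

    pthread_mutex_unlock(&clog_control_lock);
}

A member initializes me->done = false before queueing; the point is that
N concurrent status updates cost one exclusive lock acquisition instead
of N, which is why it helps at high client counts.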

I'm not sure what's causing this - whether we're hitting limits of the
simple LRU cache used for clog buffers, or something else. But maybe
there's something in the design of clog buffers that makes them work
less efficiently as the buffer count grows? I'm also not sure whether
that's something we need to fix before eventually committing any of the
patches.
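
To illustrate the "limits of the simple LRU" part: as far as I remember,
victim selection in slru.c (SlruSelectLRUPage) does a linear scan over
all buffers while holding the control lock, so every cache miss gets
more expensive as the buffer count grows. A self-contained toy model of
that scan (my naming, not the actual slru.c code):

#include <stdio.h>

#define NUM_SLOTS 512               /* e.g. the increased clog buffer count */

typedef struct
{
    int         page_number[NUM_SLOTS];     /* which clog page each slot holds */
    int         page_lru_count[NUM_SLOTS];  /* per-slot "recently used" tick */
    int         cur_lru_count;              /* global tick, bumped on access */
} SlruSharedData;

/*
 * Toy victim search: like the real one, it inspects every slot (in the
 * real code while holding the control lock), so each cache miss costs
 * O(NUM_SLOTS) under the lock - one plausible reason more buffers can
 * hurt rather than help.
 */
static int
select_victim(SlruSharedData *shared)
{
    int         best_slot = 0;
    int         best_delta = -1;
    int         slot;

    for (slot = 0; slot < NUM_SLOTS; slot++)
    {
        int         delta = shared->cur_lru_count - shared->page_lru_count[slot];

        if (delta > best_delta)
        {
            best_delta = delta;
            best_slot = slot;
        }
    }
    return best_slot;               /* coldest slot gets evicted */
}

int
main(void)
{
    SlruSharedData shared = {{0}};
    int         i;

    shared.cur_lru_count = 100;
    for (i = 0; i < NUM_SLOTS; i++)
        shared.page_lru_count[i] = 90;  /* recently touched */
    shared.page_lru_count[3] = 10;      /* slot 3 is the coldest */

    printf("victim slot: %d\n", select_victim(&shared));    /* -> 3 */
    return 0;
}

If that scan is the bottleneck, quadrupling the buffer count makes every
miss roughly 4x as expensive under the lock, which would at least be
consistent with the 512-buffer results.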

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
