Re: Speed up Clog Access by increasing CLOG buffers

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-09-18 21:11:58
Message-ID: 9e877db7-4dc2-88c1-67ae-034ad1a5cafe@2ndquadrant.com
Lists: pgsql-hackers

On 09/18/2016 06:08 AM, Amit Kapila wrote:
> On Sat, Sep 17, 2016 at 11:25 PM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> On 09/17/2016 07:05 AM, Amit Kapila wrote:
>>>
>>> On Sat, Sep 17, 2016 at 9:17 AM, Tomas Vondra
>>> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>>>
>>>> On 09/14/2016 05:29 PM, Robert Haas wrote:
>>
>> ...
>>>>>
>>>>> Sure, but you're testing at *really* high client counts here.
>>>>> Almost nobody is going to benefit from a 5% improvement at 256
>>>>> clients. You need to test 64 clients and 32 clients and 16
>>>>> clients and 8 clients and see what happens there. Those cases are
>>>>> a lot more likely than these stratospheric client counts.
>>>>>
>>>>
>>>> Right. My impression from the discussion so far is that the patches
>>>> only improve performance with very many concurrent clients - but as
>>>> Robert points out, almost no one is running with 256 active
>>>> clients, unless they have 128 cores or so. At least not if they
>>>> value latency more than throughput.
>>>>
>>>
>>> See, I am also not in favor of going with any of these patches if
>>> they don't help reduce contention. However, I think it is
>>> important to understand under what kind of workload and in which
>>> environment they show a benefit or a regression, whichever is
>>> applicable.
>>
>>
>> Sure. Which is why I initially asked what type of workload I should be
>> testing, and then did the testing with multiple savepoints as that's what
>> you suggested. But apparently that's not a workload that could benefit from
>> this patch, so I'm a bit confused.
>>
>>> Just FYI, a couple of days back one of EDB's partners, who was doing
>>> performance tests using HammerDB (which is again an OLTP-focused
>>> workload) on 9.5-based code, found that CLogControlLock has
>>> significantly high contention. They were using synchronous_commit=off
>>> in their settings. Now, it is quite possible that with the improvements
>>> done in 9.6, the contention they are seeing will be eliminated, but
>>> we have yet to figure that out. I just shared this information with you
>>> to show that this seems to be a real problem and we should
>>> try to work on it unless we are able to convince ourselves that this
>>> is not a problem.
>>>
>>
>> So, can we approach the problem from this direction instead? That is,
>> instead of looking for workloads that might benefit from the patches, look
>> at real-world examples of CLOG lock contention and then evaluate the impact
>> on those?
>>
>
> Sure, we can go that way as well, but I thought that instead of testing
> with a new benchmark kit (HammerDB), it is better to first go with
> some simple statements.
>

IMHO in the ideal case the first message in this thread would provide a
test case, demonstrating the effect of the patch. Then we wouldn't have
the issue of looking for a good workload two years later.

But now that I look at the first post, I see it apparently used a plain
TPC-B pgbench run (with synchronous_commit=on) to show the benefits, which
is the workload I'm running right now (results sometime tomorrow).

That workload clearly uses no savepoints at all, so I'm wondering why
you suggested using several of them. I know you said it's to show the
differences between the approaches, but why should that matter to any of
the patches (and if it does matter, why did I get almost no differences
in the benchmarks)?

Pardon my ignorance, CLOG is not my area of expertise ...
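
For reference, the plain pgbench runs discussed above (default TPC-B-like
script, synchronous_commit on and then off, at the lower client counts
Robert asked for) could be laid out roughly as below. This is only a
sketch of the test matrix, not the script actually used: the database
name, the run duration and the helper name build_runs are placeholders I
made up. It relies on libpq honoring PGOPTIONS to set synchronous_commit
per session, and it only prints the command lines rather than running them.

```python
import shlex

# Placeholder values, not taken from the thread.
DBNAME = "bench"          # hypothetical database, already initialized with pgbench -i
DURATION_S = 300          # hypothetical run length in seconds
CLIENT_COUNTS = [8, 16, 32, 64]   # the client counts Robert suggested testing
SYNC_COMMIT = ["on", "off"]       # both settings mentioned in the thread


def build_runs(dbname=DBNAME, duration=DURATION_S):
    """Return one (env, argv) pair per synchronous_commit / client-count combination."""
    runs = []
    for sync in SYNC_COMMIT:
        for clients in CLIENT_COUNTS:
            # PGOPTIONS passes the setting to every session pgbench opens.
            env = {"PGOPTIONS": f"-c synchronous_commit={sync}"}
            # Default pgbench script (TPC-B-like), one worker thread per client.
            argv = ["pgbench", "-c", str(clients), "-j", str(clients),
                    "-T", str(duration), dbname]
            runs.append((env, argv))
    return runs


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for env, argv in build_runs():
        prefix = " ".join(f"{k}={shlex.quote(v)}" for k, v in env.items())
        print(prefix, " ".join(shlex.quote(a) for a in argv))
```

Each printed line can be pasted into a shell as is; the patched and
unpatched builds would each need the same set of runs for the numbers to
be comparable.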

>> Extracting the workload from benchmarks probably is not ideal, but
>> it's still better than constructing the workload on our own to fit
>> the patch.
>>
>> FWIW I'll do a simple pgbench test - first with
>> synchronous_commit=on and then with synchronous_commit=off.
>> Probably the workloads we should have started with anyway, I
>> guess.
>>
>
> Here, the synchronous_commit = off case could be interesting. Do you see
> any problem with first trying a workload where Dilip is seeing a
> benefit? I am not suggesting we should not do any other testing, but
> let's first try to reproduce the performance gain seen
> in Dilip's tests.
>

I plan to run Dilip's workload once the current benchmarks complete.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
