Re: Speed up Clog Access by increasing CLOG buffers

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-09-29 14:14:44
Message-ID: 3f169562-4544-7b2b-9d25-b058da029ffb@2ndquadrant.com
Lists: pgsql-hackers

On 09/29/2016 03:47 PM, Robert Haas wrote:
> On Wed, Sep 28, 2016 at 9:10 PM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>> I feel like we must be missing something here. If Dilip is seeing
>>> huge speedups and you're seeing nothing, something is different, and
>>> we don't know what it is. Even if the test case is artificial, it
>>> ought to be the same when one of you runs it as when the other runs
>>> it. Right?
>>>
>> Yes, definitely - we're missing something important, I think. One difference
>> is that Dilip is using longer runs, but I don't think that's a problem (as I
>> demonstrated how stable the results are).
>
> It's not impossible that the longer runs could matter - performance
> isn't necessarily stable across time during a pgbench test, and the
> longer the run the more CLOG pages it will fill.
>

Sure, but I'm not doing just a single pgbench run. I do a sequence of
pgbench runs, with different client counts, with ~6h of total runtime.
There's a checkpoint in between the runs, but as those benchmarks are on
unlogged tables, that flushes only very few buffers.

Also, the clog SLRU has 128 pages, which is ~1MB of clog data, i.e. ~4M
transactions. On some kernels (3.10 and 3.12) I can get >50k tps with 64
clients or more, which means we fill the 128 pages in roughly 80 seconds.
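
The arithmetic behind those numbers can be checked with a short sketch. It
assumes the default 8kB block size and 2 status bits per transaction in clog
(so 4 transactions per byte), plus the 128-page SLRU mentioned above. The
function names are made up for illustration and are not PostgreSQL APIs.

```python
# Back-of-the-envelope check of the clog SLRU capacity.
# Assumptions: default 8kB pages (BLCKSZ), 2 status bits per transaction,
# and a 128-page clog SLRU as stated in the message.

BLCKSZ = 8192            # bytes per clog page (PostgreSQL default block size)
BITS_PER_XACT = 2        # commit status bits stored per transaction
SLRU_PAGES = 128         # clog buffers discussed in this thread


def xacts_per_page(block_size=BLCKSZ, bits_per_xact=BITS_PER_XACT):
    """Number of transaction statuses that fit on one clog page."""
    return block_size * 8 // bits_per_xact


def slru_capacity_xacts(pages=SLRU_PAGES):
    """Number of transactions whose status fits in the whole SLRU."""
    return pages * xacts_per_page()


def seconds_to_fill(tps, pages=SLRU_PAGES):
    """Time needed to consume one SLRU worth of transaction IDs at a given tps."""
    return slru_capacity_xacts(pages) / tps


if __name__ == "__main__":
    print("xacts per page:", xacts_per_page())
    print("SLRU capacity :", slru_capacity_xacts())
    print("fill time at 50k tps: %.1f s" % seconds_to_fill(50_000))
```

At exactly 50k tps this gives a little under 84 seconds; higher throughput
shortens the fill time proportionally.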

So half-way through the run only 50% of clog pages fit into the SLRU,
and we have a data set with 30M tuples, with uniform random access - so
it seems rather unlikely we'll get a transaction that's still in the SLRU.
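
A rough model of that reasoning, as a sketch only: it assumes each pgbench
transaction consumes one transaction ID, that the SLRU holds just the most
recently written clog pages, and that lookups are spread uniformly over all
transactions issued since the start of the run. The function name and the
model itself are illustrative assumptions, not measurements.

```python
# Rough estimate of the share of clog written during a run that still
# fits in the SLRU. Assumptions: 8kB pages, 4 transactions per byte,
# a 128-page SLRU, one xid per transaction.

XACTS_PER_PAGE = 8192 * 4    # 32768 transaction statuses per clog page


def cached_fraction(tps, elapsed_s, slru_pages=128):
    """Fraction of xids consumed so far whose clog page can be in the SLRU."""
    consumed = tps * elapsed_s
    if consumed <= 0:
        return 1.0
    capacity = slru_pages * XACTS_PER_PAGE
    return min(1.0, capacity / consumed)


if __name__ == "__main__":
    for seconds in (60, 120, 300):
        print(seconds, "s:", round(cached_fraction(50_000, seconds), 2))
```

Under these assumptions the cached share drops below one once the run has
consumed more than about 4M transaction IDs, and keeps shrinking as the run
continues.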

But sure, I can do a run with larger data set to verify this.

>> I wonder what CPU model Dilip is using - I know it's x86, but not which
>> generation it is. I'm using E5-4620 v1 Xeon, perhaps Dilip is using a newer
>> model and it makes a difference (although that seems unlikely).
>
> The fact that he's using an 8-socket machine seems more likely to
> matter than the CPU generation, which isn't much different. Maybe
> Dilip should try this on a 2-socket machine and see if he sees the
> same kinds of results.
>

Maybe. I wouldn't expect a major difference between 4 and 8 sockets, but
I may be wrong.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
