Re: Speed up Clog Access by increasing CLOG buffers

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-10-31 14:28:54
Message-ID: 8efd9956-059a-78f3-32ff-f1e1a4dd09c8@2ndquadrant.com
Lists: pgsql-hackers

On 10/31/2016 02:51 PM, Amit Kapila wrote:
> On Mon, Oct 31, 2016 at 12:02 AM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> Hi,
>>
>> On 10/27/2016 01:44 PM, Amit Kapila wrote:
>>
>> I've read that analysis, but I'm not sure I see how it explains the "zig
>> zag" behavior. I do understand that shifting the contention to some other
>> (already busy) lock may negatively impact throughput, or that the
>> group_update may result in updating multiple clog pages, but I don't
>> understand two things:
>>
>> (1) Why this should result in the fluctuations we observe in some of the
>> cases. For example, why should we see 150k tps on 72 clients, then drop to
>> 92k with 108 clients, then back to 130k on 144 clients, then 84k on 180
>> clients etc. That seems fairly strange.
>>
>
> I don't think hitting multiple clog pages has much to do with
> client-count. However, we can wait to see your further detailed test
> report.
>
>> (2) Why this should affect all three patches, when only group_update has to
>> modify multiple clog pages.
>>
>
> No, all three patches can be affected due to multiple clog pages.
> Read second paragraph ("I think one of the probable reasons that could
> happen for both the approaches") in same e-mail [1]. It is basically
> due to frequent release-and-reacquire of locks.
>
>>
>>
>>>> On logged tables it usually looks like this (i.e. modest increase for
>>>> high
>>>> client counts at the expense of significantly higher variability):
>>>>
>>>> http://tvondra.bitbucket.org/#pgbench-3000-logged-sync-skip-64
>>>>
>>>
>>> What variability are you referring to in those results?
>>
>>>
>>
>> Good question. What I mean by "variability" is how stable the tps is during
>> the benchmark (when measured on per-second granularity). For example, let's
>> run a 10-second benchmark, measuring number of transactions committed each
>> second.
>>
>> Then all those runs do 1000 tps on average:
>>
>> run 1: 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000
>> run 2: 500, 1500, 500, 1500, 500, 1500, 500, 1500, 500, 1500
>> run 3: 0, 2000, 0, 2000, 0, 2000, 0, 2000, 0, 2000
>>
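
To make the "variability" notion above concrete, here is a minimal sketch that
computes the mean and the spread of the three example runs. The choice of
population standard deviation and coefficient of variation as the metric is an
assumption made for illustration; the thread does not say which measure the
charts use, and the helper name summarize() is hypothetical.

```python
from statistics import mean, pstdev

# Per-second tps samples for the three example runs quoted above.
runs = {
    "run 1": [1000] * 10,
    "run 2": [500, 1500] * 5,
    "run 3": [0, 2000] * 5,
}


def summarize(samples):
    """Return (mean, population stddev, coefficient of variation).

    Hypothetical helper: the metric is an assumption, not the one
    used to produce the published charts.
    """
    avg = mean(samples)
    sd = pstdev(samples)
    return avg, sd, sd / avg


for name, samples in runs.items():
    avg, sd, cv = summarize(samples)
    print(f"{name}: mean={avg:.0f} tps, stddev={sd:.0f}, cv={cv:.2f}")
```

All three runs report the same average, so only the spread tells them apart.
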
>
> Generally, such behaviours are seen due to writes. Are WAL and DATA
> on same disk in your tests?
>

Yes, there's one RAID device built on 10 SSDs, with 4GB of cache on the
controller. I've done some tests and it easily handles > 1.5GB/s in
sequential writes, and > 500MB/s in sustained random writes.

Also, let me point out that most of the tests were done so that the
whole data set fits into shared_buffers, and with no checkpoints during
the runs (so no writes to data files should really happen).

For example these tests were done on scale 3000 (45GB data set) with
64GB shared buffers:

[a]
http://tvondra.bitbucket.org/index2.html#pgbench-3000-unlogged-sync-noskip-64

[b]
http://tvondra.bitbucket.org/index2.html#pgbench-3000-logged-async-noskip-64

and I could show similar cases with scale 300 on 16GB shared buffers.
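
As a rough sanity check of the "fits into shared_buffers" claim, a small sketch
follows. The 15MB-per-scale-unit figure is an assumption derived only from the
numbers above (45GB at scale 3000); it is not measured here, and the helper
names are hypothetical.

```python
# Assumption: ~15MB of data per pgbench scale unit, derived from
# "scale 3000 (45GB data set)" in this message. Not a measured value.
MB_PER_SCALE_UNIT = 15


def dataset_gb(scale):
    """Approximate data set size in GB for a given pgbench scale."""
    return scale * MB_PER_SCALE_UNIT / 1000


def fits_in_shared_buffers(scale, shared_buffers_gb):
    """True if the estimated data set is no larger than shared_buffers."""
    return dataset_gb(scale) <= shared_buffers_gb


# The two configurations mentioned in this message.
for scale, shared_buffers_gb in [(3000, 64), (300, 16)]:
    size = dataset_gb(scale)
    ok = fits_in_shared_buffers(scale, shared_buffers_gb)
    print(f"scale {scale}: ~{size:g}GB vs {shared_buffers_gb}GB -> fits: {ok}")
```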

In those cases, there's very little contention between WAL and the rest
of the database (in terms of I/O).

Moreover, this setup (a single device for the whole cluster) is very
common, so we can't just neglect it.

But my main point here really is that the trade-off in those cases may
not be all that great, at least not right now, before tackling
contention on the WAL lock (or whatever lock becomes the bottleneck):
you get the best performance at 36/72 clients, and beyond that the tps
drops and variability increases.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
