Re: Speed up Clog Access by increasing CLOG buffers

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-10-31 13:51:52
Message-ID: CAA4eK1Ksd6D0H9HPmMS3S7UpL2G8JMJ0kvRCDz=4=AqFn790sg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 31, 2016 at 12:02 AM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> Hi,
>
> On 10/27/2016 01:44 PM, Amit Kapila wrote:
>
> I've read that analysis, but I'm not sure I see how it explains the "zig
> zag" behavior. I do understand that shifting the contention to some other
> (already busy) lock may negatively impact throughput, or that the
> group_update may result in updating multiple clog pages, but I don't
> understand two things:
>
> (1) Why this should result in the fluctuations we observe in some of the
> cases. For example, why should we see 150k tps on 72 clients, then drop to
> 92k with 108 clients, then back to 130k on 144 clients, then 84k on 180
> clients etc. That seems fairly strange.
>

I don't think hitting multiple clog pages has much to do with the
client count. However, we can wait for your further detailed test
report.
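
As a back-of-the-envelope illustration (my own stand-alone sketch, not
code from any of the patches): with the default 8 kB block size, one
clog page holds the 2-bit status of 32768 transactions, so whether a
batch of xids crosses a page boundary depends on where the xid counter
happens to sit, not on how many clients are running. The xid values
below are arbitrary examples.

#include <stdio.h>

#define BLCKSZ              8192    /* default PostgreSQL block size */
#define CLOG_BITS_PER_XACT  2
#define CLOG_XACTS_PER_BYTE (8 / CLOG_BITS_PER_XACT)
#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)

int
main(void)
{
    /* two xids fall on the same clog page iff xid / CLOG_XACTS_PER_PAGE
     * is the same for both */
    unsigned int xid1 = 1000000;
    unsigned int xid2 = 1000180;    /* e.g. 180 clients further along */

    printf("xacts per clog page: %d\n", CLOG_XACTS_PER_PAGE);
    printf("same clog page: %s\n",
           xid1 / CLOG_XACTS_PER_PAGE == xid2 / CLOG_XACTS_PER_PAGE
           ? "yes" : "no");
    return 0;
}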

> (2) Why this should affect all three patches, when only group_update has to
> modify multiple clog pages.
>

No, all three patches can be affected by updates spanning multiple
clog pages. See the second paragraph ("I think one of the probable
reasons that could happen for both the approaches") of the same
e-mail [1]. It is basically due to frequent release-and-reacquire of
locks, as sketched below.
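
To make that pattern concrete, here is a simplified stand-alone
sketch (a pthread mutex stands in for CLogControlLock, and none of
the function names below come from the actual patches): when the xids
in a group span a page boundary, the lock is dropped and retaken for
the second page, and other backends can jump in between the two
critical sections, which shows up as extra contention at high client
counts.

#include <pthread.h>

static pthread_mutex_t clog_control_lock = PTHREAD_MUTEX_INITIALIZER;

static void
set_status_on_page(int pageno)
{
    /* stand-in for updating transaction status bits on one clog page */
    (void) pageno;
}

static void
group_update_spanning_two_pages(int page1, int page2)
{
    pthread_mutex_lock(&clog_control_lock);
    set_status_on_page(page1);          /* first clog page */
    pthread_mutex_unlock(&clog_control_lock);

    /* other backends can acquire the lock here */

    pthread_mutex_lock(&clog_control_lock);
    set_status_on_page(page2);          /* second clog page */
    pthread_mutex_unlock(&clog_control_lock);
}

int
main(void)
{
    group_update_spanning_two_pages(30, 31);
    return 0;
}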

>
>
>>> On logged tables it usually looks like this (i.e. modest increase for
>>> high
>>> client counts at the expense of significantly higher variability):
>>>
>>> http://tvondra.bitbucket.org/#pgbench-3000-logged-sync-skip-64
>>>
>>
>> What variability are you referring to in those results?
>
>>
>
> Good question. What I mean by "variability" is how stable the tps is during
> the benchmark (when measured on per-second granularity). For example, let's
> run a 10-second benchmark, measuring number of transactions committed each
> second.
>
> Then all those runs do 1000 tps on average:
>
> run 1: 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000
> run 2: 500, 1500, 500, 1500, 500, 1500, 500, 1500, 500, 1500
> run 3: 0, 2000, 0, 2000, 0, 2000, 0, 2000, 0, 2000
>

Generally, such behaviour is seen due to writes. Are WAL and DATA
on the same disk in your tests?
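
For what it's worth, a quick stand-alone sketch (my own, not from any
patch) that quantifies the variability in those example runs as the
standard deviation of the per-second tps samples: all three runs
average 1000 tps, but the stddevs come out as 0, 500 and 1000.

#include <math.h>
#include <stdio.h>

static double
tps_stddev(const double *tps, int n)
{
    double  mean = 0.0, var = 0.0;

    for (int i = 0; i < n; i++)
        mean += tps[i] / n;
    for (int i = 0; i < n; i++)
        var += (tps[i] - mean) * (tps[i] - mean) / n;
    return sqrt(var);
}

int
main(void)
{
    double run1[] = {1000, 1000, 1000, 1000, 1000,
                     1000, 1000, 1000, 1000, 1000};
    double run2[] = {500, 1500, 500, 1500, 500,
                     1500, 500, 1500, 500, 1500};
    double run3[] = {0, 2000, 0, 2000, 0,
                     2000, 0, 2000, 0, 2000};

    printf("run 1 stddev: %.0f\n", tps_stddev(run1, 10));   /* 0    */
    printf("run 2 stddev: %.0f\n", tps_stddev(run2, 10));   /* 500  */
    printf("run 3 stddev: %.0f\n", tps_stddev(run3, 10));   /* 1000 */
    return 0;
}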

[1] - https://www.postgresql.org/message-id/CAA4eK1J9VxJUnpOiQDf0O%3DZ87QUMbw%3DuGcQr4EaGbHSCibx9yA%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
