Re: Speed up Clog Access by increasing CLOG buffers

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Date: 2016-11-02 17:18:38
Message-ID: 04e39ac9-af28-0fda-8f72-7268197d281f@2ndquadrant.com
Lists: pgsql-hackers

On 11/02/2016 05:52 PM, Amit Kapila wrote:
> On Wed, Nov 2, 2016 at 9:01 AM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> On 11/01/2016 08:13 PM, Robert Haas wrote:
>>>
>>> On Mon, Oct 31, 2016 at 5:48 PM, Tomas Vondra
>>> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>>>
>>
>> The one remaining thing is the strange zig-zag behavior, but that might
>> easily be due to scheduling in the kernel, or something else. I don't consider
>> it a blocker for any of the patches, though.
>>
>
> The only reason I could think of for that zig-zag behaviour is
> frequent access to multiple clog pages, and it could be due to the
> reasons below:
>
> a. A transaction and its subtransactions (IIRC, Dilip's case has one
> main transaction and two subtransactions) can't fit into the same page, in
> which case the group_update optimization won't apply, and I don't think
> we can do anything about it.
> b. Within the same group, multiple clog pages are being accessed. It is
> not a likely scenario, but it can happen, and we might be able to
> improve things a bit if that is happening.
> c. Transactions try to update different clog pages at the same time.
> I think, as mentioned upthread, we can handle it by using slots and
> allowing multiple groups to work together instead of a single group.
>
> To check if there is any impact due to (a) or (b), I have added a few
> log messages to the code (patch - group_update_clog_v9_log). The log message
> could be "all xacts are not on same page" or "Group contains
> different pages".
>
> Patch group_update_clog_v9_slots tries to address (c). So if there
> is any problem due to (c), this patch should improve the situation.
>
> Can you please try to run the test where you saw the zig-zag behaviour
> with both patches separately? I think if anything is due
> to postgres, then you will see either one of the new log messages or
> improved performance. OTOH, if we see the same behaviour, then I
> think we can probably assume it is due to scheduler activity and move
> on. Also, one point to note here is that even when the performance is
> down in that curve, it is equal to or better than HEAD.
>
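
For reference, case (a) above comes down to which clog page each XID maps to. A minimal sketch of that mapping, assuming the default 8 kB block size and two status bits per transaction (so 32768 transactions per page); the helper names are made up for illustration and are not from the patch, and XID wraparound is ignored:

```python
# Sketch only: constants mirror the clog layout under default build settings.
BLCKSZ = 8192                                        # assumed default block size, bytes
CLOG_BITS_PER_XACT = 2                               # two status bits per transaction
CLOG_XACTS_PER_BYTE = 8 // CLOG_BITS_PER_XACT        # 4 transactions per byte
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE   # 32768 transactions per page


def clog_page(xid: int) -> int:
    """Return the clog page number holding the status bits for xid."""
    return xid // CLOG_XACTS_PER_PAGE


def all_on_same_page(xids) -> bool:
    """True if every XID in the (hypothetical) list maps to one clog page."""
    return len({clog_page(x) for x in xids}) <= 1
```

With these assumptions, a main transaction with XID 32766 and subtransactions 32767 and 32768 straddles a page boundary, which is the situation the "all xacts are not on same page" message is meant to report.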

Will do.

Based on the results with more client counts (increment by 6 clients
instead of 36), I think this really looks like something unrelated to
any of the patches - kernel, CPU, or something already present in
current master.

The attached results show that:

(a) master shows the same zig-zag behavior. No idea why this wasn't
observed in the previous runs.

(b) group_update actually seems to improve the situation, because the
performance stays stable up to 72 clients, while on master the
fluctuation starts much earlier.
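
One way to put a number on "where the fluctuation starts" is to look for the first client count at which throughput drops noticeably relative to the previous measurement. A rough sketch; the function name, the 10% threshold, and the input layout are assumptions for illustration, not something used to produce the attached results:

```python
def first_unstable_point(results, threshold=0.10):
    """Find where a throughput curve first drops by more than `threshold`.

    results: list of (clients, tps) tuples sorted by client count.
    Returns the client count of the first point whose tps is lower than
    the previous point's by more than `threshold` (relative), or None
    if no such drop exists.
    """
    for (_, prev_tps), (clients, tps) in zip(results, results[1:]):
        if prev_tps > 0 and (prev_tps - tps) / prev_tps > threshold:
            return clients
    return None
```

Running this separately on the per-branch numbers (master vs. patched) would give one client count per branch to compare, instead of eyeballing the chart.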

I'll redo the tests with a newer kernel. This was on 3.10.x, which is
what Red Hat 7.2 uses; I'll try 4.8.6. If the 4.8.6 kernel does not
help, I'll then try the patches you submitted.

Overall, I'm convinced this issue is unrelated to the patches.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:
  (unnamed)    image/png                                        40.0 KB
  zig-zag.ods  application/vnd.oasis.opendocument.spreadsheet   25.8 KB
