Re: Speed up Clog Access by increasing CLOG buffers

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-11-03 15:08:02
Message-ID: CA+TgmoYC_tSGgZHWajuC8kwu_ZPrttZew0OwnH6Fcrs+UigS+w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 1, 2016 at 11:31 PM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> I don't think I've suggested not committing any of the clog patches (or
> other patches in general) because shifting the contention somewhere else
> might cause regressions. At the end of the last CF I've however stated that
> we need to better understand the impact on various workloads, and I think
> Amit agreed with that conclusion.
>
> We have that understanding now, I believe - also thanks to your idea of
> sampling wait events data.
>
> You're right we can't fix all the contention points in one patch, and that
> shifting the contention may cause regressions. But we should at least
> understand what workloads might be impacted, how serious the regressions may
> get etc. Which is why all the testing was done.

OK.

> Sure, I understand that. My main worry was that people will get worse
> performance with the next major version than what they get now (assuming we
> don't manage to address the other contention points). Which is difficult to
> explain to users & customers, no matter how reasonable it seems to us.
>
> The difference is that both the fast-path locks and msgNumLock went into
> 9.2, so that end users probably never saw that regression. But we don't know
> if that happens for clog and WAL.
>
> Perhaps you have a working patch addressing the WAL contention, so that we
> could see how that changes the results?

I don't think we do, yet. Amit or Kuntal might know more. At some
level I think we're just hitting the limits of the hardware's ability
to lay bytes on a platter, and fine-tuning the locking may not help
much.

> I might be wrong, but I doubt the kernel guys are running a particularly wide
> set of tests, so how likely is it they will notice issues with specific
> workloads? Wouldn't it be great if we could tell them there's a bug and
> provide a workload that reproduces it?
>
> I don't see how "it's a Linux issue" makes it someone else's problem. The
> kernel guys can't really test everything (and are not obliged to). It's up
> to us to do more testing in this area, and report issues to the kernel guys
> (which is not happening as much as it should).

I don't exactly disagree with any of that. I just want to find a
course of action that we can agree on and move forward. This has been
cooking for a long time, and I want to converge on some resolution.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
