Re: Speed up Clog Access by increasing CLOG buffers

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-09-23 01:47:19
Message-ID: 15e6e88e-3f8c-ce4b-0782-c279511815ea@2ndquadrant.com
Lists: pgsql-hackers

On 09/23/2016 03:20 AM, Robert Haas wrote:
> On Thu, Sep 22, 2016 at 7:44 PM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> I don't dare to suggest rejecting the patch, but I don't see how
>> we could commit any of the patches at this point. So perhaps
>> "returned with feedback" and resubmitting in the next CF (along
> with analysis of improved workloads) would be appropriate.
>
> I think it would be useful to have some kind of theoretical analysis
> of how much time we're spending waiting for various locks. So, for
> example, suppose we do one run of these tests with various client
> counts - say, 1, 8, 16, 32, 64, 96, 128, 192, 256 - and we run
> "select wait_event from pg_stat_activity" once per second throughout
> the test. Then we see how many times we get each wait event,
> including NULL (no wait event). Now, from this, we can compute the
> approximate percentage of time we're spending waiting on
> CLogControlLock and every other lock, too, as well as the percentage
> of time we're not waiting for any lock. That, it seems to me, would give
> us a pretty clear idea what the maximum benefit we could hope for
> from reducing contention on any given lock might be.
>

Yeah, I think that might be a good way to analyze the locks in general,
not just for these patches. A 24h run with per-second samples should give
us about 86400 samples (well, multiplied by number of clients), which is
probably good enough.
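
As a rough sketch of the bookkeeping described above, assuming the
per-second results of "select wait_event from pg_stat_activity" have
already been collected into a flat list (with None standing for "no wait
event"). The function names, the sample values, and the Amdahl-style upper
bound are illustrative assumptions, not something taken from the patches or
from any measurement:

```python
from collections import Counter

# One sample per second for 24 hours (per client backend).
SAMPLES_PER_DAY = 24 * 60 * 60  # 86400


def wait_event_shares(samples):
    """Return {wait_event: percent of samples}; None means 'not waiting'."""
    counts = Counter(samples)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {event: 100.0 * n / total for event, n in counts.items()}


def max_speedup_if_removed(share_percent):
    """Assumed Amdahl-style bound: best-case throughput factor if all time
    spent waiting on one lock disappeared and nothing else got slower."""
    fraction = share_percent / 100.0
    if fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - fraction)


# Hypothetical samples, purely for illustration.
samples = [
    "CLogControlLock", None, None, "WALWriteLock",
    "CLogControlLock", None, None, None,
]
shares = wait_event_shares(samples)
clog_bound = max_speedup_if_removed(shares["CLogControlLock"])
```

With real data the None share would be the "not waiting" percentage, and
the bound is only a ceiling: removing one wait may simply move the
contention to another lock.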

We also have LWLOCK_STATS, which might be interesting too, but I'm not
sure how much it affects the behavior (and AFAIK it also only dumps the
data to the server log).

>
> Now, we could also try that experiment with various patches. If we
> can show that some patch reduces CLogControlLock contention without
> increasing TPS, it might still be worth committing for that
> reason. Otherwise, you could have a chicken-and-egg problem. If
> reducing contention on A doesn't help TPS because of lock B and
> vice versa, then does that mean we can never commit any patch to
> reduce contention on either lock? Hopefully not. But I agree with you
> that there's certainly not enough evidence to commit any of these
> patches now. To my mind, these numbers aren't convincing.
>

Yes, the chicken-and-egg problem is why the tests were done with
unlogged tables (to work around the WAL lock).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
