Re: measuring lwlock-related latency spikes

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <robertmhaas(at)gmail(dot)com>
Cc: <simon(at)2ndquadrant(dot)com>, <stark(at)mit(dot)edu>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: measuring lwlock-related latency spikes
Date: 2012-04-03 12:28:26
Message-ID: 4F7AA69A0200002500046AE4@gw.wicourts.gov
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Kevin Grittner wrote:
>
>> I can't help thinking that the "background hinter" I had ideas
>> about writing would prevent many of the reads of old CLOG pages,
>> taking a lot of pressure off of this area. It just occurred to me
>> that the difference between that idea and having an autovacuum
>> thread which just did first-pass work on dirty heap pages is slim
>> to none.
>
> Yeah. Marking things all-visible in the background seems possibly
> attractive, too. I think the trick is figuring out the control
> mechanism. In this case, the workload fits within shared_buffers,
> so it's not helpful to think about using buffer eviction as the
> trigger for doing these operations, though that might have some
> legs in general. And a simple revolving scan over shared_buffers
> doesn't really figure to work out well either, I suspect, because
> it's too undirected. I think what you'd really like to have is a
> list of buffers that were modified by transactions which have
> recently committed or rolled back.

Yeah, that's what I was thinking. Since we only care about dirty
unhinted tuples, we need some fairly efficient way to track those to
make this pay.
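
To make that concrete, here's a rough single-process sketch of the
sort of bookkeeping I have in mind: a fixed-size ring of buffer IDs
that gets appended to at transaction commit/abort and drained by the
background hinter. Every name in it is made up, and a real version
would have to live in shared memory behind a lock or atomics, but it
shows the shape of the thing, including dropping entries rather than
blocking when the ring fills:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define HINTER_RING_SIZE 128    /* power of two keeps wraparound cheap */

typedef struct HinterRing
{
    uint32_t    buf_ids[HINTER_RING_SIZE];
    uint32_t    head;           /* next slot to write */
    uint32_t    tail;           /* next slot to read */
} HinterRing;

/*
 * Record a buffer dirtied by a transaction that just committed or
 * rolled back.  If the ring is full, drop the entry; a missed hint
 * just means a foreground backend sets the bits later, as today.
 */
static bool
hinter_note_buffer(HinterRing *ring, uint32_t buf_id)
{
    if (ring->head - ring->tail == HINTER_RING_SIZE)
        return false;           /* full: never block the committer */
    ring->buf_ids[ring->head % HINTER_RING_SIZE] = buf_id;
    ring->head++;
    return true;
}

/* Background hinter: fetch the next buffer ID to work on, if any. */
static bool
hinter_next_buffer(HinterRing *ring, uint32_t *buf_id)
{
    if (ring->tail == ring->head)
        return false;           /* nothing pending */
    *buf_id = ring->buf_ids[ring->tail % HINTER_RING_SIZE];
    ring->tail++;
    return true;
}

int
main(void)
{
    HinterRing  ring = {0};
    uint32_t    buf;

    /* Pretend a commit dirtied buffers 17, 42, and 99. */
    hinter_note_buffer(&ring, 17);
    hinter_note_buffer(&ring, 42);
    hinter_note_buffer(&ring, 99);

    /* The hinter would set hint bits (and maybe all-visible) here. */
    while (hinter_next_buffer(&ring, &buf))
        printf("would hint buffer %u\n", (unsigned) buf);

    return 0;
}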

> but that seems more like a nasty benchmarking kludge than something
> that's likely to solve real-world problems.

I'm not so sure. Unfortunately, it may be hard to know without
writing at least a crude form of this to test, but there are several
workloads where hint bit rewrites and/or CLOG contention caused by
the slow tapering of usage of old pages contribute to problems.

>> I know how much time good benchmarking can take, so I hesitate to
>> suggest another permutation, but it might be interesting to see
>> what it does to the throughput if autovacuum is configured to what
>> would otherwise be considered insanely aggressive values (just for
>> vacuum, not analyze). To give this a fair shot, the whole database
>> would need to be vacuumed between initial load and the start of
>> the benchmark.
>
> If you would like to provide a chunk of settings that I can splat
> into postgresql.conf, I'm happy to run 'em through a test cycle and
> see what pops out.

Might as well jump in with both feet:

autovacuum_naptime = 1s
autovacuum_vacuum_threshold = 1
autovacuum_vacuum_scale_factor = 0.0

If that smooths the latency peaks and doesn't hurt performance too
much, it's decent evidence that the more refined technique could be a
win.
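
For the record, that combination means the launcher wakes every
second and any table with essentially any dead tuples becomes a
vacuum candidate, since the scale factor contributes nothing. One
thing I'm not sure of without trying it: autovacuum's cost-based
throttling would still pace the workers, so a truly aggressive run
may also want autovacuum_vacuum_cost_delay = 0. And, per the above,
the whole database should get one full pass between the initial load
and the start of the run (e.g., vacuumdb --all) so it doesn't begin
with a backlog.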

-Kevin
