On Wed, Feb 8, 2012 at 9:38 AM, Alvaro Herrera wrote:
> I think that (part of) the underlying problem is that we have no clear
> way to specify "how much I/O do you want autovacuum to use". That's
> what this patch is all about, AFAIU; it has nothing to do with
> monitoring. Right now, as has been said, the only way to tweak this is
> to change vacuum_cost_delay; the problem with that setting is that
> making the calculation is not straightforward.
> (Now, I disagree that it's so complex that it cannot ever be explained to
> a class; or that it's so obscure that the only way to make it work is to
> leave it alone and never touch it. It's complex, okay, but it's not
> exactly rocket science either.)
I emphatically agree.
> If the only real downside to this patch is that some people have already
> changed vacuum_cost_delay and they will want to migrate those settings
> forward, maybe we shouldn't be looking at _replacing_ that one with a
> new setting, but rather just add the new setting; and in the code for
> each, make sure that only one of them is set, and throw an error if the
> other one is.
I think that part of the confusion here is that the current settings
aren't strictly trying to regulate vacuum's I/O rate so much as the
total amount of work it can do per unit time. Even if autovac dirties
no pages and has no misses, the pages it hits will eventually fill up
the cost limit and it will sleep; but the work in that case is all CPU
utilization, not I/O. Similarly, we don't have separate control knobs
for read I/O (misses) and write I/O (dirty pages). The sleeps happen
due to a blend of factors (CPU, read I/O, write I/O) which are
combined to produce an estimate of the total impact of vacuum on the
rest of the system.
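
To make the blending concrete, here is a rough sketch of the accounting. The cost values are the documented defaults of this era as I understand them (hit 1, miss 10, dirty 20, limit 200); they are an assumption, not something stated in this thread, and sleeps_triggered is a hypothetical helper that simplifies the real balance-and-reset logic.

```python
# Assumed default cost parameters; not taken from this thread.
VACUUM_COST_PAGE_HIT = 1
VACUUM_COST_PAGE_MISS = 10
VACUUM_COST_PAGE_DIRTY = 20
VACUUM_COST_LIMIT = 200


def blended_cost(hits, misses, dirtied):
    """Total cost charged for a batch of page accesses."""
    return (hits * VACUUM_COST_PAGE_HIT
            + misses * VACUUM_COST_PAGE_MISS
            + dirtied * VACUUM_COST_PAGE_DIRTY)


def sleeps_triggered(hits, misses, dirtied, cost_limit=VACUUM_COST_LIMIT):
    """Hypothetical helper: how many times the accumulated cost reaches the limit."""
    return blended_cost(hits, misses, dirtied) // cost_limit


# Three very different workloads produce the same number of sleeps.
print(sleeps_triggered(hits=1000, misses=0, dirtied=0))  # CPU only
print(sleeps_triggered(hits=0, misses=100, dirtied=0))   # read I/O only
print(sleeps_triggered(hits=0, misses=0, dirtied=50))    # write I/O only
```

Each of the three calls charges 1000 cost units, so each one hits the limit five times, even though the first does no I/O at all.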
Now, an argument could be made that we ought not to care about that.
We could say: trying to use this blended algorithm to figure out when
vacuum is getting too expensive is a loser. We just want to limit the
amount of dirty data that's being generated, because that's the only
thing that matters to us. In that case, I could see adding a new
setting, sitting alongside the existing settings: amount of dirty data
that can be generated per second. But that's not what Greg is
proposing. He is proposing, essentially, that we keep the blended
algorithm, but then stack another calculation on top that backs into
the blended cost limit based on the user's tolerance for dirtying
data. So there will still be limits on the amount of read I/O and CPU
usage, but they'll be derived from the allowable rate of data dirtying
and the ratios between the different cost parameters. The current
settings aren't exactly intuitive, but I really don't want to have to
explain to a customer that setting the dirty data limit to 8MB/s will
actually limit it to just 5.3MB/s if all the pages are not in shared
buffers (because the page miss costs will account for a third of the
budget) and to about 7.6MB/s if all the pages are in shared buffers (because
the page hit costs will account for a twenty-first of the budget) and
somewhere in between if only some of the pages are resident; and that,
further, by setting the dirty data rate to 8MB/s, they've implicitly
set the max read rate from disk at 16MB/s, again because of the 2:1
dirty:miss cost ratio. Yikes!
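
For anyone who wants to check that arithmetic, a small sketch follows. It assumes the default cost ratios (hit 1, miss 10, dirty 20) and assumes the proposed setting is converted to a blended budget as if every cost unit went to dirtying; the function names are made up for illustration and are not from the patch.

```python
# Assumed default cost parameters; not taken from the patch itself.
PAGE_HIT, PAGE_MISS, PAGE_DIRTY = 1, 10, 20


def effective_dirty_rate(dirty_limit_mb_s, resident):
    """Dirty rate actually reached when every page is read and then dirtied.

    resident=True means the page is found in shared buffers (a hit);
    resident=False means it must be read from disk (a miss).
    """
    read_cost = PAGE_HIT if resident else PAGE_MISS
    return dirty_limit_mb_s * PAGE_DIRTY / (read_cost + PAGE_DIRTY)


def implied_max_read_rate(dirty_limit_mb_s):
    """Read rate allowed if the whole budget is spent on misses and nothing is dirtied."""
    return dirty_limit_mb_s * PAGE_DIRTY / PAGE_MISS


print(round(effective_dirty_rate(8, resident=False), 1))  # all misses
print(round(effective_dirty_rate(8, resident=True), 1))   # all hits
print(implied_max_read_rate(8))                           # read-only scan
```

With a nominal 8MB/s limit this gives 20/30 of the budget (about 5.3MB/s) when nothing is resident, 20/21 of it (roughly 7.6MB/s) when everything is, and an implied 16MB/s read ceiling.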
Honestly, I think the best place for this work is in something like
pgtune. It's totally useful to have a calculator for this stuff (for
many of the same reasons that it's useful to have explain.depesz.com)
but the abstraction being proposed is leaky enough that I think it's
bound to cause confusion.
> This is all fine, but what does it have to do with the current patch? I
> mean, if we change vacuum to do some stuff differently, it's still going
> to have to read and dirty pages and thus account for I/O.
Yeah, I drifted off topic there a bit. I think the only relevant
point in all that is that even if we all agreed that this is an
improvement, I'd be reluctant to slap a band-aid on something that I
think needs surgery.