Re: [PERFORM] encouraging index-only scans

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] encouraging index-only scans
Date: 2014-02-05 00:14:51
Message-ID: CAMkU=1zShBFuzryeX0eibVt4fAfxnX_-drfZGRRB8jxBSN0HKA@mail.gmail.com
Lists: pgsql-hackers pgsql-performance

On Mon, Feb 3, 2014 at 8:55 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I've also had some further thoughts about the right way to drive
> vacuum scheduling. I think what we need to do is tightly couple the
> rate at which we're willing to do vacuuming to the rate at which we're
> incurring "vacuum debt". That is, if we're creating 100kB/s of pages
> needing vacuum, we vacuum at 2-3MB/s (with default settings).

If we can tolerate vacuuming at 2-3MB/s without adverse impact on other
work, then we can tolerate it. Do we gain anything substantial by
sandbagging it?
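
(For reference on where that 2-3MB/s comes from: a rough back-of-the-envelope
sketch, assuming the 9.3-era defaults of vacuum_cost_limit = 200,
autovacuum_vacuum_cost_delay = 20ms, vacuum_cost_page_miss = 10,
vacuum_cost_page_dirty = 20, and 8kB blocks, taking the worst case where
every page scanned is both a cache miss and dirtied:

    -- illustrative arithmetic only, not an exact model of the throttling loop
    SELECT (200 / 0.020)           -- ~10,000 cost credits available per second
           / (10 + 20)             -- worst-case cost per page -> ~333 pages/s
           * 8192 / 1024.0 / 1024  -- pages/s -> MB/s
           AS approx_vacuum_mb_per_sec;   -- roughly 2.6 MB/s

Anything faster than that has to come from raising the cost limit or
shortening the delay.)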

> If
> we're creating 10MB/s of pages needing vacuum, we *still* vacuum at
> 2-3MB/s. Not shockingly, vacuum gets behind, the database bloats, and
> everything goes to heck.

(Your reference to bloat made me think your comments here are about
vacuuming in general, not specific to IOS. If that isn't the case, then
please ignore.)

If we can only vacuum at 2-3MB/s without adversely impacting other
activity, but we are creating 10MB/s of future vacuum need, then there are
basically two possibilities I can think of. Either the 10MB/s represents a
spike, and vacuum should tolerate it and hope to catch up on the debt
later. Or it represents a new permanent condition, in which case I bought
too few hard drives for the workload, and no scheduling decision that
autovacuum can make will save me from my folly. Perhaps there is some
middle ground between those possibilities, but I don't see room for much
middle ground.

I guess there might be entirely different possibilities that don't fall
between those two; for example, I don't realize I'm doing something that is
generating 10MB/s of vacuum debt, and I would like to have this thing I'm
doing be automatically throttled to the point that it doesn't interfere with
other processes (either directly, or indirectly through bloat).

> The rate of vacuuming needs to be tied
> somehow to the rate at which we're creating stuff that needs to be
> vacuumed. Right now we don't even have a way to measure that, let
> alone auto-regulate the aggressiveness of autovacuum on that basis.
>

There is the formula used to decide when a table gets vacuumed. Isn't the
time delta in this formula a measure of how fast we are creating stuff that
needs to be vacuumed for bloat reasons? Is your objection that it doesn't
include other reasons we might want to vacuum, or that it just doesn't work
very well, or that it is not explicitly exposed?
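
(For reference, the usual form of that formula is the documented trigger
condition: autovacuum considers a table once n_dead_tup exceeds
autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples.
A rough sketch of how far each table is from that point, assuming the
default settings of 50 and 0.2 and no per-table overrides:

    -- sketch only; reltuples is itself just an estimate
    SELECT s.relname,
           s.n_dead_tup,
           50 + 0.2 * c.reltuples AS autovacuum_trigger_point,
           s.last_autovacuum
    FROM pg_stat_user_tables s
    JOIN pg_class c ON c.oid = s.relid
    ORDER BY s.n_dead_tup DESC;

Watching n_dead_tup grow between two samples of that query is about the
closest thing we have today to a "debt creation rate".)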

> Similarly, for marking of pages as all-visible, we currently make the
> same decision whether the relation is getting index-scanned (in which
> case the failure to mark those pages all-visible may be suppressing
> the use of index scans or making them less effective) or whether it's
> not being accessed at all (in which case vacuuming it won't help
> anything, and might hurt by pushing other pages out of cache).

If it is not getting accessed at all because the database is not very
active right now, that would be the perfect time to vacuum it. Between "I
can accurately project current patterns of (in)activity into the future"
and "People don't build large tables just to ignore them forever", I think
the latter is more likely to be true. If the system is busy but this
particular table is not, then that would be a better reason to
de-prioritise vacuuming that table. But can this degree of reasoning
really be implemented in a practical way? In core?
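
(A crude way to see which tables are in that situation today -- being
scanned a lot but with few pages marked all-visible -- is to compare
relallvisible with relpages; a sketch, bearing in mind that relallvisible
is only as fresh as the last VACUUM or ANALYZE:

    -- sketch; ordinary tables only, most-scanned first
    SELECT c.relname,
           c.relpages,
           c.relallvisible,
           round(100.0 * c.relallvisible / nullif(c.relpages, 0), 1)
               AS pct_all_visible,
           s.seq_scan,
           s.idx_scan
    FROM pg_class c
    JOIN pg_stat_user_tables s ON s.relid = c.oid
    WHERE c.relkind = 'r'
    ORDER BY s.seq_scan + s.idx_scan DESC;

That still doesn't tell you whether any of those scans could have been
index-only, which I think is Robert's point below.)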

> Again,
> if we had better statistics, we could measure this - counting heap
> fetches for actual index-only scans plus heap fetches for index scans
> that might have been planned index-only scans but for the relation
> having too few all-visible pages doesn't sound like an impossible
> metric to gather.

My experience has been that if too few pages are all visible, it generally
switches to a seq scan, not an index scan of a different index. But many
things that are semantically possible to be index-only-scans would never be
planned that way even if allvisible were 100%, so I think it would have to
do two planning passes, one with the real allvisible, and a hypothetical
one with allvisible set to 100%. And then there is the possibility that,
while a high allvisible would be useful, the table is so active that no
amount of vacuuming could ever keep it high.
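
(The per-query version of that number is at least visible already:
EXPLAIN ANALYZE prints a "Heap Fetches" line for index-only scans. A
made-up example just to show where it appears -- the table and figures are
hypothetical:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT id FROM some_table WHERE id BETWEEN 1000 AND 2000;
    --  Index Only Scan using some_table_pkey on some_table ...
    --    Heap Fetches: 137

What we don't have is that counter aggregated per relation, or counted for
plans that were only almost index-only, which is what Robert is asking for.)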

Cheers,

Jeff
