Re: index-only scans

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Kääriäinen Anssi <anssi(dot)kaariainen(at)thl(dot)fi>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "postgres(at)cybertec(dot)at" <postgres(at)cybertec(dot)at>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index-only scans
Date: 2011-09-26 00:46:22
Message-ID: CA+TgmoYzA-NNDWYh_DU+C+LX_+Ho5SeNJDQnrzkFYq3OCu0CVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Sep 25, 2011 at 2:43 PM, Marti Raudsepp <marti(at)juffo(dot)org> wrote:
> On Sun, Aug 14, 2011 at 00:31, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> That is somewhat compensated by the fact that tuples that are accessed more
>> often are also more likely to be in cache. Fetching the heap tuple to check
>> visibility is very cheap when the tuple is in cache.
>>
>> I'm not sure how far that compensates it, though. I'm sure there's typically
>> nevertheless a fairly wide range of pages that have been modified since the
>> last vacuum, but not in cache anymore.
>
> Would it make sense to re-evaluate the visibility bit just before a
> page gets flushed out from shared buffers? On a system with no long
> transactions, it seems likely that a dirty page is already all-visible
> by the time bgwriter (or shared buffers memory pressure) gets around
> to writing it out. That way we don't have to wait for vacuum to do it
> and would make your observation hold more often.

This has been suggested before, and, sure, there might be cases where
it helps. But you need to choose your test case fairly carefully.
For example, if you're doing a large sequential scan on a table, the
ring-buffer logic causes processes to evict their own pages, and so
the background writer doesn't get a chance to touch any of those
pages. You need some kind of a workload where pages are being evicted
from shared buffers slowly enough that it ends up being the background
writer, rather than the individual backends, that do the work. But if
you have that kind of workload, then we can infer that most of your
working set fits into shared buffers. And in that case you don't
really need index-only scans in the first place. The main point of
index only scans is to optimize the case where you have a gigantic
table and you're trying to avoid swamping the system with random I/O.
I'm not saying that such a change would be a particularly bad idea,
but I'm not personally planning to work on it any time soon because I
can't convince myself that it really helps all that much.

I think the real solution to getting visibility map bits set is to
vacuum more frequently, but have it be cheaper each time. Our default
autovacuum settings vacuum the table when the number of updates and
deletes reaches 20% of the table size. But those settings were put in
place under the assumption that we'll have to scan the entire heap,
dirtying every page that contains dead tuples, scan all the indexes
and remove the associated index pointers, and then scan and dirty the
heap pages that contain now-dead line pointers a second time to remove
those. The visibility map has eroded those assumptions to some
degree, because now we probably won't have to scan the entire heap
every time we vacuum; and I'm hoping we're going to see some further
erosion. Pavan has a pending patch which, if we can work out the
details, will eliminate the second heap scan; and we've also talked
about doing the index scan only when there are enough dead line
pointers to justify the effort. That, it seems to me, would open the
door to lowering the scale factor, maybe by quite a bit - which, in
turn, would help us control bloat better and get visibility map bits
set sooner.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-09-26 00:50:42 Re: [v9.2] Fix Leaky View Problem
Previous Message Kevin Grittner 2011-09-25 22:30:18 Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)