Re: Automatic free space map filling

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Christopher Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Automatic free space map filling
Date: 2006-03-02 16:07:25
Message-ID: 200603021607.k22G7PW03723@candle.pha.pa.us
Lists: pgsql-hackers

Christopher Browne wrote:
> What is unclear to me in the discussion is whether or not this is
> invalidating the item on the TODO list...
>
> -------------------
> Create a bitmap of pages that need vacuuming
>
> Instead of sequentially scanning the entire table, have the background
> writer or some other process record pages that have expired rows, then
> VACUUM can look at just those pages rather than the entire table. In
> the event of a system crash, the bitmap would probably be
> invalidated. One complexity is that index entries still have to be
> vacuumed, and doing this without an index scan (by using the heap
> values to find the index entry) might be slow and unreliable,
> especially for user-defined index functions.
> -------------------
>
> It strikes me as a non-starter to draw vacuum work directly into the
> foreground; there is a *clear* loss in that the death of the tuple
> can't actually take place at that point, due to MVCC and the fact that
> it is likely that other transactions will be present, keeping the
> tuple from being destroyed.
>
> But it would *seem* attractive to do what is in the TODO, above.
> Alas, the user defined index functions make cleanout of indexes much
> more troublesome :-(. But what's in the TODO is still "wholesale,"
> albeit involving more targetted selling than the usual Kirby VACUUM
> :-).

What bothers me about the TODO item is that if we have to sequentially
scan indexes, are we really gaining much by not having to sequentially
scan the heap? If the heap is large enough to gain from a bitmap, the
index is going to be large too. Is disabling per-index cleanout for
expression indexes the answer?

The entire expression index problem is outlined in this thread:

http://archives.postgresql.org/pgsql-hackers/2006-02/msg01127.php

I don't think it is a show-stopper because if we fail to find the index
entry that matches the heap tuple, we know we have a problem and can
report it and fall back to a sequential index scan.
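
To make the fallback idea concrete, here is a rough sketch in Python. It
is an illustration only, not PostgreSQL code: the index is modelled as a
dict mapping a key to a set of tuple ids, and the names
vacuum_index_entry, full_index_scan_remove, expr, and log are all
hypothetical.

```python
def full_index_scan_remove(index, tid):
    """Fallback path: walk every index entry and drop any pointing at tid."""
    removed = False
    for key in list(index):
        if tid in index[key]:
            index[key].discard(tid)
            removed = True
            if not index[key]:
                del index[key]
    return removed


def vacuum_index_entry(index, expr, heap_row, tid, log):
    """Try a targeted delete by recomputing the index key from the heap row.

    If the recomputed key does not lead to the entry (for example because
    a user-defined function behind an expression index does not return
    the same value it did at insert time), report the mismatch and fall
    back to scanning the whole index.
    """
    key = expr(heap_row)
    tids = index.get(key)
    if tids is not None and tid in tids:
        tids.discard(tid)
        if not tids:
            del index[key]
        return "targeted"
    # Assumption: reporting is modelled as appending to a list.
    log.append(f"index entry for tid {tid} not found under key {key!r}")
    full_index_scan_remove(index, tid)
    return "fallback"
```

In this sketch a well-behaved expression takes the cheap "targeted"
path, and a misbehaving one is detected because the lookup misses, which
is what makes the problem reportable rather than silent.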

Anyway, as I remember, if you have a 20-gig table, a vacuum / sequential
scan is painful, but if we have to sequentially scan all the indexes,
that is probably just as painful. If we can't make headway there and we
can't clean out indexes without a sequential index scan, I think we
should just remove the TODO item and give up on improving vacuum
performance.

For the bitmaps, index-only scans require a bit that says "all page
tuples are visible" while vacuum wants "some tuples are expired".
DELETE would clear both bits, while INSERT would clear just the first,
and UPDATE is a mix of INSERT and DELETE, though perhaps on different
pages.
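
A minimal Python sketch of the two per-page bits follows. It is not
PostgreSQL code. It assumes both bits are stored in the "clean" sense,
so the vacuum bit is kept as "no expired tuples" and "clearing" it marks
the page for vacuum, and it assumes the table starts out freshly
vacuumed. The class and method names are hypothetical.

```python
class PageBits:
    """Two per-page flags, both stored in the 'clean' sense (an assumption)."""

    def __init__(self, npages):
        # Assumption: the table starts freshly vacuumed, so all bits are set.
        self.all_visible = [True] * npages  # index-only scans may trust the page
        self.no_expired = [True] * npages   # vacuum may skip the page

    def on_insert(self, page):
        # A new tuple is not yet visible to every transaction.
        self.all_visible[page] = False

    def on_delete(self, page):
        # The deleted tuple is no longer visible to everyone and will
        # eventually need vacuuming, so both bits are cleared.
        self.all_visible[page] = False
        self.no_expired[page] = False

    def on_update(self, old_page, new_page):
        # UPDATE behaves as DELETE of the old version plus INSERT of the
        # new version, possibly on a different page.
        self.on_delete(old_page)
        self.on_insert(new_page)

    def pages_to_vacuum(self):
        # Vacuum visits only pages whose 'no expired tuples' bit is cleared.
        return [p for p, clean in enumerate(self.no_expired) if not clean]
```

With an UPDATE that moves a row from page 1 to page 3, only page 1 ends
up on the vacuum list, while both pages lose their all-visible bit.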

--
Bruce Momjian http://candle.pha.pa.us
SRA OSS, Inc. http://www.sraoss.com

+ If your life is a hard drive, Christ can be your backup. +
