Re: Dead Space Map

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dead Space Map
Date: 2006-02-27 19:48:22
Message-ID: Pine.OSF.4.61.0602272114460.130558@kosh.hut.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 27 Feb 2006, Tom Lane wrote:

> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
>> Vacuum will need to be modified to use index lookups to find index tuples
>> corresponding the dead heap tuples. Otherwise you have to scan through
>> all the indexes anyway.
>
> This strikes me as a fairly bad idea, because it makes VACUUM dependent
> on correct functioning of user-written code --- consider a functional
> index involving a user-written function that was claimed to be immutable
> and is not.

If the user-defined function is broken, you're in more or less trouble
anyway. A VACUUM FULL or REINDEX can be used to recover.

> There are concurrency-safety issues too, I think, having to
> do with the way that btree ensures we don't delete any index tuple that
> some scan is stopped on.

Hmm, I see. I'll have to study the btree implementation more thoroughly.

>> * implementation of index-only scans
>
>> An index scan would not have to check the visibility information of heap
>> tuples on those heap pages that are marked as clean in the dead space map.
>> This requires that the dead space map is implemented so that a page is
>> reliably marked as dirty in all circumstances when it contains any tuples
>> that are not visible to all backends.
>
> The "reliably" part of this is likely to make it a non-starter.

AFAICS all heap access goes through the functions in heapam.c. They need
to be modified to update the dead space map. Also on recovery. The
locking semantics of the dead space map need to be thought out, but I
don't see any insurmountable problems.

> Another
> problem is that the semantics needed by this are not quite the same as
> the semantics of whether a page needs to be visited by vacuum.

True. We might have to have two bits per page. It's still not that
bad, though, compared to the benefit.

>> My current implementation stores a bitmap of 32k bits in the special space
>> of every 32k heap pages. Each bit in the bitmap corresponds one heap page.
>> The bit is set every time a tuple is updated, and it's cleared by vacuum.
>> This is a very simple approach, and doesn't take much space.
>
> I thought the plan was to use out-of-line storage associated with each
> table "segment" file.

You're probably referring to Alvaro's auto-vacuum todo list from July:

http://archives.postgresql.org/pgsql-hackers/2005-07/msg01029.php

I find it more attractive to put the bitmap in the special space, for the
reasons stated earlier by Jan Wieck:

http://archives.postgresql.org/pgsql-hackers/2004-03/msg00957.php

That is, so that you can utilize the common buffer management code. Jan
also had a plan there for the locking.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeffrey W. Baker 2006-02-27 20:01:09 Re: Any conclusion on the Xeon context-switching issue?
Previous Message Josh Berkus 2006-02-27 19:41:14 Re: pg_config, pg_service.conf, postgresql.conf ....