Re: Dead Space Map

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dead Space Map
Date: 2006-02-27 23:44:56
Message-ID: 20060227234456.GU82012@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 27, 2006 at 03:05:41PM -0500, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> > On Mon, 27 Feb 2006, Tom Lane wrote:
> >> This strikes me as a fairly bad idea, because it makes VACUUM dependent
> >> on correct functioning of user-written code --- consider a functional
> >> index involving a user-written function that was claimed to be immutable
> >> and is not.
>
> > If the user-defined function is broken, you're in more or less trouble
> > anyway.
>
> Less. A non-immutable function might result in lookup failures (not
> finding the row you need) but not in database corruption, which is what
> would ensue if VACUUM fails to remove an index tuple. The index entry
> would eventually point to a wrong table entry, after the table item slot
> gets recycled for another tuple.

Is there some (small) metadata that could be stored in the index to
protect against this, perhaps XID? Granted, it's another 4 bytes, but it
would only need to be in functional indexes. And there could still be a
means to turn it off, if you're 100% certain that the function is
immutable. lower() is probably the biggest use-case here...

> Moreover, you haven't pointed to any strong reason to adopt this
> methodology. It'd only be a win when vacuuming pretty small numbers
> of tuples, which is not the design center for VACUUM, and isn't likely
> to be the case in practice either if you're using autovacuum. If you're
> removing say 1% of the tuples, you are likely to be hitting every index
> page to do it, meaning that the scan approach will be significantly
> *more* efficient than retail lookups.

The use case is any large table that sees updates in 'hot spots'.
Anything that's based on current time is a likely candidate, since often
most activity only concerns the past few days of data.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Marc G. Fournier 2006-02-28 00:36:17 In case nobody has seen this survey from Sun ...
Previous Message Jim C. Nasby 2006-02-27 23:39:06 Re: Dead Space Map