Re: Dead Space Map

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dead Space Map
Date: 2006-02-27 23:44:56
Message-ID: 20060227234456.GU82012@pervasive.com
Lists: pgsql-hackers
On Mon, Feb 27, 2006 at 03:05:41PM -0500, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> > On Mon, 27 Feb 2006, Tom Lane wrote:
> >> This strikes me as a fairly bad idea, because it makes VACUUM dependent
> >> on correct functioning of user-written code --- consider a functional
> >> index involving a user-written function that was claimed to be immutable
> >> and is not.
> 
> > If the user-defined function is broken, you're in more or less trouble 
> > anyway.
> 
> Less.  A non-immutable function might result in lookup failures (not
> finding the row you need) but not in database corruption, which is what
> would ensue if VACUUM fails to remove an index tuple.  The index entry
> would eventually point to a wrong table entry, after the table item slot
> gets recycled for another tuple.
 
Is there some small piece of metadata that could be stored in the index
entry to protect against this, perhaps the XID? Granted, that's another
4 bytes, but it would only be needed in functional indexes. And there
could still be a means to turn it off if you're 100% certain that the
function is immutable. lower() is probably the biggest use case here...
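
To make the idea concrete, here is a toy Python model of it, not PostgreSQL code. Everything in it is an assumption made for illustration: the names HeapTuple, IndexEntry and lookup are hypothetical, the stored XID is taken to be the xmin of the indexed tuple, and XID wraparound is ignored. It only shows how a stale index entry could be detected once the heap slot has been recycled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HeapTuple:
    xmin: int       # XID of the transaction that inserted this tuple
    payload: str


@dataclass
class IndexEntry:
    key: str        # value of the indexed expression, e.g. lower(name)
    slot: int       # heap slot the entry points at
    xid: int        # hypothetical extra 4 bytes: xmin of the indexed tuple


def lookup(index: List[IndexEntry], heap: Dict[int, HeapTuple],
           key: str, check_xid: bool) -> List[str]:
    """Return payloads reachable through index entries matching `key`."""
    results = []
    for entry in index:
        if entry.key != key:
            continue
        tup = heap.get(entry.slot)
        if tup is None:
            continue  # slot is empty, nothing to return
        if check_xid and tup.xmin != entry.xid:
            continue  # slot was recycled by another tuple: entry is stale
        results.append(tup.payload)
    return results


# One row, inserted by XID 100 and indexed under the key "alice".
heap = {1: HeapTuple(xmin=100, payload="alice")}
index = [IndexEntry(key="alice", slot=1, xid=100)]

# The row dies and VACUUM frees the heap slot, but (because the indexed
# function was not really immutable) it never finds the index entry.
del heap[1]

# Later, XID 205 inserts an unrelated row that reuses slot 1.
heap[1] = HeapTuple(xmin=205, payload="bob")

unchecked = lookup(index, heap, "alice", check_xid=False)  # follows stale entry
checked = lookup(index, heap, "alice", check_xid=True)     # rejects stale entry
```

Without the check, the stale entry hands back the unrelated row; with it, the mismatch between the stored XID and the tuple's xmin causes the entry to be skipped.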

> Moreover, you haven't pointed to any strong reason to adopt this
> methodology.  It'd only be a win when vacuuming pretty small numbers
> of tuples, which is not the design center for VACUUM, and isn't likely
> to be the case in practice either if you're using autovacuum.  If you're
> removing say 1% of the tuples, you are likely to be hitting every index
> page to do it, meaning that the scan approach will be significantly
> *more* efficient than retail lookups.

The use case is any large table that sees updates in 'hot spots'.
Anything that's based on current time is a likely candidate, since often
most activity only concerns the past few days of data.
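
As a rough back-of-envelope sketch of why hot spots change the picture, the Python below counts how many index leaf pages hold dead entries when 1% of the tuples are dead. The numbers (one million entries, 200 entries per leaf page) are made up, the index is assumed to be ordered on the same time column that defines the hot spot, and the cost of descending the tree for each retail lookup is ignored.

```python
import math


def leaf_pages_touched(dead_positions, entries_per_page):
    """Count distinct leaf pages containing at least one dead entry.

    Positions are entry offsets in index order, so position // page size
    identifies the leaf page in this simplified model.
    """
    return len({pos // entries_per_page for pos in dead_positions})


total_entries = 1_000_000          # assumed index size
entries_per_page = 200             # assumed leaf page fan-out
total_pages = math.ceil(total_entries / entries_per_page)

dead_count = total_entries // 100  # 1% of the tuples are dead

# Case 1: dead entries spread evenly across the whole index.
uniform_dead = range(0, total_entries, 100)

# Case 2: dead entries concentrated in the most recent 1% of the index.
hot_spot_dead = range(total_entries - dead_count, total_entries)

uniform_pages = leaf_pages_touched(uniform_dead, entries_per_page)
hot_pages = leaf_pages_touched(hot_spot_dead, entries_per_page)

print(f"total leaf pages:         {total_pages}")
print(f"uniform 1% dead touches:  {uniform_pages}")
print(f"hot-spot 1% dead touches: {hot_pages}")
```

In the evenly spread case every leaf page holds a dead entry, which matches the point about the scan being cheaper. In the hot-spot case only a small fraction of the pages do, though this only holds for indexes whose order follows the hot spot; other indexes on the same table may still have their dead entries scattered.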
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby(at)pervasive(dot)com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
