Skip site navigation (1) Skip section navigation (2)

Re: Dead Space Map

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dead Space Map
Date: 2006-02-28 10:38:51
Message-ID: Pine.OSF.4.61.0602280908490.76425@kosh.hut.fi (view raw or flat)
Thread:
Lists: pgsql-hackers
On Mon, 27 Feb 2006, Tom Lane wrote:

> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
>> On Mon, 27 Feb 2006, Tom Lane wrote:
>>> This strikes me as a fairly bad idea, because it makes VACUUM dependent
>>> on correct functioning of user-written code --- consider a functional
>>> index involving a user-written function that was claimed to be immutable
>>> and is not.
>
>> If the user-defined function is broken, you're in more or less trouble
>> anyway.
>
> Less.  A non-immutable function might result in lookup failures (not
> finding the row you need) but not in database corruption, which is what
> would ensue if VACUUM fails to remove an index tuple.  The index entry
> would eventually point to a wrong table entry, after the table item slot
> gets recycled for another tuple.

It's easy to detect when it happens (that you don't find the row in the 
index). You can then complain loudly or fall back to the full scan.

> Moreover, you haven't pointed to any strong reason to adopt this
> methodology.  It'd only be a win when vacuuming pretty small numbers
> of tuples, which is not the design center for VACUUM, and isn't likely
> to be the case in practice either if you're using autovacuum.  If you're
> removing say 1% of the tuples, you are likely to be hitting every index
> page to do it, meaning that the scan approach will be significantly
> *more* efficient than retail lookups.

It really depends a lot on what kind of indexes the table has, clustering 
order, and what kind of deletions and updates happen. Assuming a random 
distribution of dead tuples, a scan is indeed going to be more efficient 
as you say.

Assuming a tightly packed bunch of dead tuples, however, say after a 
"delete from log where timestamp < now() - 1 month", you would benefit 
from a partial vacuum.

That's certainly something that needs to be tested. It's important to 
implement so that it falls back to full scan when it's significantly 
faster.

- Heikki

In response to

pgsql-hackers by date

Next:From: Hans-Juergen SchoenigDate: 2006-02-28 12:02:04
Subject: Re: new feature: LDAP database name resolution
Previous:From: Hannu KrosingDate: 2006-02-28 08:17:41
Subject: Re: Dead Space Map

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group