Re: Dead Space Map

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dead Space Map
Date: 2006-02-28 06:04:00
Message-ID: 10601.1141106640@sss.pgh.pa.us
Lists: pgsql-hackers

"Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> On Mon, Feb 27, 2006 at 03:05:41PM -0500, Tom Lane wrote:
>> Moreover, you haven't pointed to any strong reason to adopt this
>> methodology. It'd only be a win when vacuuming pretty small numbers
>> of tuples, which is not the design center for VACUUM, and isn't likely
>> to be the case in practice either if you're using autovacuum. If you're
>> removing say 1% of the tuples, you are likely to be hitting every index
>> page to do it, meaning that the scan approach will be significantly
>> *more* efficient than retail lookups.

> The use case is any large table that sees updates in 'hot spots'.
> Anything that's based on current time is a likely candidate, since often
> most activity only concerns the past few days of data.

I'm unmoved by that argument too. If the updates are clustered then
another effect kicks in: the existing btbulkdelete approach is able to
collapse all the deletions on a given index page into one WAL record.
With retail deletes it'd be difficult if not impossible to do that,
resulting in a significant increase in WAL traffic during a vacuum.
(We know it's significant because we saw a good improvement when we
fixed btbulkdelete to work that way, instead of issuing a separate
WAL record per deleted index entry as it once did.)

regards, tom lane
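
Editorial illustration (not part of the original message): the two arguments above can be made concrete with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code. The function names, the (page, offset) representation of an index entry, and the figure of 200 entries per index page are all hypothetical. The model assumes that retail deletion emits one WAL record per deleted entry, that the bulk approach emits one WAL record per index page containing deletions, and, for the page-coverage estimate, that dead entries are scattered uniformly and independently across index pages.

```python
def wal_records_retail(dead_entries):
    """Toy model: one WAL record per deleted index entry."""
    return len(dead_entries)


def wal_records_bulk(dead_entries):
    """Toy model: one WAL record per index page with at least one deletion."""
    return len({page for page, _offset in dead_entries})


def fraction_pages_touched(dead_fraction, entries_per_page):
    """Expected fraction of index pages holding at least one dead entry,
    assuming dead entries are spread uniformly and independently."""
    return 1.0 - (1.0 - dead_fraction) ** entries_per_page


# Clustered "hot spot" case: 1000 dead entries packed onto 5 index pages
# (hypothetical numbers), represented as (page, offset) pairs.
clustered = [(page, offset) for page in range(5) for offset in range(200)]

retail = wal_records_retail(clustered)   # one record per entry
bulk = wal_records_bulk(clustered)       # one record per page

# Scattered case: 1% dead entries, assumed 200 entries per index page.
touched = fraction_pages_touched(0.01, 200)
```

In this model the clustered case needs 1000 WAL records with retail deletes but only 5 with per-page batching, and with 1% of entries dead and 200 entries per page, roughly 87% of index pages contain at least one dead entry, so a full index scan visits few pages needlessly.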
