btbulkdelete

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: pgsql-hackers(at)postgresql(dot)org
Subject: btbulkdelete
Date: 2004-04-25 21:34:13
Message-ID: lf9o801of48h46conuhvbj5p0jr6tbtiar@email.aon.at
Lists: pgsql-hackers

On pgsql-performance we have been discussing a configuration where a bulk
delete run takes almost a day (and this is not due to crappy hardware or
apparent misconfiguration). Unless I misinterpreted the numbers,
btbulkdelete() processes 85 index pages per second, while lazy vacuum is
able to clean up 620 heap pages per second.
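
To put the two rates on a common scale, here is a rough conversion into MiB/s. It assumes the default 8 kB block size, which the discussion above does not state, so treat the absolute numbers as illustrative only; the helper name is made up for this sketch.

```python
BLCKSZ = 8192  # assumption: default PostgreSQL block size in bytes


def throughput_mib_per_s(pages_per_second, block_size=BLCKSZ):
    """Convert a page rate into MiB/s for the given block size."""
    return pages_per_second * block_size / (1024 * 1024)


index_rate = throughput_mib_per_s(85)   # btbulkdelete(), index pages
heap_rate = throughput_mib_per_s(620)   # lazy vacuum, heap pages

print(f"index: {index_rate:.2f} MiB/s, heap: {heap_rate:.2f} MiB/s, "
      f"ratio: {620 / 85:.1f}x")
```

Under that assumption the index scan moves well under 1 MiB/s, which looks more like seek-bound than transfer-bound I/O.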

Is there a special reason for scanning the leaf pages in *logical*
order, i.e. by following the opaque->btpo_next links? Now that FSM
covers free btree index pages, this access pattern might be highly
nonsequential, since a recycled page can end up anywhere in the
sibling chain.

I'd expect the following scheme to be faster:

for blknum = 1 to nblocks {
    read block blknum;
    if (block is a leaf) {
        process it;
    }
}
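
A toy simulation of the two access patterns, only to make the difference concrete. This is not PostgreSQL code: the page mix, the fully shuffled sibling chain (a worst case), and all function names are assumptions made up for this sketch. Block 0 is treated as the metapage, which is why the physical scan starts at block 1.

```python
import random

META, INNER, LEAF, FREE = "meta", "inner", "leaf", "free"


def build_index(nblocks, seed=0):
    """Build a toy index file: block 0 is the metapage, the rest are a
    random mix of leaf, inner and free pages (assumed 90/5/5 weights).
    The leaf chain is shuffled to mimic heavy page recycling."""
    rng = random.Random(seed)
    kinds = [META] + [
        rng.choices([LEAF, INNER, FREE], weights=[90, 5, 5])[0]
        for _ in range(nblocks - 1)
    ]
    chain = [blk for blk, kind in enumerate(kinds) if kind == LEAF]
    rng.shuffle(chain)  # logical order no longer matches physical order
    return kinds, chain


def count_seeks(blocks):
    """Count reads that do not target the block right after the previous one."""
    seeks = 0
    prev = None
    for blk in blocks:
        if prev is None or blk != prev + 1:
            seeks += 1
        prev = blk
    return seeks


def physical_scan(kinds):
    """Read every block after the metapage in file order.
    Returns (blocks read, leaf blocks actually processed)."""
    read = list(range(1, len(kinds)))
    processed = [blk for blk in read if kinds[blk] == LEAF]
    return read, processed


kinds, chain = build_index(1000)
read, processed = physical_scan(kinds)
print("logical order: ", len(chain), "reads,", count_seeks(chain), "seeks")
print("physical order:", len(read), "reads,", count_seeks(read), "seeks")
print("wasted reads in physical order:", len(read) - len(processed))
```

Both scans process the same set of leaves; the physical scan trades a few extra reads for a single sequential pass.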

As there is no free lunch this has the downside that it pollutes the
cache with unneeded inner nodes and free pages.

OTOH there are far fewer inner pages than leaf pages (even a balanced
binary tree has more leaves than inner nodes), and if free pages become
a problem it's time to re-index.
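
A quick way to check the inner-vs-leaf claim: count the inner pages level by level for a given fanout. The fanout of 200 used below is a hypothetical figure for illustration, not a measured value, and the function name is invented for this sketch.

```python
def inner_page_count(n_leaves, fanout):
    """Count inner pages of a tree in which every inner page has up to
    `fanout` children, building levels upward until a single root remains."""
    inner = 0
    level = n_leaves
    while level > 1:
        level = (level + fanout - 1) // fanout  # ceiling division
        inner += level
    return inner


# Binary tree: 8 leaves need 4 + 2 + 1 = 7 inner nodes.
print(inner_page_count(8, 2))
# Hypothetical fanout of 200 with one million leaf pages.
print(inner_page_count(1_000_000, 200))
```

With a three-digit fanout the inner pages come to well under one percent of the leaf pages, so the extra reads in a physical scan are dominated by free pages, not inner pages.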

Did I miss something else?

Servus
Manfred
