Re: Page at a time index scan

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Page at a time index scan
Date: 2006-05-08 18:46:02
Message-ID: 28697.1147113962@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> So we just optimised for slowly-but-continually churning tables (i.e.
> DELETEs match INSERTs, or just UPDATEs). i.e. we just improved VACUUM
> performance for those that don't need it that often. That might be the
> common case, but it isn't the one thats hurting most.

No, we just improved VACUUM performance for those that need it most.
I was just doing some desultory experiments with today's code versus
yesterday's (it's easy to make a direct comparison because they're
initdb-compatible ... just stop one postmaster executable and start
another on the same database). I made a table of 16M rows with an
index over a random-data integer column. With a thoroughly disordered
index (built on-the-fly as the random data was inserted), the time to
VACUUM after deleting a small number of rows was 615 seconds with
yesterday's code, 31 seconds today. With a perfectly-ordered index
(identical table, but CREATE INDEX after all the data is in place), the
times were about 28 and 26 seconds respectively. (I wouldn't put a
*whole* lot of faith in these numbers, since they're from a Dell machine
with a toy ATA drive, but anyway they do show that sequential access to
the index makes a huge difference.) But perfectly-ordered indexes are
impossible to maintain unless your data is either static or insert-at-
the-end-only.

Anyway, while the extra LWLock and shared-memory access during a split
is annoying, I think it's really negligible (and so does pgbench).
A page split is going to involve many times that much work --- you've
got to acquire lock on at least four different buffers, visit the FSM,
possibly ask the kernel for another disk page, do XLogInsert twice, etc.
Any one of those things involves significantly more CPU effort and
contention than _bt_vacuum_cycleid(). So while I'd be happy to get rid
of it, not at the price of slowing down VACUUM again.

regards, tom lane

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Alvaro Herrera 2006-05-08 19:02:21 Re: [WIP] The relminxid addition, try 3
Previous Message Martijn van Oosterhout 2006-05-08 18:03:12 [PATCH] Round 2: Magic block for modules