Re: Equivalent praxis to CLUSTERED INDEX?

From: Mischa Sandberg <ischamay(dot)andbergsay(at)activestateway(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Equivalent praxis to CLUSTERED INDEX?
Date: 2004-08-26 21:00:47
Message-ID: 3csXc.56326$X12.25148@edtnps84
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Ummm ... not quite. In MSSQL/Sybase/Oracle, a clustered index maintains
its space saturation as part of each update operation. High activity
does indeed result in less-full pages (typically 60-80% full for tables
with heavy deletions or rowsize changes). To bring the percentage back
up, you run DBCC INDEXDEFRAG, which also does what you'd expect of a
normal file defragmenter -- put related disk pages together on the platter.

But the performance difference is hardly as severe as I gather it can be
if you neglect to vacuum.

As for SQL Server being a 'single-user database' ... ummm ... no, I
don't think so. I'm REALLY happy to be shut of the Microsoft world, but
MSSQL 7/2000/2005 is a serious big DB engine. It also has some serious
bright heads behind it. They hired Goetz Graefe and Paul (aka Per-Ake)
Larsen away from academia, and it shows, in the join and aggregate
processing. I'll be a happy camper if I manage to contribute something
to PG that honks the way their stuff does. Happy to discuss, too.

Josh Berkus wrote:
> Bruce,
>
>
>>How do vendors actually implement auto-clustering? I assume they move
>>rows around during quiet periods or have lots of empty space in each
>>value bucket.
>
>
> That's how SQL Server does it. In old versions (6.5) you had to manually
> send commands to update the cluster, same as PG. Also, when you create a
> cluster (or an index or table for that matter) you can manually set an amount
> of "space" to be held open on each data page for updates.
>
> Also keep in mind that SQL Server, as a "single-user database" has a much
> easier time with this. They don't have to hold several versions of an index
> in memory and collapse it into a single version at commit time.
>
> All that being said, we could do a better job of "auto-balancing" clustered
> tables. I believe that someone was working on this in Hackers through what
> they called "B-Tree Tables". What happened to that?
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Gaetano Mendola 2004-08-26 21:09:35 Re: Equivalent praxis to CLUSTERED INDEX?
Previous Message J. Andrew Rogers 2004-08-26 20:44:03 Re: Equivalent praxis to CLUSTERED INDEX?