Re: Equivalent praxis to CLUSTERED INDEX?

From: Mischa Sandberg <ischamay(dot)andbergsay(at)activestateway(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Equivalent praxis to CLUSTERED INDEX?
Date: 2004-09-08 15:38:51
Message-ID: fIF%c.159841$X12.48929@edtnps84
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Mischa Sandberg wrote:
> Coming from the MSSQL world, I'm used to the first step in optimization
> to be, choose your clustered index and choose it well.
> I see that PG has a one-shot CLUSTER command, but doesn't support
> continuously-updated clustered indexes.
> What I infer from newsgroup browsing is, such an index is impossible,
> given the MVCC versioning of records (happy to learn I'm wrong).
> I'd be curious to know what other people, who've crossed this same
> bridge from MSSQL or Oracle or Sybase to PG, have devised,
> faced with the same kind of desired performance gain for retrieving
> blocks of rows with the same partial key.

Just to let people know, after trying various options, this looks the
most promising:

- segment the original table into four tables (call them A,B,C,D)

- all insertions go into A.
- longterm data lives in B.

- primary keys of all requests to delete rows from (B) go into D -- no
actual deletions are done against B. Deletions against A happen as normal.

- all queries are made against a view: a union of A and B and (not
exists) D.

- daily merge A,B and (where not exists...) D, into C
- run cluster on C, then swap names on B and C, truncate A and D.

Not rocket science, but it seems to give the payback of normal
clustering without locking the table for long periods of time. It also
saves on VACUUM FULL time.

At present, we're only at 1M rows in B on this. More when I know it.
Advance warning on any gotchas with this approach would be much
appreciated. Making a complete copy of (B) is a bit of an ouch.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Vic Cekvenich 2004-09-09 12:56:20 Text Search vs MYSQL vs Lucene
Previous Message Marc Cousin 2004-09-08 15:17:47 Re: Problem with large query