Skip site navigation (1) Skip section navigation (2)

Re: CLUSTER all tables

From: Alvaro Herrera <alvherre(at)atentus(dot)com>
To: Alvaro Herrera <alvherre(at)atentus(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: CLUSTER all tables
Date: 2002-09-01 05:30:21
Message-ID: 20020901013021.738fd248.alvherre@atentus.com (view raw or flat)
Thread:
Lists: pgsql-patches
En Fri, 30 Aug 2002 19:02:47 -0400 (CLT)
Alvaro Herrera <alvherre(at)atentus(dot)com> escribió:

> What it does:
> 
> - if CLUSTER is called with no arguments, cluster all indexes that
> have  indisclustered set (in the current database).  There's no "ALL"
>   argument: that's just pollution IMHO.
> 
> - Gets a list of such indexes (checking ownership of each) and passes
>   them one by one to the standard cluster routine (modified a little
>   so it accepts OIDs of table and index, not names).

The same as before, but I do include documentation patch (feel free to
rewrite or suggest improvements) and regression test.


Now I'm thinking about concurrency: suppose table A and B have
indisclustered set on indexes ind_a and ind_b respectively.  The
user fires a CLUSTER without arguments; the backend will begin
clustering table A on ind_a.

Now, while this is going on, the user fires a CLUSTER on table B on
index ind_b_2, on a separate session.  This table is shorter than table
A and finishes first.

When the first cluster finishes clustering table A, it will start
clustering table B on ind_b.  This is because the cluster-all creates a
list of the tables to be clustered, and _then_ it clusters them one by
one.  So the info saved about table B is old and overrides the new
cluster that the user has done on another session.

The question is: is this a situation worth to protect against? and what
is the best way to do it?  I can see two ways:

1. allow only one cluster operation at the same time, with some kind of
  lock (can the lightweight lock manager be used for this?)

2. generate the list of tables as it goes.  This requires keeping
  pg_index open (with AccessShareLock) for a potentially long time (I
  don't need to tell you that cluster can be *slow*).  Is this
  acceptable?

Is there another?

-- 
Alvaro Herrera (<alvherre[a]atentus.com>)
"Granting software the freedom to evolve guarantees only different
results, not better ones." (Zygo Blaxell)

In response to

Responses

pgsql-patches by date

Next:From: Alvaro HerreraDate: 2002-09-01 06:29:25
Subject: Re: CLUSTER all tables
Previous:From: VishnuDate: 2002-09-01 04:55:41
Subject: syntax to access/retrieve data from multiple databases

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group