Skip site navigation (1) Skip section navigation (2)

Re: cluster table by two-column index ?

From: pgsql(at)mohawksoft(dot)com
To: "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Cc: "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cluster table by two-column index ?
Date: 2005-03-03 16:23:55
Message-ID: 16412.24.91.171.78.1109867035.squirrel@mail.mohawksoft.com (view raw or flat)
Thread:
Lists: pgsql-hackers
> I'm wondering,
> is there any sense to cluster table using two-column index ?
>
>
We've had this discussion a few weeks ago. Look at the archives for my
post "One Big Trend ...."

The problem is that while the statistics can resonably deal with the
primary column it completely misses the trends produced in the secondary
column. This situation can be seen quite clearly using the US Census TIGER
database.

I imagine the primary and secondary columns both have a discrete index and
the combined index is for the cluser or more complex queries.

If you execute a query based on the secondary column's index that should
return about 100 rows. The "smaller" trends in the column produced by the
cluster are not detected. So, rather then seeing that its probably a few
index seeks and a few table seeks because the data is fairly well grouped,
it opts, instead, to do a table scan because it doesn't see any
correlation.

Increasing the number of samples in ANALIZE helps a bit, but the solution
is better statistics or maybe hints that can be embedded into the query.

In response to

pgsql-hackers by date

Next:From: pgsqlDate: 2005-03-03 16:41:20
Subject: Re: bitmap AM design
Previous:From: Greg StarkDate: 2005-03-03 16:08:36
Subject: Re: 8.0.X and the ARC patent

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group