Re: TPC-H Scaling Factors X PostgreSQL Cluster Command

From: "Nelson Kotowski" <nkotowski(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: TPC-H Scaling Factors X PostgreSQL Cluster Command
Date: 2007-04-23 15:52:36
Message-ID: d34b24380704230852p2fe52a05qbd7397a4293ce5a9@mail.gmail.com
Lists: pgsql-performance

Hi Heikki,

Thanks for answering! :)

I don't understand how creating only the indexes I cluster on would improve
the performance of my cluster command. I believed the other indexes wouldn't
interfere, because so far they have been created in a reasonable time and
they don't refer to any field/column in the orders/lineitem tables. Could you
explain it to me again?

As for the load, when you say the right order to start with, do you mean I
should sort the load file by the indexed field of the table before loading it?
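
To make the question concrete, here is a minimal sketch of what "sorting the
load file" could look like. It assumes the DBGEN flat files are pipe-delimited
with one row per line, and that the clustering key is an integer in the first
column; the function name and the sample rows are made up for illustration,
not taken from my actual scripts.

```python
def sort_tbl_lines(lines, key_col=0):
    """Return the non-empty lines of a pipe-delimited flat file,
    sorted by an integer key column (assumed to be the clustering key)."""
    rows = [line for line in lines if line.strip()]
    return sorted(rows, key=lambda line: int(line.split("|")[key_col]))

# Hypothetical sample rows: key|some value|status|
sample = [
    "3|1234|O|\n",
    "1|370|O|\n",
    "2|781|F|\n",
]
sorted_lines = sort_tbl_lines(sample)
```

This sorts in memory, which is fine for a small sample; for the 5GB files an
external sort (for example the Unix sort utility with a field separator)
would probably be needed instead.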

Thanks in advance,
Nelson P Kotowski Filho.

On 4/23/07, Heikki Linnakangas <heikki(at)enterprisedb(dot)com> wrote:
>
> Nelson Kotowski wrote:
> > So far, I need to do it at three different scale factors (1, 2 and 5GB
> > databases).
> >
> > My build process consists of creating the tables without any foreign
> > keys, indexes, etc. - Running OK!
> > Then, I load the data from the flat files generated by the DBGEN
> > software into these tables. - Running OK!
> >
> > Finally, I run an "optimize" script that does the following:
> >
> > - Alter the tables to add the mandatory foreign keys;
> > - Create all mandatory indexes;
> > - Cluster the orders table by the orders table index;
> > - Cluster the lineitem table by the lineitem table index;
> > - Vacuum the database;
> > - Analyze statistics.
>
> Cluster will completely rewrite the table and indexes. On step 2, you
> should only create the indexes you're clustering on, and create the rest
> of them after clustering.
>
> Or even better, generate and load the data in the right order to start
> with, so you don't need to cluster at all.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>
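
For reference, the ordering suggested in the quoted reply could be sketched
roughly as below: create only the index used for clustering, cluster, and
only then create the remaining indexes. The helper name, the index names and
the column choices are hypothetical, foreign keys and VACUUM are left out,
and the CLUSTER statement uses the older "CLUSTER indexname ON tablename"
form (newer releases document "CLUSTER tablename USING indexname"), so the
generated SQL should be treated as an assumption to check against the
installed version.

```python
def optimize_steps(table, cluster_index, cluster_cols, other_indexes):
    """Build the SQL statements for one table, in the suggested order."""
    steps = [
        # 1. Only the index that CLUSTER will use exists at this point.
        f"CREATE INDEX {cluster_index} ON {table} ({cluster_cols});",
        # 2. Rewrite the table in index order.
        f"CLUSTER {cluster_index} ON {table};",
    ]
    # 3. The remaining indexes are created after clustering.
    for name, cols in other_indexes:
        steps.append(f"CREATE INDEX {name} ON {table} ({cols});")
    # 4. Refresh planner statistics.
    steps.append(f"ANALYZE {table};")
    return steps

# Hypothetical index names and columns, for illustration only.
steps = optimize_steps(
    "lineitem",
    "lineitem_orderkey_idx",
    "l_orderkey",
    [("lineitem_shipdate_idx", "l_shipdate")],
)
for statement in steps:
    print(statement)
```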
