Re: [HACKERS] Need some help on code

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: M(dot)Boekhold(at)et(dot)tudelft(dot)nl
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Need some help on code
Date: 1998-06-07 20:26:55
Message-ID: 199806072026.QAA23211@candle.pha.pa.us
Lists: pgsql-hackers

>
> Hi,
>
> I was trying to change the cluster command to do its writes in batches
> of 100 tuples, hoping to improve performance. However, the code
> I've written crashes. This certainly has to do with some internal state
> of pgsql that isn't preserved in a HeapTuple.
>
> Could somebody with knowledge take a brief look at my code and perhaps
> tell me how to do it properly?

I did not look at the code, but I can pretty much tell you that bunching
the writes will not help performance. We already do that pretty well
with the cache.

The problem with cluster is the normal problem of using an index to
seek into a data table where the data is not clustered on the index.
Every entry in the index requires a different page, and each page has to
be read in from disk.
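
To make the page-access argument concrete, here is a rough Python sketch
(not PostgreSQL code). It walks a table in index order and counts how often
the next entry lands on a different heap page than the previous one. The
names count_page_switches and demo, the figure of 100 tuples per page, and
the 10,000-row table are all assumptions made up for illustration, and the
model ignores the buffer cache entirely.

```python
import random


def count_page_switches(heap_positions, tuples_per_page):
    """Count page changes while visiting heap positions in index order.

    heap_positions[i] is the physical slot of the i-th index entry.
    Every change of page stands in for one page fetch (no cache modeled).
    """
    switches = 0
    last_page = None
    for pos in heap_positions:
        page = pos // tuples_per_page
        if page != last_page:
            switches += 1
            last_page = page
    return switches


def demo(n_tuples=10_000, tuples_per_page=100, seed=0):
    """Return (clustered_fetches, unclustered_fetches) for a toy table."""
    rng = random.Random(seed)
    clustered = list(range(n_tuples))   # index order == physical order
    unclustered = clustered[:]
    rng.shuffle(unclustered)            # index order unrelated to physical order
    return (count_page_switches(clustered, tuples_per_page),
            count_page_switches(unclustered, tuples_per_page))


if __name__ == "__main__":
    clustered, unclustered = demo()
    print("clustered page fetches:  ", clustered)
    print("unclustered page fetches:", unclustered)
```

In the clustered case each page is visited once, so the count equals the
number of pages. In the shuffled case nearly every index entry lands on a
different page than the one before it.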

Often the fastest way is to discard the index and just read the table,
sorting it in pieces and then merging the sorted pieces. That is what
psort, our sort code, does. That is why I recommend the SELECT INTO
solution if you have enough disk space.
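
The "sort in pieces, then merge" idea can be sketched in a few lines of
Python. This is only an illustration of the general technique, not how
psort is actually written: the function name sort_in_pieces and the
run_size parameter are hypothetical, and the runs are kept in memory here,
where a real external sort would write each run out to disk.

```python
import heapq


def sort_in_pieces(rows, key, run_size):
    """Sort rows by splitting them into runs, sorting each run, and merging.

    Illustration only: a real external sort would spill each sorted run
    to a temporary file instead of holding all of them in memory.
    """
    if run_size < 1:
        raise ValueError("run_size must be at least 1")

    # Pass 1: read the table in order and sort one piece at a time.
    runs = []
    for start in range(0, len(rows), run_size):
        runs.append(sorted(rows[start:start + run_size], key=key))

    # Pass 2: merge the sorted pieces into a single ordered stream.
    return list(heapq.merge(*runs, key=key))


if __name__ == "__main__":
    table = [(5, "e"), (1, "a"), (4, "d"), (2, "b"), (3, "c")]
    for row in sort_in_pieces(table, key=lambda r: r[0], run_size=2):
        print(row)
```

Both passes read their input front to back, which is why this approach
avoids the random page reads that an index scan over an unclustered table
incurs.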

Once it is clustered, subsequent clusters should be very fast, because
only the out-of-order entries cause random disk seeks.

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
