From: | Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | scrappy(at)hub(dot)org (The Hermit Hacker) |
Cc: | mimo(at)interdata(dot)com(dot)pl, hackers(at)postgreSQL(dot)org |
Subject: | Re: [HACKERS] sorting big tables :( |
Date: | 1998-05-20 15:02:10 |
Message-ID: | 199805201502.LAA14967@candle.pha.pa.us |
Lists: | pgsql-hackers |
> > I have an idea. Can he run CLUSTER on the data? If so, the sort will
> > not use small batches, and the disk space during sort will be reduced.
> > However, I think CLUSTER will NEVER finish on such a file, unless it is
> > already pretty well sorted.
>
> Okay...then we *do* have a table size limit problem? Tables that
> just get too large to be manageable? Maybe this is one area we should be
> looking at as far as performance is concerned?
Well, CLUSTER moves one row at a time, so if the table is very
fragmented, the code is slow because it is seeking all over the table.
See the CLUSTER manual page for an alternate solution that uses ORDER
BY.
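
To illustrate the ORDER BY alternative mentioned above, here is a minimal
sketch of the general idea: instead of moving rows one at a time, copy the
whole table through a single sorted SELECT and swap the copy into place.
It uses Python's built-in sqlite3 module as a stand-in, not PostgreSQL,
and the table name `big`, the column `id`, and the helper `rebuild_sorted`
are all hypothetical names chosen for the example. The exact statement
recommended by the manual page may differ.

```python
import random
import sqlite3


def rebuild_sorted(conn, table, key):
    """Rewrite `table` in `key` order by copying it through SELECT ... ORDER BY.

    Hypothetical helper for illustration only. Identifiers are interpolated
    directly into the SQL, so they must come from trusted code, not user input.
    """
    tmp = f"{table}_sorted"
    with conn:  # commit on success, roll back on error
        # One sequential sorted pass instead of row-at-a-time moves.
        conn.execute(f"CREATE TABLE {tmp} AS SELECT * FROM {table} ORDER BY {key}")
        conn.execute(f"DROP TABLE {table}")
        conn.execute(f"ALTER TABLE {tmp} RENAME TO {table}")


# Build a deliberately "fragmented" table: rows inserted in shuffled key order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (id INTEGER, payload TEXT)")
ids = list(range(1000))
random.Random(0).shuffle(ids)
conn.executemany("INSERT INTO big VALUES (?, ?)", [(i, f"row {i}") for i in ids])
conn.commit()

rebuild_sorted(conn, "big", "id")

# In SQLite, rowid reflects insertion order, so this reads the rows back
# in the order they were physically written by the rebuild.
physical_order = [row[0] for row in conn.execute("SELECT id FROM big ORDER BY rowid")]
row_count = conn.execute("SELECT count(*) FROM big").fetchone()[0]
```

Note that a copy made this way does not carry over indexes or
constraints from the original table, so those would have to be recreated
afterwards, and the copy temporarily needs room for a second full copy of
the data.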
>
> One thing that just pop'd to mind, concerning the above CLUSTER
> command...what would it take to have *auto-cluster'ng*? Maybe provide a
> means of marking a field in a table for this purpose?
Hard to do. That's what we have indexes for.
>
> One of the things that the Unix FS does is auto-defragmenting, at
> least the UFS one does. Whenever the system is idle (from my
> understanding), the kernel uses that time to clean up the file systems, to
> reduce the file system fragmentation.
>
> This is by no means SQL92, but it would be a neat
> "extension"...let me specify a "CLUSTER on" field. Then, as I'm entering
> data into the database, periodically check for fragmentation of the data
> and clean up accordingly. If done by the system, reasonably often, it
> shouldn't take up *too* much time, as most of the data should already be
> in order...
>
> That would have the side-benefit of speeding up the "ORDER by" on
> that field also...
We could actually have a CLUSTER ALL command that does this. No one has
implemented it yet.
--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)