Re: [HACKERS] sorting big tables :(

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: scrappy(at)hub(dot)org (The Hermit Hacker)
Cc: mimo(at)interdata(dot)com(dot)pl, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] sorting big tables :(
Date: 1998-05-20 14:22:15
Message-ID: 199805201422.KAA14065@candle.pha.pa.us
Lists: pgsql-hackers

>
> On Wed, 20 May 1998, Michal Mosiewicz wrote:
>
> > The Hermit Hacker wrote:
> >
> > > Now, as a text file, this would amount to, what...~50MB?
> > 40M of records to produce a 50MB text file? How would you sort such a
> > *compressed* file? ;-)
>
> My math off? 40M rows at 11bytes each (2xint4+int2+\n?) oops...ya, just
> off by a factor of ten...still, 500MB is a quarter of the size of the 2gig
> file we started with...

Actually, my description of the use of tape files was somewhat off.
The file is sorted by putting several batches in each tape file, then
reading the batches to make another tape file with bigger batches, until
there is one tape file holding one big sorted batch. Also, if the data
is already sorted, it can do it in one pass, without making all those
small batches, because of the way the data structure sorts them in
memory. Only Knuth can do the description justice, but suffice it to
say that the data can appear in up to two places at once.
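
For readers who want something concrete, here is a rough Python sketch of
the general technique described above. It is not the backend's actual code.
It assumes the in-memory data structure is a heap used for replacement
selection (as in Knuth), and the names make_runs, merge_passes, memory_slots
and fan_in are invented for illustration. Lists stand in for tape files.

```python
import heapq

_END = object()  # marker for "input exhausted"


def make_runs(items, memory_slots):
    """Split input into sorted batches (runs) using replacement selection.

    At most memory_slots items are held in the heap at any time.
    """
    if memory_slots < 1:
        raise ValueError("memory_slots must be at least 1")
    source = iter(items)
    heap = []
    for item in source:
        heap.append((0, item))  # (run number, value)
        if len(heap) == memory_slots:
            break
    heapq.heapify(heap)

    runs, current, current_no = [], [], 0
    while heap:
        run_no, value = heapq.heappop(heap)
        if run_no != current_no:
            # Nothing left in memory fits the current run: close it.
            runs.append(current)
            current, current_no = [], run_no
        current.append(value)
        nxt = next(source, _END)
        if nxt is not _END:
            # A new item can extend the current run only if it is not
            # smaller than the value just written out.
            heapq.heappush(heap, (run_no if nxt >= value else run_no + 1, nxt))
    if current:
        runs.append(current)
    return runs


def merge_passes(runs, fan_in=2):
    """Merge groups of fan_in runs into bigger runs until one run is left."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    passes = 0
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
        passes += 1
    return (runs[0] if runs else []), passes
```

With already-sorted input, make_runs returns a single run, so merge_passes
has nothing left to do. During each merge pass the input runs and the
output run exist at the same time, which is one way to read the remark
about data appearing in two places at once.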

This is the first time I remember someone complaining about it.

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
