Re: [HACKERS] sorting big tables :(

From: The Hermit Hacker <scrappy(at)hub(dot)org>
To: Michal Mosiewicz <mimo(at)interdata(dot)com(dot)pl>
Cc: hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] sorting big tables :(
Date: 1998-05-20 12:24:19
Message-ID: Pine.BSF.3.96.980520082051.14056G-100000@hub.org
Lists: pgsql-hackers

On Wed, 20 May 1998, Michal Mosiewicz wrote:

> The Hermit Hacker wrote:
>
> > Now, as a text file, this would amount to, what...~50MB?
> 40M of records to produce a 50MB text file? How would you sort such a
> *compressed* file? ;-)

Is my math off? 40M rows at 11 bytes each (2 x int4 + int2 + \n?)...oops, ya,
just off by a factor of ten. Still, roughly 500MB is a quarter of the size of
the 2GB file we started with...
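
A quick sanity check of that estimate, as a minimal sketch. The row count and the 11-byte row width come from the message above; the function name is made up for illustration, and a real text dump writes integers as decimal digits with delimiters between columns, so the actual size would differ.

```python
# Back-of-envelope size estimate for the table discussed in this thread.
ROWS = 40_000_000              # 40M rows, from the thread
BYTES_PER_ROW = 4 + 4 + 2 + 1  # 2 x int4 + int2 + newline, as in the message


def estimated_bytes(rows: int, bytes_per_row: int) -> int:
    """Hypothetical helper: fixed-width size estimate in bytes."""
    return rows * bytes_per_row


total = estimated_bytes(ROWS, BYTES_PER_ROW)
print(f"{total / 1e6:.0f} MB")  # prints "440 MB"
```

That is about 440MB, which rounds to the 500MB figure and is a little under a quarter of 2GB.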

> > So, if I were to do a 'copy out' to a text file, a Unix sort and then a
> > 'copy in', I would use up *less* disk space (by several orders of
> > magnitude) than doing the sort inside of PostgreSQL?
>
> Well, I think it might be optimised slightly. Am I right that postgres
> uses heap files (i.e. files that look like tables) during sorting? As this
> is a merge sort, those files don't have to be table-like files.
> Certainly, they might hold variable-length records without pages (aren't
> they used sequentially?). Moreover, we could consider packing tape files
> before writing them out if necessary. Of course, that will cost some
> performance. However, it's better to have lower performance than to be
> unable to sort at all.
>
> Last question... What's the purpose of such a big sort? If somebody gets
> 40M sorted records as the result of some query, what would he do with
> them? Is he going to spend the next few years reading through them? I mean,
> isn't it better to query the database for only the necessary information
> and then sort that?

this I don't know...I never even really thought about that,
actually...Michael? :) Only you can answer that one.
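
For reference, the idea quoted above (merge-sort run files holding variable-length records with no page structure, read sequentially) can be sketched roughly as follows. This is only an illustration under stated assumptions, not how PostgreSQL implements its sort: the function names, the 4-byte length prefix, and the use of temporary files are all made up for the example.

```python
import heapq
import struct
import tempfile


def write_run(records):
    """Write one sorted run as length-prefixed records; return the open file."""
    f = tempfile.TemporaryFile()
    for rec in sorted(records):
        f.write(struct.pack("<I", len(rec)))  # 4-byte length prefix, no pages
        f.write(rec)
    f.seek(0)
    return f


def read_run(f):
    """Yield records from a run file, strictly sequentially."""
    while True:
        header = f.read(4)
        if len(header) < 4:
            return
        (n,) = struct.unpack("<I", header)
        yield f.read(n)


def external_sort(records, run_size):
    """Sort bytes records while holding at most run_size of them per run."""
    runs, chunk = [], []
    for rec in records:
        chunk.append(rec)
        if len(chunk) >= run_size:
            runs.append(write_run(chunk))
            chunk = []
    if chunk:
        runs.append(write_run(chunk))
    try:
        # k-way merge of the sorted runs, one record per run in memory
        yield from heapq.merge(*(read_run(f) for f in runs))
    finally:
        for f in runs:
            f.close()
```

For example, `list(external_sort([b"pear", b"fig", b"apple"], 2))` writes two runs and merges them into sorted order. Because each record carries only a length prefix, the run files cost little more than the raw data.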
