Re: Merge algorithms for large numbers of "tapes"

From: "Zeugswetter Andreas DCP SD" <ZeugswetterA(at)spardat(dot)at>
To: "Dann Corbit" <DCorbit(at)connx(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Luke Lonergan" <llonergan(at)greenplum(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Merge algorithms for large numbers of "tapes"
Date: 2006-03-09 09:56:26
Message-ID: E1539E0ED7043848906A8FF995BDA579D991B7@m0143.s-mxs.net
Lists: pgsql-hackers


> > This amounts to an assumption that you have infinite work_mem, in
> > which case you hardly need an external sort at all. If your
> > work_mem is in fact finite, then at some point you need more than
> > two passes. I'm not really interested in ripping out support for
> > sort operations that are much larger than work_mem.
>
> No it does not. I have explained this before. You can have
> one million files and merge them all into a final output with
> a single pass. It does not matter how big they are or how
> much memory you have.

Huh? But if you have too many files, your disk access basically becomes
random access (since you have thousands of files per spindle).
From tests on AIX I have pretty much concluded that if you read 256k
blocks at a time, random access does not really hurt that much any more.
So, if you can hold 256k per file in memory, that should be sufficient.
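
To illustrate the point, here is a minimal sketch (not PostgreSQL code,
and the file layout, value type, and 256k figure are just assumptions):
a single-pass N-way merge of pre-sorted runs, one file per run, where
each input gets its own 256k stdio buffer so reads arrive in large
sequential chunks even when many runs share a spindle.

    /*
     * Hypothetical sketch: single-pass N-way merge of sorted runs of
     * int64 values, one file per run given on the command line.
     * Each run gets a large private read buffer (BUF_PER_RUN), so the
     * "random" access across runs still happens in 256 kB chunks.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    #define BUF_PER_RUN (256 * 1024)   /* 256 kB per run (assumption) */

    typedef struct
    {
        FILE    *fp;
        char    *buf;
        int64_t  head;       /* smallest unconsumed value of this run */
        int      exhausted;
    } Run;

    /* Load the next value of a run; mark the run exhausted at EOF. */
    static void run_advance(Run *r)
    {
        if (fread(&r->head, sizeof(r->head), 1, r->fp) != 1)
            r->exhausted = 1;
    }

    int main(int argc, char **argv)
    {
        int   nruns = argc - 1;
        Run  *runs = calloc(nruns, sizeof(Run));

        for (int i = 0; i < nruns; i++)
        {
            runs[i].fp = fopen(argv[i + 1], "rb");
            if (!runs[i].fp) { perror(argv[i + 1]); return 1; }
            runs[i].buf = malloc(BUF_PER_RUN);
            /* Large per-file buffer: must be set before any read. */
            setvbuf(runs[i].fp, runs[i].buf, _IOFBF, BUF_PER_RUN);
            run_advance(&runs[i]);
        }

        /*
         * Single merge pass: repeatedly emit the smallest head among
         * all runs.  A linear scan is O(N) per output value; a real
         * implementation would use a heap, but the I/O pattern (one
         * 256 kB refill per run, round-robin-ish) is the same.
         */
        for (;;)
        {
            int best = -1;

            for (int i = 0; i < nruns; i++)
                if (!runs[i].exhausted &&
                    (best < 0 || runs[i].head < runs[best].head))
                    best = i;
            if (best < 0)
                break;          /* all runs exhausted: merge complete */
            fwrite(&runs[best].head, sizeof(int64_t), 1, stdout);
            run_advance(&runs[best]);
        }
        return 0;
    }

With this layout the memory needed is roughly nruns * 256 kB, independent
of how large the individual runs are, which is the single-pass claim
being discussed above.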

Andreas
