Re: Merge algorithms for large numbers of "tapes"

From: "Zeugswetter Andreas DCP SD" <ZeugswetterA(at)spardat(dot)at>
To: "Dann Corbit" <DCorbit(at)connx(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Luke Lonergan" <llonergan(at)greenplum(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Merge algorithms for large numbers of "tapes"
Date: 2006-03-09 09:56:26
Message-ID: E1539E0ED7043848906A8FF995BDA579D991B7@m0143.s-mxs.net
Lists: pgsql-hackers


> > This amounts to an assumption that you have infinite work_mem, in
> > which case you hardly need an external sort at all. If your
> > work_mem is in fact finite, then at some point you need more than
> > two passes. I'm not really interested in ripping out support for
> > sort operations that are much larger than work_mem.
>
> No it does not. I have explained this before. You can have
> one million files and merge them all into a final output with
> a single pass. It does not matter how big they are or how
> much memory you have.

Huh? But if you have too many files, your disk access basically becomes
random access (since you have thousands of files per spindle).
From tests on AIX I have pretty much concluded that if you read 256k
blocks at a time, random access does not really hurt that much any more.
So, if you can hold 256k per file in memory, that should be sufficient.
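
To illustrate the point, here is a minimal sketch (not PostgreSQL code,
and the file layout, value type, and 256k figure are just assumptions):
a single-pass N-way merge of pre-sorted runs, one file per run, where
each input gets its own 256k stdio buffer so reads arrive in large
sequential chunks even when many runs share a spindle.

    /*
     * Hypothetical sketch: single-pass N-way merge of sorted runs of
     * int64 values, one file per run given on the command line.
     * Each run gets a large private read buffer (BUF_PER_RUN), so the
     * "random" access across runs still happens in 256 kB chunks.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    #define BUF_PER_RUN (256 * 1024)   /* 256 kB per run (assumption) */

    typedef struct
    {
        FILE    *fp;
        char    *buf;
        int64_t  head;       /* smallest unconsumed value of this run */
        int      exhausted;
    } Run;

    /* Load the next value of a run; mark the run exhausted at EOF. */
    static void run_advance(Run *r)
    {
        if (fread(&r->head, sizeof(r->head), 1, r->fp) != 1)
            r->exhausted = 1;
    }

    int main(int argc, char **argv)
    {
        int   nruns = argc - 1;
        Run  *runs = calloc(nruns, sizeof(Run));

        for (int i = 0; i < nruns; i++)
        {
            runs[i].fp = fopen(argv[i + 1], "rb");
            if (!runs[i].fp) { perror(argv[i + 1]); return 1; }
            runs[i].buf = malloc(BUF_PER_RUN);
            /* Large per-file buffer: must be set before any read. */
            setvbuf(runs[i].fp, runs[i].buf, _IOFBF, BUF_PER_RUN);
            run_advance(&runs[i]);
        }

        /*
         * Single merge pass: repeatedly emit the smallest head among
         * all runs.  A linear scan is O(N) per output value; a real
         * implementation would use a heap, but the I/O pattern (one
         * 256 kB refill per run, round-robin-ish) is the same.
         */
        for (;;)
        {
            int best = -1;

            for (int i = 0; i < nruns; i++)
                if (!runs[i].exhausted &&
                    (best < 0 || runs[i].head < runs[best].head))
                    best = i;
            if (best < 0)
                break;          /* all runs exhausted: merge complete */
            fwrite(&runs[best].head, sizeof(int64_t), 1, stdout);
            run_advance(&runs[best]);
        }
        return 0;
    }

With this layout the memory needed is roughly nruns * 256 kB, independent
of how large the individual runs are, which is the single-pass claim
being discussed above.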

Andreas
