From: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
---|---|
To: | Zeugswetter Andreas DCP SD <ZeugswetterA(at)spardat(dot)at> |
Cc: | Dann Corbit <DCorbit(at)connx(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Luke Lonergan <llonergan(at)greenplum(dot)com>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Merge algorithms for large numbers of "tapes" |
Date: | 2006-03-10 09:44:33 |
Message-ID: | 20060310094432.GA25494@svana.org |
Lists: | pgsql-hackers |
On Fri, Mar 10, 2006 at 09:57:28AM +0100, Zeugswetter Andreas DCP SD wrote:
>
> > Two pass will create the count of subfiles proportional to:
> > Subfile_count = original_stream_size/sort_memory_buffer_size
> >
> > The merge pass requires (sizeof record * subfile_count) memory.
>
> That is true from an algorithmic perspective. But to make the
> merge efficient you would need to have enough RAM to cache a reasonably
> large block per subfile_count. Else you would need to reread the same
> page/block from one subfile multiple times.
> (If you had one disk per subfile you could also rely on the disk's own
> cache, but I think we can rule that out)
But what about the OS cache? Linux will read up to the next 128KB of a
file if it's contiguous on disk, which is likely with modern
filesystems. It's likely to be much "fairer" than any way we can come
up with to share memory.
Question is, do we want our algorithm to rely on that caching?
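To put rough numbers on the formulas quoted above, here is a
back-of-envelope sketch in Python. The function name, the example sizes
and the choice of 128KB as a per-subfile read-ahead block are
assumptions for illustration only, not anything taken from the
PostgreSQL source.

```python
def merge_memory_estimate(stream_bytes, sort_mem_bytes, record_bytes,
                          block_bytes=128 * 1024):
    """Estimate memory needed for the merge pass of a two-pass sort.

    All parameter names are hypothetical; block_bytes defaults to the
    128KB read-ahead figure mentioned above.
    """
    # Subfile_count = original_stream_size / sort_memory_buffer_size,
    # rounded up because a partial final run still needs its own subfile.
    subfile_count = -(-stream_bytes // sort_mem_bytes)

    # Algorithmic minimum: one record held per subfile.
    minimal_bytes = record_bytes * subfile_count

    # More realistic: one cached block per subfile, so the same page is
    # not reread from disk several times.
    buffered_bytes = block_bytes * subfile_count

    return subfile_count, minimal_bytes, buffered_bytes


# Example: 100GB of input, 1GB of sort memory, 100-byte records.
GB = 1024 ** 3
count, minimal, buffered = merge_memory_estimate(100 * GB, 1 * GB, 100)
print(count, minimal, buffered)
```

With those assumed sizes the merge has 100 subfiles, so even a 128KB
block per subfile comes to about 12.5MB, which is small next to the
sort memory itself. That is the same amount of buffering whether we
allocate it ourselves or leave it to the OS read-ahead.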
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.