Re: Linux max on shared buffers?

From: "Glen Parker" <glenebob(at)nwlink(dot)com>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'Jan Wieck'" <JanWieck(at)Yahoo(dot)com>, "'Curt Sampson'" <cjs(at)cynic(dot)net>, "'GB Clark'" <postgres(at)vsservices(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Linux max on shared buffers?
Date: 2002-07-20 20:17:21
Message-ID: 007501c2302a$765951c0$0b01a8c0@johnpark.net
Lists: pgsql-general

Here's a ridiculous hack of Martijn's program that runs on Windows
(Win2K in my case), using the sorta-mmap-like calls in Windows.

Several runs on my box produced errors at offsets 0x048D and 0x159E.

Glen Parker.

>
> Whoopsie. Here's the program :)
>
> On Sun, Jul 21, 2002 at 12:19:43AM +1000, Martijn van Oosterhout wrote:
> > On Sat, Jul 20, 2002 at 09:09:59AM -0400, Tom Lane wrote:
> > > Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > > > Well, you would have to deal with the fact that writing changes to a
> > > > mmap() is allowed, but you have no guarantee when it will be finally
> > > > written. Given WAL I would suggest using mmap() for reading only and
> > > > using write() to update the file.
> > >
> > > This is surely NOT workable; every mmap man page I've looked at is very
> > > clear that you cannot expect predictable behavior if you use both
> > > filesystem and mmap access to the same file. For instance, HP says
> > >
> > >     It is also unspecified whether write references to a memory region
> > >     mapped with MAP_SHARED are visible to processes reading the file and
> > >     whether writes to a file are visible to processes that have mapped
> > >     the modified portion of that file, except for the effect of msync().
> > >
> > > So unless you want to msync after every write I do not think this can fly.
> >
> > Well, of course. The entire speed improvement is based on the fact that
> > mmap() is giving you a window into the system disk cache. If the OS isn't
> > built that way then it's not going to work. It does work on Linux and is
> > fairly easy to test for. I've even attached a simple program to try it out.
> >
> > Of course it's not complete. You'd need to try multiple processes to see
> > what happens, but I'd be interested how diverse the mmap() implementations
> > are.
> > > > If in that process the kernel needed
> > > > to throw out another page, who cares?
> > >
> > > We do, because we have to control write ordering.
> >
> > Which is why you use write() to control that.
> >
> > > > It is different. I believe you would still need some form of shared
> > > > memory to co-ordinate write()s.
> > >
> > > The whole idea becomes less workable the more we look at it.
> >
> > I guess this is one of those cases where working code would be needed to
> > convince anybody. In the hypothetical case someone had time, the
> > appropriate place to add this would be src/backend/storage/buffer, since
> > all buffer loads go through there, right?
> >
> > The only other question is whether there is any way to know when a buffer
> > will be modified. I get the impression sometimes bits are twiddled without
> > the buffer being marked dirty.
>
> --
> Martijn van Oosterhout <kleptog(at)svana(dot)org>
> http://svana.org/kleptog/
> > There are 10 kinds of people in the world, those that can do binary
> > arithmetic and those that can't.
>

Attachment: test_mmap.c (Content-Type: application/octet-stream, Size: 1.9 KB)
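
The attachment body is not reproduced in this archive view. For readers who
want to try the kind of check discussed above, here is a minimal sketch in C.
It is not Martijn's program and not the Windows port: it assumes a POSIX
system, and the function name check_mmap_coherency, the test pattern, and the
byte-at-a-time loop are all made up for illustration. It writes each byte with
pwrite() and looks for it through a MAP_SHARED mapping, then stores through
the mapping and reads it back with pread(), counting mismatches without ever
calling msync().

```c
#define _XOPEN_SOURCE 700   /* for pread(), pwrite(), ftruncate() */

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread's attachment).
 * Returns the number of mismatching bytes observed, or -1 if any
 * system call fails. The scratch file at `path` is removed afterwards.
 */
long check_mmap_coherency(const char *path, size_t len)
{
    long mismatches = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Size the file first so the whole mapping is backed by file data. */
    if (ftruncate(fd, (off_t) len) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* volatile: force a real load/store for every access to the mapping. */
    volatile unsigned char *map =
        mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        unlink(path);
        return -1;
    }

    for (size_t off = 0; off < len; off++) {
        unsigned char expect = (unsigned char) (off * 31u + 7u);
        unsigned char got;

        /* Direction 1: write through the descriptor, read via the mapping. */
        if (pwrite(fd, &expect, 1, (off_t) off) != 1) {
            mismatches = -1;
            break;
        }
        if (map[off] != expect)
            mismatches++;

        /* Direction 2: store via the mapping, read through the descriptor. */
        expect = (unsigned char) ~expect;
        map[off] = expect;
        if (pread(fd, &got, 1, (off_t) off) != 1) {
            mismatches = -1;
            break;
        }
        if (got != expect)
            mismatches++;
    }

    munmap((void *) map, len);
    close(fd);
    unlink(path);
    return mismatches;
}
```

Calling check_mmap_coherency("/tmp/mmap_test.bin", 8192) from main() and
printing the result is enough to try it. A return of 0 only says that this one
process saw coherent data on this one filesystem; as noted in the thread,
multiple processes would need testing too, and the man page text Tom quotes
leaves the behavior unspecified without msync().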