Re: Linux max on shared buffers?

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jan Wieck <JanWieck(at)Yahoo(dot)com>, Curt Sampson <cjs(at)cynic(dot)net>, GB Clark <postgres(at)vsservices(dot)com>, glenebob(at)nwlink(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: Linux max on shared buffers?
Date: 2002-07-20 14:21:51
Message-ID: 20020721002151.B17677@svana.org
Lists: pgsql-general

Whoopsie. Here's the program :)

On Sun, Jul 21, 2002 at 12:19:43AM +1000, Martijn van Oosterhout wrote:
> On Sat, Jul 20, 2002 at 09:09:59AM -0400, Tom Lane wrote:
> > Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > > Well, you would have to deal with the fact that writing changes to a mmap()
> > > is allowed, but you have no guarantee when it will be finally written. Given
> > > WAL I would suggest using mmap() for reading only and using write() to
> > > update the file.
> >
> > This is surely NOT workable; every mmap man page I've looked at is very
> > clear that you cannot expect predictable behavior if you use both
> > filesystem and mmap access to the same file. For instance, HP says
> >
> > It is also unspecified whether write references to a memory region
> > mapped with MAP_SHARED are visible to processes reading the file and
> > whether writes to a file are visible to processes that have mapped the
> > modified portion of that file, except for the effect of msync().
> >
> > So unless you want to msync after every write I do not think this can fly.
>
> Well, of course. The entire speed improvement is based on the fact that mmap()
> is giving you a window into the system disk cache. If the OS isn't built
> that way then it's not going to work. It does work on Linux and is fairly
> easy to test for. I've even attached a simple program to try it out.
>
> Of course it's not complete. You'd need to try multiple processes to see what
> happens, but I'd be interested how diverse the mmap() implementations are.
>
> > > If in that process the kernel needed
> > > to throw out another page, who cares?
> >
> > We do, because we have to control write ordering.
>
> Which is why you use write() to control that.
>
> > > It is different. I believe you would still need some form of shared memory
> > > to co-ordinate write()s.
> >
> > The whole idea becomes less workable the more we look at it.
>
> I guess this is one of those cases where working code would be needed to
> convince anybody. In the hypothetical case someone had time, the appropriate
> place to add this would be src/backend/storage/buffer, since all buffer
> loads go through there, right?
>
> The only other question is whether there is any way to know when a buffer
> will be modified. I get the impression sometimes bits are twiddled without
> the buffer being marked dirty.

--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Attachment Content-Type Size
test_mmap.c text/x-csrc 1.5 KB
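
The attached test_mmap.c is not reproduced in this archive. What follows is a minimal sketch of the kind of single-process check described in the quoted text, not the original attachment. It maps a scratch file read-only with MAP_SHARED, updates the file with pwrite(), and then looks through the mapping without calling msync(). The function name mmap_sees_write and the /tmp scratch path are assumptions made for illustration; as the message notes, a real test would also need to involve multiple processes.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not from the original attachment).
 * Returns 1 if a pwrite() to a file is visible through an existing
 * read-only MAP_SHARED mapping without msync(), 0 if it is not,
 * and -1 if any setup step fails. */
int mmap_sees_write(void)
{
    char path[] = "/tmp/mmap_checkXXXXXX";  /* assumed scratch location */
    volatile const char *map = MAP_FAILED;
    char *buf = NULL;
    int result = -1;

    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);  /* the file goes away once fd is closed */

    buf = malloc(len);
    if (buf == NULL)
        goto out;

    /* Fill one page of the file with 'A'. */
    memset(buf, 'A', len);
    if (write(fd, buf, len) != (ssize_t)len)
        goto out;

    /* Map the page for reading only; all updates go through the fd. */
    map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    /* Touch the mapping so the page is faulted in before the update. */
    if (map[0] != 'A')
        goto out;

    /* Overwrite the same page through the file descriptor. */
    memset(buf, 'B', len);
    if (pwrite(fd, buf, len, 0) != (ssize_t)len)
        goto out;

    /* No msync(): does the mapping already show the new bytes? */
    result = (map[0] == 'B' && map[len - 1] == 'B') ? 1 : 0;

out:
    if (map != MAP_FAILED)
        munmap((void *)map, len);
    free(buf);
    close(fd);
    return result;
}
```

Calling mmap_sees_write() from a small main() and printing the result is enough to try it on a given system. A result of 0 would indicate a platform where file writes and mappings are not kept coherent, which is the case the quoted HP man page warns about.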
