Re: Linux max on shared buffers?

From: Curt Sampson <cjs(at)cynic(dot)net>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, GB Clark <postgres(at)vsservices(dot)com>, <glenebob(at)nwlink(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Linux max on shared buffers?
Date: 2002-07-20 09:36:52
Message-ID: Pine.NEB.4.44.0207201831310.553-100000@angelic.cynic.net
Lists: pgsql-general

On Sat, 20 Jul 2002, Martijn van Oosterhout wrote:

> Well, you would have to deal with the fact that writing changes to a mmap()
> is allowed, but you have no guarantee when it will be finally written. Given
> WAL I would suggest using mmap() for reading only and using write() to
> update the file.

You can always do an msync to force a block out. But I don't think
you'd ever bother; the transaction log is the only thing for which
you need to force writes, and that's probably better done with
regular file I/O (read/write) anyway.
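
For reference, forcing a mapped block out looks roughly like the sketch
below. The function name msync_one_page and the one-page file layout are
made up for illustration; only open, ftruncate, mmap, msync and munmap are
real calls.

```c
#define _DEFAULT_SOURCE /* expose the usual POSIX/BSD declarations on glibc */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: dirty one page of a shared file mapping and force it
 * to disk with msync(). Returns 0 on success, -1 on any failure. */
int msync_one_page(const char *path, const char *text)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || strlen(text) >= (size_t)page)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, page) != 0) {          /* make the file one page long */
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    strcpy(map, text);                          /* dirty the page in memory */
    int rc = msync(map, (size_t)page, MS_SYNC); /* block until it is written */

    munmap(map, (size_t)page);
    close(fd);
    return rc;
}
```

MS_SYNC waits for the write to complete; MS_ASYNC would only schedule it.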

The real problem is that you can't make sure a block is *not* written
until you want it to be: the kernel may flush a dirty mapped page at any
time. That is why you need to write the log entry before you can update
the block.
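
The ordering could be sketched as follows. This is only an illustration of
"log first, block second", not PostgreSQL's actual WAL code; the function
logged_update and its arguments are hypothetical, and the "log record" here
is just the raw bytes of the change.

```c
#define _DEFAULT_SOURCE /* expose the usual POSIX/BSD declarations on glibc */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical log-then-update step. `block` is assumed to point into a
 * MAP_SHARED mapping of the data file; `log_fd` is an ordinary file
 * descriptor for the log. Returns 0 on success, -1 on failure. */
int logged_update(int log_fd, char *block, size_t off,
                  const char *data, size_t len)
{
    /* 1. Append the change to the log with regular file I/O and force it
     *    to stable storage. */
    if (write(log_fd, data, len) != (ssize_t)len)
        return -1;
    if (fsync(log_fd) != 0)
        return -1;

    /* 2. Only now touch the mapped block. From this point on the kernel is
     *    free to write the dirty page whenever it likes, which is safe
     *    because the log entry is already on disk. */
    memcpy(block + off, data, len);
    return 0;
}
```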

> If in that process the kernel needed to throw out another page, who
> cares? If another backend needs that page it'll get read back in.

Right. And all of the standard kernel strategies for deciding which
blocks to throw out will be in place, so commonly hit pages will be
thrown out after more rarely hit ones.

You also have the advantage that if you're doing, say, a sequential
scan, you can madvise the pages MADV_WILLNEED when you first map them,
and madvise each one MADV_DONTNEED after you're done with it, and avoid
blowing out your entire buffer cache and replacing it with data you know
you're not likely to read again any time soon.
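
Something along these lines, as a rough sketch. The function scan_file and
the byte-sum "work" are invented for the example; madvise is advisory, so
its return value is deliberately ignored, and the exact effect of each hint
varies between kernels.

```c
#define _DEFAULT_SOURCE /* madvise() and MADV_* need this on glibc */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical sequential scan: sum every byte of a file through a
 * read-only mapping, hinting the kernel before and after each page.
 * Returns the sum, or -1 on failure (including an empty file). */
long scan_file(const char *path)
{
    long pg = sysconf(_SC_PAGESIZE);
    if (pg <= 0)
        return -1;
    size_t page = (size_t)pg;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    unsigned char *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    madvise(map, len, MADV_WILLNEED);          /* ask for read-ahead */

    long sum = 0;
    for (size_t off = 0; off < len; off += page) {
        size_t n = (len - off < page) ? len - off : page;
        for (size_t i = 0; i < n; i++)
            sum += map[off + i];
        madvise(map + off, n, MADV_DONTNEED);  /* done with this page */
    }

    munmap(map, len);
    return sum;
}
```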

> One case where this would be useful would be an i386 machine with 64GB of
> memory. Then you are in effect simply mapping different parts of the cache
> at different times. No blocks are copied *ever*.

Right.

> It is different. I believe you would still need some form of shared memory
> to co-ordinate write()s.

Sure. For that, you can just mmap an anonymous memory area and share it
amongst all your processes, or use sysv shared memory.
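
A minimal sketch of the anonymous-mmap variant, assuming the processes are
created by fork so that they inherit the mapping. The function name
share_counter_with_child is made up; MAP_ANONYMOUS is spelled MAP_ANON on
some systems.

```c
#define _DEFAULT_SOURCE /* MAP_ANONYMOUS needs this on glibc */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: a child process stores `value` in a shared anonymous
 * mapping and the parent reads it back. Returns the value the parent sees,
 * or -1 on failure. */
int share_counter_with_child(int value)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = value;   /* child writes into the shared page */
        _exit(0);
    }

    int status;
    int seen = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        seen = *shared;    /* parent observes the child's write */

    munmap(shared, sizeof *shared);
    return seen;
}
```

With MAP_PRIVATE instead of MAP_SHARED the parent would still see 0, since
the child's write would go to its own copy of the page.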

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC
