
Re: Linux max on shared buffers?

From: Curt Sampson <cjs(at)cynic(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>,Jan Wieck <JanWieck(at)Yahoo(dot)com>, GB Clark <postgres(at)vsservices(dot)com>,Martijn van Oosterhout <kleptog(at)svana(dot)org>, <glenebob(at)nwlink(dot)com>,<pgsql-general(at)postgresql(dot)org>
Subject: Re: Linux max on shared buffers?
Date: 2002-07-28 17:31:49
Lists: pgsql-general

On Sun, 28 Jul 2002, Tom Lane wrote:

> Hm.  What's the particular syscall being used for reference here?

It's a one-byte write() to /dev/null.

> And how does it compare to the sorts of activities we'd actually be
> concerned about (open, close, mmap)?

Well, I don't see that open and close are relevant, since that part of
the file handling would be exactly the same if you continued to use the
same file handle caching code we use now.

lmbench does have a test for mmap latency which tells you how long it
takes, on average, to mmap the first given number of bytes of a file.
Unfortunately, it's not giving me output for anything smaller than about
half a megabyte (perhaps because it's too fast to measure accurately?),
but here are the times, in microseconds, for sizes from that to 1 GB on
my 1533 MHz Athlon:

      size (MB)    time (microseconds)
       0.524288        7.688
       1.048576       15
       2.097152       22
       4.194304       40
      16.777216      169
      33.554432      358
      67.108864      740
     134.217728     2245
     268.435456     5080
     536.870912     9971
     805.306368    14927
    1073.741824    19898

It seems roughly linear, so I'm guessing that an 8k mmap would be
around 0.1-0.2 microseconds, or ten times the cost of a syscall.
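
As a rough check on that guess, here is a minimal sketch of the extrapolation in Python. It assumes the sizes in the table are in units of 10^6 bytes (so 0.524288 is 512 KiB) and that the cost scales in proportion to size with no fixed overhead, which is exactly the assumption a better benchmark would need to test. The names MEASUREMENTS, EIGHT_K_MB and extrapolate are made up for illustration.

```python
# (size in MB, time in microseconds), copied from the table above.
MEASUREMENTS = [
    (0.524288, 7.688),
    (1.048576, 15),
    (2.097152, 22),
    (4.194304, 40),
    (16.777216, 169),
    (33.554432, 358),
    (67.108864, 740),
    (134.217728, 2245),
    (268.435456, 5080),
    (536.870912, 9971),
    (805.306368, 14927),
    (1073.741824, 19898),
]

# Assumption: 1 MB = 10**6 bytes, so an 8k block is 0.008192 MB.
EIGHT_K_MB = 8192 / 1_000_000


def extrapolate(rows, size_mb):
    """Scale the cheapest and dearest per-MB rates down to size_mb.

    Assumes cost is proportional to size (no fixed per-call overhead).
    """
    rates = [time_us / size for size, time_us in rows]
    return min(rates) * size_mb, max(rates) * size_mb


low, high = extrapolate(MEASUREMENTS, EIGHT_K_MB)
print(f"8k mmap estimate: {low:.3f} to {high:.3f} microseconds")
```

With the figures above this gives a range of roughly 0.08 to 0.16 microseconds, depending on which row's rate is used. Any fixed per-call cost would push the real number higher.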

Really, I need to write a better benchmark for this. I'm a bit busy
this week, but I'll try to find time to do that.

Keep in mind, though, that mmap is generally quite heavily optimized,
because it's so heavily used. Almost all programs in the system are
dynamically linked (on some systems, such as Linux and Solaris, they
essentially all are), and thus they all use mmap to map in their shared
libraries.

> I'm not convinced that futzing with a process' memory mapping tables
> is free, however ... especially not if you're creating a large number
> of separate small mappings.

It's not free, no. But then again, memory copies are really, really
expensive.

In NetBSD, at least, you probably don't want to keep a huge number of
mappings around because they're stored as a linked list (ordered by
address) that's searched linearly when you need to add or delete a
mapping (though there's a hint for the most recent entry).

> If mmap provokes a TLB flush for your process, it's going to be
> expensive (just how expensive will be hard to measure, too, since most
> of the cycles will be expended after returning from mmap).

True enough, though blowing out your cache with copies is also not
cheap. But measuring this should not be hard; writing a little
program to do a bunch of copies versus a bunch of mmaps of random blocks
from a file should only be a couple of hours' work. I'll work on this in my
spare time and report the results.
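
For what it's worth, the shape of such a test might look something like the sketch below. This is only an outline in Python, not the benchmark itself: interpreter overhead will swamp costs on the order of a microsecond, so a real measurement would want to be in C. The 8k block size comes from the discussion above; the function names, file size and iteration counts are arbitrary choices for illustration. Each mapping is touched once so that the page fault Tom mentions is included in the timing.

```python
import mmap
import os
import random
import tempfile
import time

BLOCK = 8192  # block size discussed in the thread
# mmap offsets must be multiples of the allocation granularity.
STEP = max(BLOCK, mmap.ALLOCATIONGRANULARITY)


def make_test_file(n_blocks):
    """Create a temporary file of n_blocks * STEP bytes; return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(n_blocks * STEP))
    return path


def time_copies(fd, offsets):
    """Copy each block into a new buffer with pread(); return seconds."""
    start = time.perf_counter()
    for off in offsets:
        data = os.pread(fd, BLOCK, off)
        _ = data[0]  # touch the data, as the mmap case does
    return time.perf_counter() - start


def time_mmaps(fd, offsets):
    """Map each block, touch it, and unmap it; return seconds."""
    start = time.perf_counter()
    for off in offsets:
        m = mmap.mmap(fd, BLOCK, access=mmap.ACCESS_READ, offset=off)
        _ = m[0]  # touch the page so the fault is counted
        m.close()
    return time.perf_counter() - start


def run(n_blocks=256, n_ops=2000, seed=0):
    """Time n_ops random block reads both ways; return (copy_s, mmap_s)."""
    rng = random.Random(seed)
    path = make_test_file(n_blocks)
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            offsets = [rng.randrange(n_blocks) * STEP for _ in range(n_ops)]
            return time_copies(fd, offsets), time_mmaps(fd, offsets)
        finally:
            os.close(fd)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    copy_s, mmap_s = run()
    print(f"copies: {copy_s:.6f} s, mmaps: {mmap_s:.6f} s")
```

Note that the test file will sit entirely in the buffer cache, so this compares copy cost against mapping cost rather than disk I/O.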

Curt Sampson  <cjs(at)cynic(dot)net>   +81 90 7737 2974
    Don't you know, in this new Dark Age, we're all light.  --XTC
