Re: Linux max on shared buffers?

From: Curt Sampson <cjs(at)cynic(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Jan Wieck <JanWieck(at)Yahoo(dot)com>, GB Clark <postgres(at)vsservices(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, <glenebob(at)nwlink(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Linux max on shared buffers?
Date: 2002-07-28 17:31:49
Message-ID: Pine.NEB.4.44.0207290141350.28234-100000@angelic.cynic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, 28 Jul 2002, Tom Lane wrote:

> Hm. What's the particular syscall being used for reference here?

It's a one-byte write() to /dev/null.

> And how does it compare to the sorts of activities we'd actually be
> concerned about (open, close, mmap)?

Well, I don't see that open and close are relevant, since that part of
the file handling would be exactly the same if you continued to use the
same file handle caching code we use now.

lmbench does have a test for mmap latency which tells you how long it
takes, on average, to mmap the first given number of bytes of a file.
Unfortunately, it's not giving me output for anything smaller than about
half a megabyte (perhaps because it's too fast to measure accurately?),
but here are the times, in microseconds, for sizes from that to 1 GB on
my 1533 MHz Athlon:

0.524288 7.688
1.048576 15
2.097152 22
4.194304 40
16.777216 169
33.554432 358
67.108864 740
134.217728 2245
268.435456 5080
536.870912 9971
805.306368 14927
1073.741824 19898

It seems roughly linear, so I'm guessing that an 8k mmap would be
around 0.1-0.2 microseconds, or ten times the cost of a syscall.

Really, I need to write a better benchmark for this. I'm a bit busy
this week, but I'll try to find time to do that.

Keep in mind, though, that mmap is generally quite heavily optimized,
because it's so heavily used. Almost all programs in the system are
dynamically linked (on some systems, such as Linux and Solaris, they
essentially all are), and thus they all use mmap to map in their
libraries.

> I'm not convinced that futzing with a process' memory mapping tables
> is free, however ... especially not if you're creating a large number
> of separate small mappings.

It's not free, no. But then again, memory copies are really, really
expensive.

In NetBSD, at least, you probably don't want to keep a huge number of
mappings around becuase they're stored as a linked list (ordered by
address) that's searched linearly when you need to add or delete a
mapping (though there's a hint for the most recent entry).

> If mmap provokes a TLB flush for your process, it's going to be
> expensive (just how expensive will be hard to measure, too, since most
> of the cycles will be expended after returning from mmap).

True enough, though blowing out your cache with copies is also not
cheap. But measuring this should not be hard; writing a little
program to do a bunch of copies versus a bunch of mmaps of random blocks
from a file should only be a couple of hours work. I'll work on this in my
spare time and report the results.

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Alex Cheung Tin Ka 2002-07-29 01:41:21 questions in query on 7.1 and 7.2
Previous Message stefan 2002-07-28 16:52:21 Re: [GENERAL] The best book