Re: Re: [PATCHES] A patch for xlog.c

From: ncm(at)zembu(dot)com (Nathan Myers)
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Re: [PATCHES] A patch for xlog.c
Date: 2001-02-26 08:21:25
Message-ID: 20010226002125.A2430@store.zembu.com
Lists: pgsql-hackers

On Sun, Feb 25, 2001 at 11:28:46PM -0500, Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > It allows no backing store on disk.

I.e. it allows you to map memory without an associated inode; the memory
may still be swapped. Of course, there is no problem with mapping an
inode too, so that unrelated processes can join in. Solaris has a flag
to pin the shared pages in RAM so they can't be swapped out.

> > It is the BSD solution to SysV
> > shared memory. Here are all the BSDi flags:
>
> > MAP_ANON Map anonymous memory not associated with any specific
> > file. The file descriptor used for creating MAP_ANON
> > must be -1. The offset parameter is ignored.
>
> Hmm. Now that I read down to the "nonstandard extensions" part of the
> HPUX man page for mmap(), I find
>
> If MAP_ANONYMOUS is set in flags:
>
> o A new memory region is created and initialized to all zeros.
> This memory region can be shared only with descendants of
> the current process.

MAP_ANONYMOUS is supported on Linux and BSD, but not on Solaris 7. The
flag isn't necessary, though; you can just map /dev/zero on SysV systems
that don't have MAP_ANON.
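
To make that concrete, here is a minimal sketch, not taken from any
PostgreSQL source. The names map_shared_zeroed() and share_with_child()
are made up for illustration. It assumes a POSIX system, and it assumes
that a MAP_SHARED mapping of /dev/zero behaves like anonymous shared
memory on systems that lack MAP_ANON.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Some systems spell the flag MAP_ANON, others MAP_ANONYMOUS. */
#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper: return len bytes of zero-filled memory that stays
 * shared across fork(), or NULL on failure. */
void *map_shared_zeroed(size_t len)
{
    void *p;
#ifdef MAP_ANONYMOUS
    /* No inode behind the mapping; the descriptor must be -1. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
#else
    /* Fallback for systems without the flag: map /dev/zero instead. */
    int fd = open("/dev/zero", O_RDWR);

    if (fd < 0)
        return NULL;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
#endif
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical demo: fork a child that stores value in the shared region.
 * Returns what the parent reads back after the child exits, or -1 on
 * any failure. */
int share_with_child(int value)
{
    int *slot = map_shared_zeroed(sizeof *slot);
    int seen = -1;
    pid_t pid;

    if (slot == NULL)
        return -1;
    pid = fork();
    if (pid == 0) {
        *slot = value;          /* child writes into the shared page */
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        seen = *slot;           /* parent sees the child's store */
    munmap(slot, sizeof *slot);
    return seen;
}
```

Either branch gives a region that only descendants of the mapping
process can reach, which is the restriction the HPUX man page describes.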

> While I've said before that I don't think it's really necessary for
> processes that aren't children of the postmaster to access the shared
> memory, I'm not sure that I want to go over to a mechanism that makes it
> *impossible* for that to be done. Especially not if the only motivation
> is to avoid having to configure the kernel's shared memory settings.

There are enormous advantages to avoiding the need to configure kernel
settings. It makes PG a better citizen. PG is much easier to drop in
and use if you don't need attention from the IT department.

But I don't know of any reason to avoid mapping an actual inode,
so using mmap doesn't necessarily mean giving up sharing among
unrelated processes.
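
A rough sketch of the inode-backed variant, again not from any
PostgreSQL source: attach_shared_file() and the idea of a caller-chosen
path are assumptions made for illustration. Any process that can open
the file can attach the same way, related or not.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map len bytes of the file at path with MAP_SHARED,
 * creating the file if needed and setting its size to len. Returns NULL
 * on failure. All callers are assumed to pass the same len, since
 * ftruncate() resizes the file on every call. */
void *attach_shared_file(const char *path, size_t len)
{
    void *p;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t) len) != 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);      /* the mapping stays valid after the descriptor closes */
    return p == MAP_FAILED ? NULL : p;
}
```

Two calls with the same path return two views of the same pages, so a
store through one is visible through the other. Locking and cleanup of
the file are left out of the sketch.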

> Besides, what makes you think there's not a limit on the size of shmem
> allocatable via mmap()?

I've never seen any mmap limit documented. Since mmap() is how
everybody implements shared libraries, such a limit would be equivalent
to a limit on the number and size of shared libraries in use. mmap() with
MAP_ANONYMOUS (or its SysV /dev/zero equivalent) is a common, modern
way to get raw storage for malloc(), so such a limit would be a limit
on malloc() too.

The mmap architecture comes to us from the Mach microkernel memory
manager, backported into BSD and then copied widely. Since it was
the fundamental mechanism for all memory operations in Mach, arbitrary
limits would make no sense. That it worked so well is the reason it
was copied everywhere else, so adding arbitrary limits while copying
it would be silly. I don't think we'll see any systems like that.

Nathan Myers
ncm(at)zembu(dot)com
