Re: Pre-allocation of shared memory ...

From: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Kurt Roeckx" <Q(at)ping(dot)be>, "Matthew Kirkwood" <matthew(at)hairy(dot)beasts(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pre-allocation of shared memory ...
Date: 2003-06-14 21:39:56
Message-ID: 001201c332bd$7fee5780$6401a8c0@DUNSLANE
Lists: pgsql-hackers

I know he does - *but* I think it has probably been wiped out by accident
somewhere along the line (like when they went to 2.4.20?)

Here's what's in the RH sources - after you look, tell me whether I am
looking in the wrong place. (Or did RH get cute and decide to do this only
for the AS product?)

first, RH7.3/kernel 2.4.18-3 (patch present):

----------------
int vm_enough_memory(long pages, int charge)
{
/* Stupid algorithm to decide if we have enough memory: while
* simple, it hopefully works in most obvious cases.. Easy to
* fool it, but this should catch most mistakes.
*
* 23/11/98 NJC: Somewhat less stupid version of algorithm,
* which tries to do "TheRightThing". Instead of using half of
* (buffers+cache), use the minimum values. Allow an extra 2%
* of num_physpages for safety margin.
*
* 2002/02/26 Alan Cox: Added two new modes that do real accounting
*/
unsigned long free, allowed;
struct sysinfo i;

if(charge)
atomic_add(pages, &vm_committed_space);

/* Sometimes we want to use more memory than we have. */
if (sysctl_overcommit_memory == 1)
return 1;
if (sysctl_overcommit_memory == 0)
{
/* The page cache contains buffer pages these days.. */
free = atomic_read(&page_cache_size);
free += nr_free_pages();
free += nr_swap_pages;

/*
* This double-counts: the nrpages are both in the page-cache
* and in the swapper space. At the same time, this compensates
* for the swap-space over-allocation (ie "nr_swap_pages" being
* too small.
*/
free += swapper_space.nrpages;

/*
* The code below doesn't account for free space in the inode
* and dentry slab cache, slab cache fragmentation, inodes and
* dentries which will become freeable under VM load, etc.
* Lets just hope all these (complex) factors balance out...
*/
free += (dentry_stat.nr_unused * sizeof(struct dentry)) >> PAGE_SHIFT;
free += (inodes_stat.nr_unused * sizeof(struct inode)) >> PAGE_SHIFT;

if(free > pages)
return 1;
atomic_sub(pages, &vm_committed_space);
return 0;
}
allowed = total_swap_pages;

if(sysctl_overcommit_memory == 2)
{
/* FIXME - need to add arch hooks to get the bits we need
without the higher overhead crap */
si_meminfo(&i);
allowed += i.totalram >> 1;
}
if(atomic_read(&vm_committed_space) < allowed)
return 1;
if(charge)
atomic_sub(pages, &vm_committed_space);
return 0;

}
---------
and here's what's in RH9/2.4.20-18 (patch absent):
--------------
int vm_enough_memory(long pages)
{
/* Stupid algorithm to decide if we have enough memory: while
* simple, it hopefully works in most obvious cases.. Easy to
* fool it, but this should catch most mistakes.
*/
/* 23/11/98 NJC: Somewhat less stupid version of algorithm,
* which tries to do "TheRightThing". Instead of using half of
* (buffers+cache), use the minimum values. Allow an extra 2%
* of num_physpages for safety margin.
*/

unsigned long free;

/* Sometimes we want to use more memory than we have. */
if (sysctl_overcommit_memory)
return 1;

/* The page cache contains buffer pages these days.. */
free = atomic_read(&page_cache_size);
free += nr_free_pages();
free += nr_swap_pages;

/*
* This double-counts: the nrpages are both in the page-cache
* and in the swapper space. At the same time, this compensates
* for the swap-space over-allocation (ie "nr_swap_pages" being
* too small.
*/
free += swapper_space.nrpages;

/*
* The code below doesn't account for free space in the inode
* and dentry slab cache, slab cache fragmentation, inodes and
* dentries which will become freeable under VM load, etc.
* Lets just hope all these (complex) factors balance out...
*/
free += (dentry_stat.nr_unused * sizeof(struct dentry)) >> PAGE_SHIFT;
free += (inodes_stat.nr_unused * sizeof(struct inode)) >> PAGE_SHIFT;

return free > pages;
}
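
To make the difference concrete, here is a minimal userspace sketch of the
strict-accounting branch that is present in the 2.4.18-3 listing and missing
from the 2.4.20-18 one. It is not kernel code: the function names
commit_limit and commit_allowed are made up for illustration, the inputs
stand in for total_swap_pages, i.totalram and vm_committed_space, and it
assumes that any mode other than 0, 1 or 2 (i.e. mode 3) gets swap only, as
the first listing reads.

```c
/*
 * Hypothetical userspace model of the mode 2/3 check in the patched
 * vm_enough_memory(). All quantities are page counts. The names here
 * are illustrative assumptions, not kernel symbols.
 */

/* Commit ceiling: swap only, plus half of RAM when mode == 2. */
unsigned long commit_limit(int mode,
                           unsigned long total_swap_pages,
                           unsigned long total_ram_pages)
{
    unsigned long allowed = total_swap_pages;

    if (mode == 2)
        allowed += total_ram_pages >> 1;
    return allowed;
}

/* Returns 1 if the already-charged commitment is under the ceiling. */
int commit_allowed(int mode,
                   unsigned long committed_pages,
                   unsigned long total_swap_pages,
                   unsigned long total_ram_pages)
{
    return committed_pages <
           commit_limit(mode, total_swap_pages, total_ram_pages);
}
```

With 1000 pages of swap and 2000 pages of RAM, this model caps commitments
at 2000 pages in mode 2 and at 1000 pages otherwise. The 2.4.20-18 function
has no equivalent of this path at all: any non-zero sysctl_overcommit_memory
simply returns 1.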

----- Original Message -----
From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Kurt Roeckx" <Q(at)ping(dot)be>; "Matthew Kirkwood"
<matthew(at)hairy(dot)beasts(dot)org>; <pgsql-hackers(at)postgresql(dot)org>
Sent: Saturday, June 14, 2003 5:16 PM
Subject: Re: [HACKERS] Pre-allocation of shared memory ...

> "Andrew Dunstan" <andrew(at)dunslane(dot)net> writes:
> > I *know* the latest RH kernel docs *say* they have paranoid mode that
> > supposedly guarantees against OOM - it was me that pointed that out
> > originally :-). I just checked on the latest sources (today it's RH8,
> > kernel 2.4.20-18.8) to be doubly sure, and can't see the patches.
>
> I think you must be looking in the wrong place. Red Hat's kernels have
> included the mode 2/3 overcommit logic since RHL 7.3, according to
> what I can find. (Don't forget Alan Cox works for Red Hat ;-).)
>
> But it is true that it's not in Linus' tree yet. This may be because
> there are still some loose ends. The copy of the overcommit document
> in my RHL 8.0 system lists some ToDo items down at the bottom:
>
> To Do
> -----
> o Account ptrace pages (this is hard)
> o Disable MAP_NORESERVE in mode 2/3
> o Account for shared anonymous mappings properly
> - right now we account them per instance
>
> I have not installed RHL 9 yet --- is the ToDo list any shorter there?
>
> regards, tom lane
