Re: PG Killed by OOM Condition

From: daveg <daveg(at)sonic(dot)net>
To: Bruno Wolff III <bruno(at)wolff(dot)to>, mark(at)mark(dot)mielke(dot)cc, John Hansen <john(at)geeknet(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PG Killed by OOM Condition
Date: 2005-10-25 05:52:17
Message-ID: 20051025055217.GD8157@sonic.net
Lists: pgsql-hackers

On Mon, Oct 24, 2005 at 11:26:52PM -0500, Bruno Wolff III wrote:
> On Mon, Oct 24, 2005 at 23:55:07 -0400,
> mark(at)mark(dot)mielke(dot)cc wrote:
> > On Mon, Oct 24, 2005 at 10:20:39PM -0500, Bruno Wolff III wrote:
> > > On Mon, Oct 03, 2005 at 23:03:06 +1000,
> > > John Hansen <john(at)geeknet(dot)com(dot)au> wrote:
> > > > Good people,
> > > > Just had a thought!
> > > > Might it be worth while protecting the postmaster from an OOM Kill on
> > > > Linux by setting /proc/{pid}/oom_adj to -17 ?
> > > > (Described vaguely in mm/oom_kill.c)
> > > Wouldn't it be better to use sysctl to tell the kernel not to over commit
> > > memory in the first place?
> >
> > Only if you don't have large processes in your system that fork()
> > frequently, pushing the reserved memory over the limit, preventing
> > PostgreSQL from allocating memory when it does need it, even though
> > copy-on-write allows plenty of memory to continue to be available -
> > it is just reserved... :-)
> >
> > There isn't a perfect answer.
>
> No, but I would think tying up some disk space as swap space would be a
> better solution. The linux oom killer is really dangerous.

I work with a client that runs 16 GB of memory with 16 GB of swap on dual Opterons
dedicated to postgres. They have large tables and like hash joins, as they are
often the fastest way to a result, so work_mem is set fairly large. Sometimes
postgres is very inaccurate in predicting real memory use versus work_mem and
will grow very much larger than expected. This can result in two or more
postgres processes with over 10 GB of virtual memory, along with the usual 60
or so normal-sized ones.

When this happens, the machine runs out of memory and swap. Without the OOM
killer it simply hangs, which is inconvenient as the machine is at a remote
location. The OOM killer usually lets the machine recover and postgres restart
without a hard reboot.

A solution is to use ulimit to set the maximum memory available to a
process. Ideally this would be a pg_ctl or postmaster option so that all the
forked postgresql processes would inherit the ulimit. The advantage over the
oom killer is that only the overly large process fails, and it fails with an
out of memory error and exits cleanly as opposed to having the whole set
of backends restarted.
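
To illustrate the ulimit idea, here is a minimal sketch of a launcher that
caps the address space of a child process before it is exec'ed, so that
everything the child later forks inherits the same cap. It is not an
existing pg_ctl or postmaster option. The function name
run_with_memory_limit and the choice of RLIMIT_AS (the limit that
"ulimit -v" sets) are assumptions made for this example.

```python
import resource
import subprocess


def run_with_memory_limit(argv, max_bytes):
    """Run argv as a child whose virtual address space is capped at max_bytes.

    Hypothetical helper for illustration only. The limit is applied in the
    child between fork() and exec(), so the launched program and every
    process it forks afterwards inherit it.
    """

    def apply_limit():
        # An unprivileged process cannot raise its hard limit, so never
        # request more than the hard limit already in force.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            new_limit = max_bytes
        else:
            new_limit = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new_limit, new_limit))

    # preexec_fn runs in the child only; the launcher itself stays unlimited.
    return subprocess.run(
        argv, preexec_fn=apply_limit, capture_output=True, text=True
    )
```

A process that tries to grow past the cap gets an allocation failure from
the kernel instead of drawing the OOM killer, which is the behaviour argued
for above. Note that RLIMIT_AS counts all mapped address space, which would
include shared memory segments, so a real limit would need headroom for
that.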

-dg

--
David Gould daveg(at)sonic(dot)net
If simplicity worked, the world would be overrun with insects.
