Re: Postgres server crash

From: "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>
To: pgsql-performance(at)postgresql(dot)org, mstone+postgres(at)mathom(dot)us
Subject: Re: Postgres server crash
Date: 2006-11-19 22:12:01
Message-ID: 4560D6B1.8060306@modgraph-usa.com
Lists: pgsql-performance

Michael Stone wrote:
> At one point someone complained about the ability to configure, e.g.,
> IRIX to allow memory overcommit. I worked on some large IRIX
> installations where full memory accounting would have required on the
> order of 100s of gigabytes of swap, due to large shared memory
> allocations.

These were mostly scientific and graphical apps where reliability took a back seat to performance and to program complexity. They would allocate hundreds of GB of swap space rather than taking the time to design proper data structures. If the program crashed every week or two, no big deal -- just run it again. Overallocating memory is a valuable technique for such applications.

But overallocating memory has no place in a server environment. When memory overcommitment is allowed, it is impossible to write a reliable application, because no matter how carefully and correctly you craft your code, someone else's program that leaks memory like Elmer Fudd's rowboat after his shotgun goes off can kill your well-written application.

Installing Postgres on such a system makes Postgres unreliable.

Tom Lane wrote:
> That might have been right when it was written (note the reference to a
> 2.2 Linux kernel), but it's 100% wrong now.
> [Setting /proc/sys/vm/overcommit_memory to] 0 is the default, not-safe
> setting.

I'm surprised that the Linux kernel people take such an uncritical view of reliability that they set, as *default*, a feature that makes Linux an unreliable platform for servers.
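
For anyone who wants to check what their own box is doing, here is a minimal sketch that reads the setting Tom mentions and says what it means. The function names (describe_overcommit, current_overcommit) are made up for illustration; the mode meanings are the ones given in the Linux kernel's overcommit accounting documentation, and the file will simply not exist on non-Linux systems.

```python
from pathlib import Path

# Meanings of vm.overcommit_memory per the Linux kernel documentation.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting (no overcommit beyond the commit limit)",
}


def describe_overcommit(raw: str) -> str:
    """Translate the raw file contents (e.g. '0\\n') into a description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def current_overcommit(path: str = "/proc/sys/vm/overcommit_memory"):
    """Return a description of the running system's mode, or None if the
    file is absent (for example, on a non-Linux host)."""
    p = Path(path)
    if not p.exists():
        return None
    return describe_overcommit(p.read_text())


if __name__ == "__main__":
    print(current_overcommit())
```

If this reports mode 0 or 1 on a database server, switching to strict accounting is usually done with something like `sysctl -w vm.overcommit_memory=2`, though whether that suits a given machine depends on its swap and workload.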

And speaking of SGI, this very issue was among the things that sank the company. As the low-end graphics cards ate into their visualization market, they tried to become an Oracle Server platform. Their servers were *fast*. But they crashed -- a lot. And memory-overcommit was one of the reasons. IRIX admins would brag that their systems only crashed every couple of weeks. I had HP and Sun systems that would run for years.

Craig
