Re: out of memory woes

From: "Brandon Aiken" <BAiken(at)winemantech(dot)com>
To: "Angva" <angvaw(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: out of memory woes
Date: 2006-12-14 16:41:57
Message-ID: F8E84F0F56445B4CB39E019EF67DACBA4015F1@exchsrvr.winemantech.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I wonder if this is related to the Linux memory overcommit problem:

http://www.postgresql.org/docs/current/interactive/kernel-resources.html
#AEN19361

--
Brandon Aiken
CS/IT Systems Engineer

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Angva
Sent: Wednesday, December 13, 2006 4:49 PM
To: pgsql-general(at)postgresql(dot)org
Subject: [GENERAL] out of memory woes

Hi everyone,

First, this group has been good to me, and I thank you guys for the
valuable help I've found here. I come seeking help with another
problem. I am not even sure my problem lies in Postgres, but perhaps
someone here has had a similar problem and could point me in the right
direction.

As I've mentioned in a few other posts, I run a daily job that loads
large amounts of data into a Postgres database. It must run
efficiently, so one of the tricks I do is run table loads, and commands
such as cluster, in parallel. I am having a problem where my cluster
job consistently fails with an out of memory error. It did not always
do this, but at some point a memory leak or "something" built up to the
point where clustering would always fail on one of the two largest
tables. That is, four tables are clustered in parallel. The smaller of
the two finish successfully. The remaining two - the largest - run for
several minutes. Then one of the tables - not always the same one -
gets an out of memory error and fails.

So, suspecting a memory leak, I tried bouncing Postgres, and ran the
clusters again. No luck - failed in the same manner.

I don't know if it means anything, but swap never seems to be used by
the postgres processes (I stared at top and vmstat for a while), though
the VIRT column in top definitely shows something. Here are sample
lines from top while two processed are running:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30692 postgres 25 0 1033m 854m 459m R 100 14.3 1:10.68 postgres:
secmaster dev_stage [local] SELECT
30691 postgres 23 0 1752m 1.6g 459m R 100 27.5 2:55.60 postgres:
secmaster dev_stage [local] SELECT

(You see SELECT instead of CLUSTER because I wrapped up my cluster
commands in functions. I call them from psql by selecting the
function.)

Sample vmstat output:

procs -----------memory---------- ---swap-- -----io---- --system--
----cpu----
r b swpd free buff cache si so bi bo in cs us
sy id wa
3 2 724 18392 5052 5599624 0 0 915 2352 5 3 11
2 80 8

It seems that the postgres processes do not want to use swap -- swap
never increases as the processes run. Again I am not sure whether this
is significant.

If I run the clusters sequentially, there is no error - they just take
too long. I was out of ideas so I bounced the server entirely and ran
the clusters in parallel a few times - success! But I don't want to
have to bounce the server regularly.

So, I'm not sure what to do. I need to figure out why the server would
deteriorate in such a way. Any advice that can be offered would be much
appreciated. I can provide any additional information that might be
necessary.

I am running Postgres 8.1.2 on CentOS 4.4 64-bit.

Thanks a lot,
Mark

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ian Harding 2006-12-14 16:47:38 Re: Subcribing to this list, what's the secret?
Previous Message Tom Lane 2006-12-14 16:27:10 Re: out of memory woes