Skip site navigation (1) Skip section navigation (2)

Re: Linux server connection process consumes all memory

From: Pandu Poluan <pandu(at)poluan(dot)info>
To: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ahodgson(at)simkin(dot)ca, pgsql-novice(at)postgresql(dot)org, Merlin Moncure <mmoncure(at)gmail(dot)com>
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-07 11:23:56
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-novice
On Dec 7, 2011 4:26 PM, "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>
> On 06/12/2011 17:10, Tom Lane wrote:
>> Merlin Moncure<mmoncure(at)gmail(dot)com>  writes:
>>> *) You may want to consider changing your vm over commit settings
>>> and/or reducing swap in order to get your server to more aggressively
>>> return OOM to postgres memory allocation.  The specific error returned
>>> to postgres for an OOM of course would be very helpful.
>> Yeah.  I would try starting the postmaster under smaller ulimit settings
>> so that the kernel gives it ENOMEM before you start getting swapped.
>> When that happens, the backend will dump a memory map into the
>> postmaster log that would be very useful for seeing what is actually
>> happening here.
>>                        regards, tom lane
> Hello all,
> I think I have solved the problem. Many thanks for the support and the
time you spend. The solution/bug/problem is as follows:
> 1. There was one connection that as I described was used IN A LOOP
22million times. This connection was assigned a PID x (on the linux server)
> 2. Nested within this LOOP there was another connection that had been
forgotten from past code and the linux server was assigning to it a PID y
> 3. PID y was of course called also 22million times (since it was in the
loop). However it had a nasty bug and it was creating constantly prepared
commands! (opps my mistake). So PID y was creating 22million prepared
> 4. As I had no clue that that there was at all PID y, monitoring the TOP
on the server I was presented with the misbehaving PID y but I was of the
impression that it was PID x. In fact PID x was below in the list happy
doing its own job.
> So the healthy PID X had a top signature as follows (please note the
difference between RES and SHR as well as the magnitude in Mb as Merlin
> 30475 postgres  20   0 2187m 746m 741m S   31  9.5   0:41.48 postgres
> While the unhealthy PID Y had a TOP signature (please note that RES
memory is at 12.9g! and SHR 1.4g as well as the magnitude in Gb!):
> 15965 postgres 20 0 12.9g 6.4g 1.4g S 11 83.4 13:59.15 postgres
> As I said I had no clue about the existence of PID Y and since it was
coming top at the TOP list I had wrongfully assumed that it was the PID X.
It gets more complicated by the fact that the test code I sent you, which
should have been working fine as it had no nested buggy loop, was mainly
running from home over the DSL line thus I never let it conclude its
22million iterations (it would have been still running!) instead I was
monitoring the TOP and since the memory was going UP I was wrongfully
assuming that I had the same issue (if I had let it run for 2 -3 hours I
would have noticed what Merlin suggested about RES/SHR ratio). So it was a
misdiagnosis after all :)
> I hope this explains everything.
> Kind Regards and sorry for the misunderstanding.

All's well that ends well.

May I ask how you finally found the culprit? Through profiling or code

That said, I can't help admiring that PostgreSQL still survives 22 million
prepared commands without committing suicide. Yes, it's severely impacted,
but still survives.


In response to


pgsql-novice by date

Next:From: Ioannis AnagnostopoulosDate: 2011-12-07 12:14:21
Subject: Re: Linux server connection process consumes all memory
Previous:From: Ioannis AnagnostopoulosDate: 2011-12-07 09:24:38
Subject: Re: Linux server connection process consumes all memory

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group