Re: Linux server connection process consumes all memory

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-novice(at)postgresql(dot)org, ahodgson(at)simkin(dot)ca
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-06 17:02:02
Message-ID: CAHyXU0z+xjcW-7nDj4hpZEo2pX7MQajeZ_NYVjiMpF4RoQK5TA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

On Tue, Dec 6, 2011 at 3:14 AM, Ioannis Anagnostopoulos
<ioannis(at)anatec(dot)com> wrote:
> On 05/12/2011 21:13, Merlin Moncure wrote:
>
> On Mon, Dec 5, 2011 at 4:41 AM, Ioannis Anagnostopoulos
> <ioannis(at)anatec(dot)com> wrote:
>> Below is the forum discussion I have opened with Devart's ado.net driver
>> and
>> the NPGSQL (opensource) driver.
>>
>> http://www.devart.com/forums/viewtopic.php?p=76192#76192
>> http://pgfoundry.org/forum/forum.php?thread_id=10481&forum_id=519
>>
>> As I say there, I managed to reduce my queries to 55.000 from 22 million.
>> The server in this scenario is not running out of memory
>> but it is still apparent that memory consumption is high (8.9%!!). So if
>> 20
>> people try to run the same query we are going to be back
>> in square 1. For one more time let me assure that there is ONLY one
>> prepared
>> statement that is created at the beginning and is executed
>> over and over again. My question to postgres people is IF there is any bug
>> that becomes apparent as a memory leak after a lot of executions
>> of the same statement.  My next test will be to convert my .net code to
>> PGSql and execute it on the server without involving any client.
>>
>> The top below show the "updated" query running. This time the load is at
>> 8.9% right at the end of the run. Nowhere near the 96% but please
>> keep in mind that this happened when I reduced to iterations to 55.000.
>>
>> top - 10:35:23 up 96 days,  1:30,  1 user,  load average: 0.00, 0.01, 0.05
>> Tasks:  87 total,   1 running,  86 sleeping,   0 stopped,   0 zombie
>> Cpu(s):  1.5%us,  0.0%sy,  0.0%ni, 98.5%id,  0.0%wa,  0.0%hi,  0.0%si,
>>  0.0%st
>> Mem:   8070900k total,  8017768k used,    53132k free,    56800k buffers
>> Swap:  7811068k total,     4336k used,  7806732k free,  7671980k cached
>>
>>
>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 22181 postgres  20   0 2187m 701m 697m S    6  8.9   1:29.33 postgres
>
> You are misinterpreting what top is telling you (it's a pretty common
> mistake).  Most of the memory postgres is using as you can see is
> inside SHR, or is shared memory that is shared between all backend
> processes.
>
> Hello Merlin, thank you for the answer. Well the example that you quote is
> from
> a sunshine scenario were I had reduced the iteration to something more down
> to earth
> and the process concluded. However if you followed the other examples I
> listed in earlier
> post you will see cases like this:
>
> top - 11:46:05 up 85 days, 2:41, 1 user, load average: 0.67, 1.03, 1.19
> Tasks: 89 total, 1 running, 88 sleeping, 0 stopped, 0 zombie
> Cpu(s): 10.3%us, 0.7%sy, 0.0%ni, 69.8%id, 19.2%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 8070900k total, 8017232k used, 53668k free, 1508k buffers
> Swap: 7811068k total, 4283720k used, 3527348k free, 2088528k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 15965 postgres 20 0 12.9g 6.4g 1.4g S 11 83.4 13:59.15 postgres
> 15961 postgres 20 0 4285m 1.8g 1.8g D 31 23.2 9:35.03 postgres
> 15827 postgres 20 0 4276m 52m 51m S 0 0.7 0:00.50 postgres
> 15830 postgres 20 0 4278m 25m 24m S 0 0.3 0:01.27 postgres
> 15959 postgres 20 0 4288m 4376 4288 S 0 0.1 0:00.25 postgres
> 15832 postgres 20 0 4279m 1388 888 S 0 0.0 0:00.19 postgres
>
> This is an extract from top and below is an extract of iostat:
>
> Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
> sda 158.00 1802.00 1602.00 3604 3204
> sdb 70.50 590.00 40.00 1180 80
> sdc 16.00 528.00 0.00 1056 0
>
> Please note that process 15965 is my connection process. In that example we
> have
> VIRT = 12.9G
> RES = 6.4G
> SHR=1.4G

Interesting. Well, in your 'sunshine scenario', postgres non shared
memory usage is basically zero -- growth of memory consumption in SHR
does not in any suggest runaway growth -- so memory consumption is not
in fact 8.9% but is < 1% and is basically a completely normal memory
profile. This also tends to rule out incremental memory consumption
issues that are a general class of problem when you are dealing with
huge numbers of system objects, like prepared statements, tables,
schemas, etc (but not records).

This means that something changed in your bigger job that caused
memory consumption to go completely out of whack....5 gigs of resident
memory is definitely not normal. Given the above. my first suspicion
is around work_mem -- what's it set to? Also, are you using any third
party sever-side modules (like PostGIS?) C functions? pl/java? A more
unlikely but possible scenario is an absolutely pathological query
plan that has gone totally haywire.

What we would need to know is exactly what is happening at the precise
point int time your resident memory starts running away from SHR.
Helpful information might include:

*) if your program is continually processing queries or is stuck at a
single query (pg_stat_activity system view is your friend here)

*) if we luck out and a single query is doing it, we'd need to see the
query and the plan (explain the query).

*) your setting for work_mem and maintenance_work_mem

*) anything else that's interesting...large numbers of tables/views/etc?

*) You may want to consider changing your vm over commit settings
and/or reducing swap in order to get your server to more aggressively
return OOM to postgres memory allocation. The specific error returned
to postgres for an OOM of course would be very helpful.

*) Using the binary protocol? custom C types? A forged/invalid datum
size can cause postgres to grab a lot of memory (data corruption can
also do this) -- but it would take more than one consecutive one to do
it.

merlin

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Tom Lane 2011-12-06 17:10:46 Re: Linux server connection process consumes all memory
Previous Message Ioannis Anagnostopoulos 2011-12-06 09:14:23 Re: Linux server connection process consumes all memory