Re: Out of memory

From: Jeremy Palmer <JPalmer(at)linz(dot)govt(dot)nz>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Out of memory
Date: 2011-04-12 10:11:54
Message-ID: 666FB8D75E95AE42965A0E76A5E5337E06DCC19F68@prdlsmmsg01.ad.linz.govt.nz
Lists: pgsql-general

Well, after a few days of further investigation I still can't track the issue down. The main problem is that I can only reproduce the error by running the whole transaction, so I can't isolate it to a simple test case or even a smaller subset of the transaction, which would have been nice for posting to this list.

Does anyone have an idea of how I might tackle this problem now? Should I try further reducing the memory settings, or install a debug build of PostgreSQL to get more information about what is going on before the error occurs? Any advice would be greatly appreciated.
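One cheap way to narrow the spike down before reaching for a debug build (a sketch of my own, not something from the thread) is to poll the backend process's resident set size from /proc while the transaction runs, and match the timestamps against the function's progress. The one-second interval is an arbitrary choice:

```python
import os
import time

def rss_mb(pid):
    """Resident set size of a process in MB, read from /proc (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) // 1024  # value is reported in kB

# Replace os.getpid() with the backend PID from SELECT pg_backend_pid();
# polling once a second is plenty to catch a multi-GB spike.
pid = os.getpid()
for _ in range(3):
    print(time.strftime("%H:%M:%S"), rss_mb(pid), "MB")
    time.sleep(1)
```

Logging this alongside RAISE NOTICE timestamps from the plpgsql function would show which step the growth belongs to.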

Regards,
Jeremy

________________________________________
From: Jeremy Palmer
Sent: Tuesday, 5 April 2011 9:50 p.m.
To: pgsql-general(at)postgresql(dot)org
Subject: Out of memory

Hi,

I've been having repeated trouble trying to get a PostgreSQL app to play nicely on Ubuntu. I recently posted a message on this list about an out of memory error and got a resolution by reducing the work_mem setting. However, I'm now getting further out of memory errors during the same stage of the plpgsql function mentioned before.

The function itself is run as part of a larger transaction which does the following:

1/ Maintains 104 tables (15 PostGIS tables), by loading and applying incremental table changes. A typical incremental load will maintain about 10,000 rows.

2/ When each of these tables is updated, an AFTER trigger is fired that maintains an associated revision table.

3/ After all of the tables are maintained, a plpgsql function is called to build/maintain a set of de-normalised tables. These tables total about 20GB. Each of these tables is compared against the previous table revision to determine its row changes. It's in this function that the out of memory exception occurs.

The server log error message I'm getting in the function is here: http://pastebin.com/346zi2sS. It's very long and contains the top transaction memory context debug info.

My initial observation about this error is that maybe PostgreSQL is encountering a memory corruption error because the amount of OS memory does not seem to run out. The plpgsql function uses functions from both PostGIS and pgc_checksum (http://pgfoundry.org/projects/pg-comparator) - so maybe they are the cause of the problem. Or maybe I have configured something wrong...

I did some memory logging during the execution of the function. It shows that for the majority of the transaction the actual memory used is about 1GB (growing from an initial 600MB), with about 6.5GB cached by the OS:

                    total     used     free   shared  buffers   cached
Mem:                 8004     7839      165        0        0     6802
-/+ buffers/cache:            1037     6967
Swap:                 397        0      397
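For anyone reading these tables: the "-/+ buffers/cache" row is just arithmetic on the "Mem:" row, since buffers and page cache are reclaimable. A quick sketch of that calculation, using the values from the output above:

```python
def buffers_cache_row(total, used, free, buffers, cached):
    """Recompute free(1)'s '-/+ buffers/cache' row from the Mem: row (MB)."""
    actually_used = used - buffers - cached   # buffers/cache are reclaimable
    actually_free = free + buffers + cached
    return actually_used, actually_free

# Mem: row from the output above.
print(buffers_cache_row(8004, 7839, 165, 0, 6802))  # -> (1037, 6967)
```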

But just before the out of memory error occurs there is a spike to about 2.8GB of used memory, while about 4.8GB is still cached by the OS:

                    total     used     free   shared  buffers   cached
Mem:                 8004     7702      301        0        0     4854
-/+ buffers/cache:            2848     5156
Swap:                 397        0      397

Then after the error the memory slowly returns to this state:

                    total     used     free   shared  buffers   cached
Mem:                 8004     1478     6526        0        0     1133
-/+ buffers/cache:             345     7659
Swap:                 397        0      397

The OS I'm running is:

Linux TSTLHAPP01 2.6.32-29-server #58-Ubuntu SMP Fri Feb 11 21:06:51 UTC 2011 x86_64 GNU/Linux.

It's running on VMware and has 2 CPUs and 8GB of RAM. This VM is dedicated to PostgreSQL; not much else is running other than cacti, ssh and ftp server daemons. The main OS parameters I have tuned are:

vm.swappiness=0
vm.overcommit_memory=2
kernel.shmmax = 4196769792
kernel.shmall = 1024602
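One thing worth checking with these settings (my own observation, not something established elsewhere in this mail): with vm.overcommit_memory=2 the kernel caps total committed memory at swap plus overcommit_ratio percent of RAM (overcommit_ratio defaults to 50). With only ~400MB of swap on an 8GB box, that limit sits far below physical RAM, so a backend can get ENOMEM while gigabytes are still sitting in cache. A sketch of the arithmetic, assuming the default ratio:

```python
def commit_limit_mb(ram_mb, swap_mb, overcommit_ratio=50):
    """CommitLimit (MB) under vm.overcommit_memory=2; cf. /proc/meminfo."""
    return swap_mb + ram_mb * overcommit_ratio // 100

# RAM and swap figures from the 'free' output in this mail.
print(commit_limit_mb(8004, 397))  # -> 4399, i.e. ~4.3GB on an 8GB box

# The shared-memory settings at least agree with each other:
# shmall is counted in 4096-byte pages.
assert 1024602 * 4096 == 4196769792
```

If that is what's biting, adding swap, raising vm.overcommit_ratio, or going back to vm.overcommit_memory=0 would move or remove the failure point.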

And the PostgreSQL is:

PostgreSQL 9.0.3 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit.

The main changed postgresql.conf parameters I've tuned are:

shared_buffers = 512MB
maintenance_work_mem = 512MB
temp_buffers = 256MB
work_mem = 1MB
wal_buffers = 16MB
effective_cache_size = 4094MB
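As a sanity check on these values, here is a very rough worst-case budget, using common rules of thumb rather than any authoritative formula; the assumption that a complex query might use work_mem up to four times concurrently is mine:

```python
def rough_budget_mb(shared_buffers, wal_buffers, maintenance_work_mem,
                    backends, work_mem, temp_buffers, sorts_per_query=4):
    """Crude worst-case memory budget in MB; rule-of-thumb only."""
    shared = shared_buffers + wal_buffers
    per_backend = work_mem * sorts_per_query + temp_buffers
    return shared + maintenance_work_mem + backends * per_backend

# postgresql.conf values above, with the 1-2 connections described.
print(rough_budget_mb(512, 16, 512, 2, 1, 256))  # -> 1560
```

That total is comfortably under 8GB, which is another hint the failure is an allocation limit rather than genuine exhaustion.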

The size of the database is 350GB. The typical number of users connected to the database is 1 or 2. This database is used for loading external data, managing revision table information and generating and outputting de-normalised datasets, so it does not have a high number of transactions running. Typically 1 large one per day.

Two questions:

1) Have I set the OS and PostgreSQL parameters to sensible values given the hardware and database utilisation?
2) Can anyone help me make sense of the top transaction memory error to help track down the issue?

Any other suggestions would be greatly appreciated.

Thanks
Jeremy
______________________________________________________________________________________________________

This message contains information, which is confidential and may be subject to legal privilege.
If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message.
If you have received this message in error, please notify us immediately (Phone 0800 665 463 or info(at)linz(dot)govt(dot)nz) and destroy the original message.
LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ.

Thank you.
______________________________________________________________________________________________________
