From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: excessive amounts of consumed memory (RSS), triggering OOM killer
Date: 2014-12-01 22:39:46
Message-ID: 547CEE32.9000606@fuzzy.cz

Hi all,

while working on the patch decreasing the amount of memory consumed by
array_agg [1], I've run into some strange OOM issues. Reproducing them
using the attached SQL script is rather simple.

[1] https://commitfest.postgresql.org/action/patch_view?id=1652

At first I thought there was some rare hidden memory leak, but I'm pretty
sure that's not the case. I've even put explicit counters into aset.c,
counting allocated/freed blocks, and the accounting seems to be working
just fine (and matches the context tree perfectly). So no memory leak.
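
To illustrate the idea (just a standalone sketch with made-up names, not
the actual aset.c patch), the counting amounts to something like this:

#include <stdio.h>
#include <stdlib.h>

/* standalone sketch of the block counting, not the actual aset.c patch */
static long blocks_allocated = 0;
static long blocks_freed = 0;

static void *
counted_block_alloc(size_t size)
{
    blocks_allocated++;         /* one increment per malloc'd block */
    return malloc(size);
}

static void
counted_block_free(void *block)
{
    blocks_freed++;             /* one increment per free'd block */
    free(block);
}

int
main(void)
{
    void *block = counted_block_alloc(8192);

    counted_block_free(block);
    printf("allocated=%ld freed=%ld live=%ld\n",
           blocks_allocated, blocks_freed,
           blocks_allocated - blocks_freed);
    return 0;
}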

The only thing I can think of is some interaction (fragmentation?) with
the system allocator (I'm using glibc-2.19 and kernel 3.16.1, btw),
making the RSS values even less useful than I thought. Sadly, it seems
to trigger the OOM killer :-(
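
To show the kind of effect I mean, here's a standalone sketch (plain C,
not PostgreSQL code) that allocates a lot of ~8kB chunks, frees most of
them and prints VmRSS after each step. On glibc I'd expect the freed
memory to stay counted in RSS, because the surviving chunks keep the heap
from shrinking, until malloc_trim() is explicitly called:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>             /* malloc_trim(), glibc-specific */

#define NCHUNKS 100000
#define CHUNKSZ 8192            /* same order as aset.c block sizes */

static void
print_rss(const char *label)
{
    char line[256];
    FILE *f = fopen("/proc/self/status", "r");

    if (!f)
        return;
    while (fgets(line, sizeof(line), f))
        if (strncmp(line, "VmRSS:", 6) == 0)
            printf("%-14s %s", label, line);
    fclose(f);
}

int
main(void)
{
    static char *chunks[NCHUNKS];
    int i;

    for (i = 0; i < NCHUNKS; i++)
    {
        chunks[i] = malloc(CHUNKSZ);
        if (chunks[i] == NULL)
        {
            fprintf(stderr, "out of memory\n");
            return 1;
        }
        memset(chunks[i], 0xA5, CHUNKSZ);   /* touch the pages */
    }
    print_rss("after malloc:");

    /* free all but every 64th chunk - the survivors pin the heap pages */
    for (i = 0; i < NCHUNKS; i++)
        if (i % 64 != 0)
            free(chunks[i]);
    print_rss("after free:");

    malloc_trim(0);             /* ask glibc to give back what it can */
    print_rss("after trim:");

    return 0;
}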

It's entirely possible that this is a known behavior of the allocator
that I've simply been unaware of. It's also true that the work_mem values
used here are really excessive; realistic values would be much lower,
which would make the issue much less serious.

To demonstrate the problem I'll use the attached SQL script - it's a
rather simple script generating sample tables and then executing trivial
array_agg() queries.

The script sets work_mem to two different values:

work_mem = '1GB' ==> this limits the INSERT (generating data)

work_mem = '1024GB' ==> bogus value, forcing a hash aggregate
(assuming the hash table fits into memory)

The size of the sample tables (and the amount of memory needed for the
hash aggregate) is determined by the first parameter set in the script.
With this value

\set size 10000

I get an OOM crash on the first execution of the SQL script (on a machine
with 8GB of RAM and 512MB shared buffers), but YMMV.

The problem is that even with a much smaller dataset (say, size 7500)
you'll get an OOM error after several executions of the script. The
number of executions needed seems to be inversely proportional to the
size of the data set.

The "problem" is that the RSS amount is increasing over time for some
reason. For example with the "size = 5000", the memory stats for the
process look like this over the first few minutes:

VIRT RES SHR %CPU %MEM TIME+ COMMAND
5045508 2,818g 187220 51,9 36,6 0:15.39 postgres: INSERT
5045508 3,600g 214868 62,8 46,8 3:11.58 postgres: INSERT
5045508 3,771g 214868 50,9 49,0 3:40.03 postgres: INSERT
5045508 3,840g 214868 48,5 49,9 4:00.71 postgres: INSERT
5045508 3,978g 214880 51,5 51,7 4:40.73 postgres: INSERT
5045508 4,107g 214892 53,2 53,4 5:22.04 postgres: INSERT
5045508 4,297g 215276 53,9 55,9 6:22.63 postgres: INSERT
...

Those are rows for the backend process, captured from "top" over time;
the TIME column shows how long the backend has been running. Each
iteration takes ~30 seconds, so those lines represent approximately
iterations 1, 6, 7, 8, 11, etc.
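
(Instead of eyeballing top, a trivial poller along these lines does the
same job - pass it the backend pid, e.g. the one reported by
pg_backend_pid():)

#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* poll VmRSS of the given pid once per second, until the process exits */
int
main(int argc, char **argv)
{
    char path[64];
    char line[256];

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    snprintf(path, sizeof(path), "/proc/%s/status", argv[1]);

    for (;;)
    {
        FILE *f = fopen(path, "r");

        if (!f)
            break;              /* process gone */
        while (fgets(line, sizeof(line), f))
            if (strncmp(line, "VmRSS:", 6) == 0)
                fputs(line, stdout);
        fclose(f);
        fflush(stdout);
        sleep(1);
    }
    return 0;
}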

Notice how the RSS value grows over time, and also notice that this is
the INSERT, restricted by work_mem=1GB. So the memory consumption should
be ~1.5GB, and MemoryContextStats(TopMemoryContext) collected at this
point is consistent with that (see the mem-ctx.log).

And then the value stabilizes at ~4,430g and stops growing. With size
7500, however, it takes only ~20 iterations to reach the OOM, with a
crash log like this:

[10560.843547] Killed process 15862 (postgres) total-vm:7198260kB,
anon-rss:6494136kB, file-rss:300436kB

So, any ideas what might be the culprit here?

As I said, this is clearly made worse by inappropriately high work_mem
values, but I'm not sure it's completely harmless. Imagine, for example,
long-running backends executing complex queries with inaccurate
estimates. That may easily result in using much more memory than the
work_mem limit, and in the RSS value growing over time.

regards
Tomas

Attachment Content-Type Size
array-agg.sql application/sql 3.2 KB
mem-ctx.log text/x-log 8.2 KB
