Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion

From: Valentine Gogichashvili <valgog(at)gmail(dot)com>
To: Frank van Vugt <ftm(dot)van(dot)vugt(at)foxi(dot)nl>
Cc: PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion
Date: 2013-10-19 19:02:25
Message-ID: CAP93muUfwRvg6EpdNZh=5viaZUUXfzm1do3EkkKBTJiC-9Jeaw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi Frank,

this is a little bit not relevant to the question itself. But to prevent
OOM killer from currupting your database please consider this for your
production environments:
http://www.postgresql.org/docs/9.1/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT

ვალენტინ გოგიჩაშვილი
Valentine Gogichashvili

On Sat, Oct 19, 2013 at 3:38 PM, Frank van Vugt <ftm(dot)van(dot)vugt(at)foxi(dot)nl>wrote:

> L.S.
>
> Something seems wrong here.... when applying arrag_agg() on a large
> recordset,
> above a certain size 'things fall over' and memory usage races off until
> the
> system runs out of it:
>
>
> # select version();
> version
>
> -----------------------------------------------------------------------------------
> PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.7.1,
> 64-bit
>
>
> # \! free -g
> total used free shared buffers cached
> Mem: 31 1 30 0 0 0
> -/+ buffers/cache: 0 30
> Swap: 31 0 31
>
>
>
> ====================
> Create concatenate function and aggregate:
>
> CREATE OR REPLACE FUNCTION comma_cat(text, text)
> RETURNS text
> LANGUAGE 'plpgsql'
> IMMUTABLE
> STRICT
> SECURITY INVOKER
> AS '
> BEGIN
> IF $1 = '''' THEN
> RETURN $2;
> ELSIF $2 = '''' THEN
> RETURN $1;
> ELSE
> RETURN $1 || '', '' || $2;
> END IF;
> END;';
> CREATE AGGREGATE comcat(text) ( SFUNC = comma_cat, STYPE = text );
>
>
> Activate timing:
>
> \timing
> Timing is on.
>
>
> Create demo data:
>
> create temp table f as
> select id, random() as value
> from generate_series(1, 1e7::int) as f(id);
>
> Time: 7036,917 ms
>
>
>
> ====================
> Don't mind the 'usefulness' of the exact query below, I ran into this issue
> when experimenting a bit using random().
>
>
>
>
> On my system, using the comcat() aggregate is no problem regardless of the
> size of the recordset:
>
> with g as (select * from f limit 1e5)
> select comcat(id::text), min(value)
> from g
> group by g.value
> having count(1) > 1;
> Time: 189,835 ms
>
>
> with g as (select * from f limit 1e6)
> select comcat(id::text), min(value)
> from g
> group by g.value
> having count(1) > 1;
> Time: 1815,756 ms
>
>
> with g as (select * from f)
> select comcat(id::text), min(value)
> from g
> group by g.value
> having count(1) > 1;
> Time: 18660,326 ms
>
> ====================
> However, using the array_agg() this breaks (on my system ) on the largest
> set:
>
> with g as (select * from f limit 1e5)
> select array_agg(id), min(value)
> from g
> group by g.value
> having count(1) > 1;
> Time: 361,242 ms
>
>
> with g as (select * from f limit 1e6)
> select array_agg(id), min(value)
> from g
> group by g.value
> having count(1) > 1;
> Time: 3310,347 ms
>
>
> with g as (select * from f)
> select array_agg(id), min(value)
> from g
> group by g.value
> having count(1) > 1;
> Time: <none, fails>
>
>
> => the last query eats up all 32GB main memory in seconds, then starts on
> the
> 32GB swap (which obviously takes a bit longer to digest) until eventually
> the
> child process gets killed by the oom-killer and postgresql restarts....
>
>
>
>
>
>
> --
>
> Best,
>
>
>
>
> Frank.
>
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message jackie.qq.chang 2013-10-19 21:49:05 BUG #8540: Avoid sscanf buffer overflow
Previous Message Frank van Vugt 2013-10-19 13:38:40 array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion