Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion

From: Frank van Vugt <ftm(dot)van(dot)vugt(at)foxi(dot)nl>
To: pgsql-bugs(at)postgresql(dot)org
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion
Date: 2013-10-20 08:26:53
Message-ID: 1615181.fDHOPIPOzk@techfox.foxi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

Interesting read so far!

Op zondag 20 oktober 2013 04:48:01 schreef Tomas Vondra:
> because the example is constructed so that groups with multiple elements
> are very unlikely).

yep, the original intention of the 'experiment' was to investigate how often
random() would 'clash' on various machines with/without hardware number
generators, etc

> the report says that it consumed ~32GB RAM and swap (not sure how
> much, but probably not a small amount). On my machine it easily ate 8GB
> of RAM and 4GB of swap (and then got shot by OOM).

amount of swap is 32GB as well (see output of 'free' on top of report ;) )

> Anyway, I disabled the preallocation (i.e. 1 element initially, +1 for
> each iteration) which should be ~80MB of data, but even then I was
> unable to execute that query.
>
> The effect on smaller queries (say, 1e6 rows) was negligible too - it
> consumed more or less the same amount of memory, irrespectedly of the
> preallocation.

on my setup, running the array_agg() over 1e6 records uses up ~5GB

> With 1e6 groups that's ~8GB (not really far from what I see here), and
> with 1e7 groups it's ~80GB. Not the most efficient approach for 80MB of
> values.

exactly, the 'scaling' of memory hunger was not what I'd expect from such a
recordset, which made me write up the report

> The failing query is slightly artificial, but pretty much any array_agg
> query with large number of groups is going to fail exactly the same. So
> if we could improve that somehow, that'd be nice.

exactly

> But let's say it's a minor issue, and by switching to a shared memory
> context we've already saved ~50% memory for such these cases (single
> element in a group, 1kB chunk - 64*8B = 512B).

I never ran into this earlier (and I've been using PostgreSQL for quite some
time now) and even now, this is not about a 'production query', so I'm happy
with any and all benefits for future versions that come from this ;)

Having said that, should you want me to check the effects of some patch, just
let me know, I'd be happy to do that.

--

Best,

Frank.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tomas Vondra 2013-10-20 10:57:43 Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion
Previous Message Frank van Vugt 2013-10-20 08:26:39 Re: array_agg() on a set larger than some arbitrary(?) limit causes runaway memory usage and eventually memory exhaustion