Re: BUG #16389: OOM on CTE since version 12

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: rikard(at)ngs(dot)hr, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16389: OOM on CTE since version 12
Date: 2020-04-25 13:13:02
Message-ID: 20200425131302.xaqjze4ejizilth4@development
Lists: pgsql-bugs

Hi Rikard,

On Sat, Apr 25, 2020 at 07:08:06AM +0000, PG Bug reporting form wrote:
>The following bug has been logged on the website:
>
>Bug reference: 16389
>Logged by: Rikard Pavelic
>Email address: rikard(at)ngs(dot)hr
>PostgreSQL version: 12.2
>Operating system: Centos 7, Linux 3.10
>Description:
>
>Hi, we are testing upgrade from 11 to 12 and encountered some strange
>OOMs.
>This one was consistently happening on several instances, while it passed
>normally on many other ones:
>
>org.postgresql.util.PSQLException:ERROR: out of memory
> Detail: Failed on request of size 52 in memory context "ExecutorState".
>Where: "WITH source AS (
> SELECT
> i,
> row_number() over () as ord,
> (t.old).amount_array AS old,
> (t.new).amount_array AS new,
> coalesce(array_upper((t.old).amount_array, 1), 0) AS old_count,
> coalesce(array_upper((t.new).amount_array, 1), 0) AS new_count,
> (SELECT array_agg(row(_r.pk1, _r.pk2)) FROM unnest((t.old).amount_array)
>_r) as old_pks,
> (SELECT array_agg(row(_r.pk1, _r.pk2)) FROM unnest((t.new).amount_array)
>_r) as new_pks,
> (SELECT array_agg(i) FROM generate_series(1, CASE WHEN
>coalesce(array_upper((t.old).amount_array, 1), 0) >
>coalesce(array_upper((t.new).amount_array, 1), 0) THEN
>array_upper((t.old).amount_array, 1) ELSE array_upper((t.new).amount_array,
>1) END) i) as indexes
> FROM global_temp_input_table t
> WHERE
> NOT (t.old).amount_array IS NULL AND (t.new).amount_array IS NULL
> OR (t.old).amount_array IS NULL AND NOT (t.new).amount_array IS NULL
> OR NOT (t.old).amount_array IS NULL AND NOT (t.new).amount_array IS NULL
>AND (t.old).amount_array != (t.new).amount_array
>)
>INSERT INTO global_temp_update
>SELECT i, index, old[index] AS old,
> CASE
> WHEN old_pks[index] = new_pks[index] THEN new[index]
> WHEN NOT COALESCE(old_pks[index] = ANY (new_pks), false) THEN null
> ELSE (select n from unnest(new) n where n.pk1 = old[index].pk1 AND n.pk2 =
>old[index].pk2)
> END AS changed,
> new[index] AS new,
> CASE WHEN old_pks[index] = new_pks[index] THEN false ELSE NOT
>new_pks[index] IS NULL AND NOT COALESCE(new_pks[index] = ANY (old_pks),
>false) END AS is_new
>FROM
>(
> SELECT i as _i, ord as _ord, unnest((SELECT array_agg(i) FROM
>generate_series(1, CASE WHEN old_count > new_count THEN old_count ELSE
>new_count END) i)) as index
> FROM source s
>) ix
>INNER JOIN source sq ON ix._i = sq.i AND ix._ord = sq.ord"
>
>The global input temp tables look like this
>
>CREATE UNLOGGED TABLE global_temp_input_table
>(
> i integer,
> old data_source_view,
> new data_source_view
>);
>
>where data_source_view is a view with an array of composites (amount_array)
>which gets unnested and compared here.
>
>While the query is not optimal, the OOMs stopped after we put MATERIALIZED
>on top (which does not seem necessary as it's used twice).
>Anyway, it's not an immediate problem for us anymore, as we've rewritten it
>into a saner form,
>but it hints at some underlying issue with recent CTE optimization.
>

It's difficult to say what went wrong based just on the query text, but
seeing that you're using composites as columns, I wouldn't be surprised
if this was a case of poor estimates. I'd bet we only build stats (MCV,
histograms, ...) for the whole composite value, not for its individual
fields, which means estimates for the various conditions may be
arbitrarily off. That can easily lead to plans using e.g. a hash
aggregate and running out of memory. It's possible some of the instances
were simply "lucky" not to trigger this, e.g. due to slightly different
data.
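
If you want to see what ANALYZE actually collected for those columns,
a quick look at pg_stats should be enough (just a sketch, using the
table name from your query, and assuming the table got analyzed):

-- which stats exist for the columns of the input table (if any)
SELECT attname, null_frac, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'global_temp_input_table';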

It'd be good to see EXPLAIN output for both the original and the
modified query, and EXPLAIN ANALYZE for both (or at least for the
modified one, if the original one still ends with OOM).
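
Note that since the statement is an INSERT, EXPLAIN ANALYZE will
actually execute it, so the easy way to run it without keeping the rows
is something like (just a sketch):

BEGIN;
EXPLAIN (ANALYZE, BUFFERS) <the whole WITH ... INSERT query>;
ROLLBACK;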

Another thing you might do is grab memory context stats from the
original query before it fails with OOM. That'll tell us which part of
the query allocates the memory. You'll need gdb or some other debugger,
roughly like this:

-- in database
select pg_backend_pid();
... run query ...

-- in shell (wait until the query allocates enough memory)
gdb -p $the_backend_pid
(gdb) p MemoryContextStats(TopMemoryContext)
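
The context stats get written to the backend's stderr, so they'll show
up in the server log (or wherever the server's stderr goes), not in the
psql session. After that you can simply detach and quit gdb.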

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
