DO with a large amount of statements get stuck with high memory consumption

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: DO with a large amount of statements get stuck with high memory consumption
Date: 2016-07-12 19:29:10
Message-ID: CAHyXU0x24k3nATzNWswzHSdzk39On0GfgFtbvZD=anSQSBHcNQ@mail.gmail.com
Lists: pgsql-hackers

I've noticed that PL/pgSQL functions and DO commands do not behave
well when the statement finishes and has to free its memory. To be clear:

FOR i in 1..1000000
LOOP
INSERT INTO foo VALUES (i);
END LOOP;

...runs just fine while

BEGIN
INSERT INTO foo VALUES (1);
INSERT INTO foo VALUES (2);
...
INSERT INTO foo VALUES (1000000);
END;

(for the curious, create a script yourself via:

copy (
  select 'do $$begin create temp table foo(i int);'
  union all
  select format('insert into foo values (%s);', i) from generate_series(1,1000000) i
  union all
  select 'raise notice ''abandon all hope!''; end; $$;'
) to '/tmp/breakit.sql';
)

...will consume an amount of resident memory proportional to the number
of statements and eventually crash the server. The problem is obvious:
each statement causes a plan to get created, and the server gets stuck
in a loop where SPI_freeplan() is called repeatedly. Everything is
working as designed, I guess, but when this happens it's really
unpleasant: the query can be neither cancelled nor terminated, nicht gut.
A pg_ctl kill ABRT <pid> will do the trick, but I was quite astonished
to see Linux take a few minutes to clean up the mess (!) on a somewhat
pokey virtualized server with lots of memory. With even as few as
ten thousand statements, the cleanup time far exceeds the runtime of the
statement block.

I guess the key takeaway here is, "don't do that"; pl/pgsql
aggressively generates plans and turns out to be a poor choice for
bulk loading because of all the plan caching. Having said that, I
can't help but wonder if there should be a (perhaps user-configurable)
limit on the number of SPI plans a single function call is able
to acquire, on the basis that you are otherwise going to smack into
very poor behaviors in the memory subsystem.
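
For what it's worth, a minimal sketch of the "don't do that" alternative,
assuming the same temp table foo(i int) as in the generated script above:
a single set-based INSERT is one statement with one plan, however many
rows it loads. This is only an illustration of the workaround, not a fix
for the underlying behavior.

```sql
-- Sketch only: loads the same million rows with a single statement,
-- so only one plan is created inside the DO block.
DO $$
BEGIN
  CREATE TEMP TABLE foo(i int);

  -- One INSERT, one plan, regardless of the row count.
  INSERT INTO foo
  SELECT g FROM generate_series(1, 1000000) AS g;

  RAISE NOTICE 'loaded % rows', (SELECT count(*) FROM foo);
END;
$$;
```

The FOR loop shown earlier behaves similarly from the plan cache's point
of view, since its one INSERT statement is planned once and reused on
each iteration.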

Stepping back, I can't help but wonder what value all the plan
caching has for statement blocks at all. Loops might be a notable
exception, noted. I'd humbly submit though that (relative to
functions) a statement block is much more likely to be used for
something like running a lot of one-off insert statements, where it
is impossible to utilize any cached plans.

This is not an academic gripe -- I just exploded production :-D.

merlin
