Curing plpgsql's memory leaks for statement-lifespan values

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Curing plpgsql's memory leaks for statement-lifespan values
Date: 2016-07-21 23:02:32
Message-ID: 17863.1469142152@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In
https://www.postgresql.org/message-id/tencent_5C738ECA65BAD6861AA43E8F@qq.com
it was pointed out that you could get an intra-function-call memory leak
from something like

LOOP
BEGIN
EXECUTE 'bogus command';
EXCEPTION WHEN OTHERS THEN
END;
END LOOP;

The problem is that exec_stmt_dynexecute() loses control to the error
thrown from the bogus command, and therefore leaks its "querystr" local
variable --- which is not much of a leak, but it adds up if the loop
iterates enough times. There are similar problems in many other places in
plpgsql. Basically the issue is that while running a plpgsql function,
CurrentMemoryContext points to a function-lifespan context (the same as
the SPI procCxt the function is using). We also store things such as
values of the function's variables there, so just resetting that context
is not an option. plpgsql does have an expression-evaluation-lifespan
context for short-lived stuff, but anything that needs to live for more
or less the duration of a statement is put into the procedure-lifespan
context, where it risks becoming a leak. (That design was fine
originally, because any error would result in abandoning function
execution and thereby cleaning up that context. But once we invented
plpgsql exception blocks, it's not so good.)

One way we could resolve the problem is to require all plpgsql code to
use PG_TRY/PG_CATCH blocks to ensure that statement-lifespan variables
are explicitly released. That's undesirable on pretty much every front
though: it'd be notationally ugly, prone to omissions, and not very
speedy.

Another answer is to invent a third per-function memory context intended
to hold statement-lifespan variables. We could say that statements are
still expected to clean up their mess in normal cases, and this context is
only cleared when BEGIN/EXCEPT catches an error. Each BEGIN/EXCEPT would
need to create and install a new such context, since otherwise it might be
clobbering statement-lifespan variables from a surrounding statement.
A more aggressive variant is to arrange things so we can automatically
reset that context after each statement, avoiding the need for explicit
pfree's of statement-lifespan variables. I think that would mean that
not only BEGINs but also looping statements would need to install new
contexts for their controlled statements, unless they had no
pass-by-reference local values. The traffic for creating and deleting
such contexts might be too expensive, even though it'd buy back some
retail pfree's.

I looked into whether we could actually install such a context as
CurrentMemoryContext, allowing palloc's to go there by default. This
seems not workable: a lot of the existing palloc's really need to be
creating function-lifespan values, and there's also a problem that SPI
expects CurrentMemoryContext to be the same as its procCxt, because many
SPI functions will just set CurrentMemoryContext to equal procCxt at exit.
So in any case, use of such a context would require explicit
MemoryContextAllocs or MemoryContextSwitchTos, which is a bit annoying
notationally, but there seems little help for it.

Thoughts, better ideas?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-07-21 23:27:57 Re: Password identifiers, protocol aging and SCRAM protocol
Previous Message David Fetter 2016-07-21 21:57:33 Re: PoC: Make it possible to disallow WHERE-less UPDATE and DELETE