Re: calling procedures is slow and consumes extra much memory against calling function

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: calling procedures is slow and consumes extra much memory against calling function
Date: 2020-09-28 01:04:06
Message-ID: 193996.1601255046@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> I am sending another patch that tries to allow CachedPlans for CALL
> statements. I think this patch is very accurate, but it is not nice,
> because it is smudging very precious reference counting for CachedPlans.

I spent some time testing this. Although the #1 patch gets rid of
the major memory leak of cached plans, the original test case still
shows a pretty substantial leak across repeated executions of a CALL.
The reason is that the stanza for rebuilding stmt->target also gets
executed each time through, and that leaks not only the relatively
small PLpgSQL_row datum but also a bunch of catalog lookup cruft
created on the way to building the datum. Basically this code forgot
that plpgsql's outer execution layer can't assume that it's running
in a short-lived context.

I attach a revised #1 that takes care of that problem, and also
cleans up what seems to me to be pretty sloppy thinking in both
the original code and Pavel's #1 patch: we should be restoring
the previous value of expr->plan, not cavalierly assuming that
it was necessarily NULL. I didn't care for looking at the plan's
"saved" field to decide what was happening, either. We really
should have a local flag variable clearly defining which behavior
it is that we're implementing.

With this patch, I see zero memory bloat on Pavel's original example,
even with a much larger repeat count.

I don't like much of anything about plpgsql-stmt_call-fix-2.patch.
It feels confused and poorly documented, possibly because "fragile"
is not a very clear term for whatever property it is you're trying to
attribute to plans. But in any case, I think it's fixing the problem
in the wrong place. I think the right way to fix it probably is to
manage a CALL's saved plan the same as every other plpgsql plan,
but arrange for the transient refcount on that plan to be held by a
ResourceOwner that is not a child of any transaction resowner, but
rather belongs to the procedure's execution and will be released on
the way out of the procedure.

In any case, I doubt we'd risk back-patching either the #2 patch
or any other approach to avoiding the repeat planning. We need a
back-patchable fix that at least tamps down the memory bloat,
and this seems like it'd do.

regards, tom lane

Attachment Content-Type Size
v2-plpgsql-stmt_call-fix-1.patch text/x-diff 6.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-09-28 01:15:31 Re: Partition prune with stable Expr
Previous Message Andy Fan 2020-09-28 00:54:17 Re: Partition prune with stable Expr