Re: plan cache overhead on plpgsql expression

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plan cache overhead on plpgsql expression
Date: 2020-03-26 18:49:47
Message-ID: 20200326184947.vniqm2kqa4hmv56x@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-03-26 14:37:59 -0400, Tom Lane wrote:
> I wrote:
> > I had a thought about a possibly-cleaner way to do this. We could invent
> > a resowner function, say ResourceOwnerReleaseAllPlanCacheRefs, that
> > explicitly releases all plancache pins it knows about. So plpgsql
> > would not call the regular ResourceOwnerRelease entry point at all,
> > but just call that and then ResourceOwnerDelete, again relying on the
> > assertions therein to catch anything not released.
>
> Here's a version that does it like that. This does seem marginally
> nicer than the other way. I have a feeling that at some point we'll
> want to expose resowners' contents more generally, but I'm not quite
> sure what the use-cases will be, so I don't want to design that now.

Yea, agreed with all of what you said in that paragraph.

> Testing that reminded me of the other regression test failure I'd seen
> when I first tried to do it: select_parallel.sql shows a WARNING about
> a plancache leak in a parallel worker process. When I looked into the
> reason for that, it turned out that some cowboy has split
> XACT_EVENT_COMMIT into XACT_EVENT_COMMIT and
> XACT_EVENT_PARALLEL_COMMIT (where the latter is used in parallel
> workers) without bothering to fix the collateral damage to plpgsql.
> So plpgsql_xact_cb isn't doing any cleanup in parallel workers, and
> hasn't been for a couple of releases.

Ugh.

> The bad effects of that are probably limited given that the worker
> process will exit after committing, but I still think that that part
> of this patch is a bug fix that needs to be back-patched.

Ugh. Lucky that we don't register many things inside those resowners.

> (Just
> looking at what FreeExecutorState does, I wonder whether
> jit_release_context has any side-effects that are visible outside the
> process? But I bet I can make a test case that shows issues even
> without JIT, based on the failure to call ExprContext shutdown
> callbacks.)

JIT doesn't currently have side-effects outside of the process. I really
want to add caching support, which'd presumably have problems due to
this, but it's not there yet... This could lead to leaking a fair bit of
memory over time otherwise.

> /*
> + * CachedPlanAllowsSimpleValidityCheck: can we use CachedPlanIsSimplyValid?
> + *
> + * This function, together with CachedPlanIsSimplyValid, provides a fast path
> + * for revalidating "simple" generic plans. The core requirement to be simple
> + * is that the plan must not require taking any locks, which translates to
> + * not touching any tables; this happens to match up well with an important
> + * use-case in PL/pgSQL.

Hm - is there currently sufficient guarantee that we absorb sinval
messages? Would still matter for types, functions, etc?

> /*
> + * ResourceOwnerReleaseAllPlanCacheRefs
> + * Release the plancache references (only) held by this owner.
> + *
> + * We might eventually add similar functions for other resource types,
> + * but for now, only this is needed.
> + */
> +void
> +ResourceOwnerReleaseAllPlanCacheRefs(ResourceOwner owner)
> +{
> + ResourceOwner save;
> + Datum foundres;
> +
> + save = CurrentResourceOwner;
> + CurrentResourceOwner = owner;
> + while (ResourceArrayGetAny(&(owner->planrefarr), &foundres))
> + {
> + CachedPlan *res = (CachedPlan *) DatumGetPointer(foundres);
> +
> + ReleaseCachedPlan(res, true);
> + }
> + CurrentResourceOwner = save;
> +}

While it'd do a small bit unnecessary work, I do wonder if it'd be
better to use this code in ResourceOwnereReleaseInternal().

> --- a/src/pl/plpgsql/src/pl_exec.c
> +++ b/src/pl/plpgsql/src/pl_exec.c
> @@ -84,6 +84,13 @@ typedef struct
> * has its own simple-expression EState, which is cleaned up at exit from
> * plpgsql_inline_handler(). DO blocks still use the simple_econtext_stack,
> * though, so that subxact abort cleanup does the right thing.
> + *
> + * (However, if a DO block executes COMMIT or ROLLBACK, then exec_stmt_commit
> + * or exec_stmt_rollback will unlink it from the DO's simple-expression EState
> + * and create a new shared EState that will be used thenceforth. The original
> + * EState will be cleaned up when we get back to plpgsql_inline_handler. This
> + * is a bit ugly, but it isn't worth doing better, since scenarios like this
> + * can't result in indefinite accumulation of state trees.)
> */
> typedef struct SimpleEcontextStackEntry
> {
> @@ -96,6 +103,15 @@ static EState *shared_simple_eval_estate = NULL;
> static SimpleEcontextStackEntry *simple_econtext_stack = NULL;
>
> /*
> + * In addition to the shared simple-eval EState, we have a shared resource
> + * owner that holds refcounts on the CachedPlans for any "simple" expressions
> + * we have evaluated in the current transaction. This allows us to avoid
> + * continually grabbing and releasing a plan refcount when a simple expression
> + * is used over and over.
> + */
> +static ResourceOwner shared_simple_eval_resowner = NULL;

Perhaps add a reference to the new (appreciated, btw) DO comment above?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bossart, Nathan 2020-03-26 18:50:24 Re: archive status ".ready" files may be created too early
Previous Message Erik Rijkers 2020-03-26 18:41:45 Re: proposal \gcsv