Re: Reference Leak with type

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Rohit Bhogate <rohit(dot)bhogate(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reference Leak with type
Date: 2021-04-10 21:57:43
Message-ID: 2512554.1618091863@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here's a proposed patch for this problem.

The core problem in this test case is that the refcount is logged in the
Portal resowner, which is a child of the initial transaction's resowner,
so it goes away in the COMMIT (after warning of a resource leak); but
the expression tree is still there and still thinks it has a refcount.
By chance a new ResourceOwner is created in the same place where the old
one was, so that when the expression tree is finally destroyed at the
end of the DO block, we see an error about "this refcount isn't logged
here" rather than a crash. Unrelated-looking code changes could turn
that into a real crash, of course.

I spent quite a bit of time fruitlessly trying to fix it by manipulating
which resowner the tupledesc refcount is logged in, specifically by
running plpgsql "simple expressions" with the simple_eval_resowner as
CurrentResourceOwner. But this just causes other problems to appear,
because then that resowner becomes responsible for more stuff than
just the plancache refcounts that plpgsql is expecting it to hold.
Some of that stuff needs to be released at subtransaction abort,
which is problematic because most of what plpgsql wants it to deal
in needs to survive until end of main transaction --- in particular,
the plancache refcounts need to live that long, and so do the tupdesc
refcounts we're concerned with here, because those are associated with
"simple expression" trees that are supposed to have that lifespan.
It's possible that we could make this approach work, but at minimum
it'd require creating and destroying an additional resowner per
subtransaction; and maybe we'd have to give up on sharing "simple
expression" trees across subtransactions. So the potential performance
hit is pretty bad, and I'm not even 100% sure it'd work at all.

So the alternative proposed in the attached is to give up on associating
a long-lived tupdesc refcount with these expression nodes at all.
Intead, we can use a method that plpgsql has been using for a few
years, which is to rely on the fact that typcache entries never go
away once made, and just save a pointer into the typcache. We can
detect possible changes in the cache entry by watching for changes
in its tupDesc_identifier counter.

This infrastructure exists as far back as v11, so using it doesn't
present any problems for back-patchability. It is slightly
nervous-making that we have to change some fields in struct ExprEvalStep
--- but the overall struct size isn't changing, and I can't really
see a reason why extensions would be interested in the contents of
these particular subfield types.

regards, tom lane

Attachment Content-Type Size
fix-tupdesc-reference-leak.patch text/x-diff 17.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2021-04-10 22:03:23 Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Previous Message Tomas Vondra 2021-04-10 21:22:53 Re: pgsql: autovacuum: handle analyze for partitioned tables