Re: Problem with tupdesc in jsonb_to_recordset

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Problem with tupdesc in jsonb_to_recordset
Date: 2018-07-13 17:22:00
Message-ID: 1602.1531502520@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> What's happening in the original case is this: the SRF call protocol
> says that it's the executor's responsibility to free rsi.setDesc if it's
> not refcounted, under the assumption that in such a case it's in
> per-query memory (and not in either a shorter-lived or longer-lived
> context).
> The change to update_cached_tupdesc broke the protocol by causing
> populate_recordset_worker to set rsi.setDesc to a non-refcounted tupdesc
> allocated in a context that's not the per-query context.

Right, and more to the point, a non-refcounted tupdesc that the function
will use again if it's called again.

> What's not clear here is why populate_recordset_record thinks it needs
> to update the tupdesc for every record in a single result set. The
> entire result set will ultimately be treated as having the same tupdesc
> regardless, so any per-record changes will break things later anyway.

It's just convenient to check it there; it's not really expecting the
tupdesc to change intra-query.

The core of the problem here is that populate_record() is recursive,
in cases with nested composite types. So we might be dealing either
with the outer composite type that is also going to be the returned
tuple type, or with some inner composite type that won't be. It's
the desire to avoid repeated tupdesc lookups for the latter case that
motivates wanting to keep a tupdesc inside the fn_extra cache structure.
Otherwise we could do one lookup_rowtype_tupdesc per SRF call cycle and
pass back the result as rsi.setDesc, without caching it.

> My first approach - assuming that update_cached_tupdesc has good reason
> to make a copy, which I'm not convinced is the case - would be to simply
> make a per-query-context copy of the tupdesc to assign to rsi.setDesc in
> order to conform to the call protocol.

The alternative to making a copy would be finding a way to guarantee that
we decrement the refcount of the result of lookup_rowtype_tupdesc at end
of query (or whenever the fn_extra cache is discarded), rather than
immediately. That seems like a mess.

I agree that making a copy to return as rsi.setDesc is the simplest and
safest fix. If somebody produces a test case showing that that adds
unacceptable overhead, we could think about ways to improve it later;
but it's going to be more complicated and probably more fragile.

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2018-07-13 20:54:15 Re: cannot restore schema with is not distinct from on hstore since PG 9.6.8
Previous Message Tom Lane 2018-07-13 16:10:57 Re: Fwd: SQL

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2018-07-13 17:27:51 Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Previous Message Heikki Linnakangas 2018-07-13 16:57:34 Re: Generating partitioning tuple conversion maps faster