Quick Links

Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	jeremy(at)musicsmith(dot)net
Cc:	pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject:	Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan
Date:	2020-03-22 03:23:03
Message-ID:	1741.1584847383@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs pgsql-hackers

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> [ $SUBJECT ]

I got around to looking at this today, and what I find is that the
problem is that exec_stmt_return_query() uses a portal (i.e. a cursor)
to read the results of the query. That seemed like a good idea, back
in the late bronze age, because it allowed plpgsql to fetch the query
results a few rows at a time and not risk blowing out memory with a huge
SPI result. However, the parallel-query infrastructure refuses to
parallelize when the query is being read via a cursor.

I think that the latter restriction is probably sane, because we don't
want to suspend execution of a parallel query while we've got worker
processes waiting. And there might be some implementation restrictions
lurking under it too --- that's not a part of the code I know in any
detail.

However, there's no fundamental reason why exec_stmt_return_query has
to use a cursor. It's going to run the query to completion immediately
anyway, and shove all the result rows into a tuplestore. What we lack
is a way to get the SPI query to pass its results directly to a
tuplestore, without the SPITupleTable intermediary. (Note that the
tuplestore can spill a large result to disk, whereas SPITupleTable
can't do that.)

So, attached is a draft patch to enable that. By getting rid of the
intermediate SPITupleTable, this should improve the performance of
RETURN QUERY somewhat even without considering the possibility of
parallelizing the source query. I've not tried to measure that though.
I've also not looked for other places that could use this new
infrastructure, but there may well be some.

One thing I'm not totally pleased about with this is adding another
SPI interface routine using the old parameter-values API (that is,
null flags as char ' '/'n'). That was the path of least resistance
given the other moving parts in pl_exec.c and spi.c, but maybe we
should try to modernize that before we set it in stone.

Another thing standing between this patch and committability is suitable
additions to the SPI documentation. But I saw no value in writing that
before the previous point is settled.

I will go add this to the next commitfest (for v14), but I wonder
if we should try to squeeze it into v13? This isn't the only
complaint we've gotten about non-parallelizability of RETURN QUERY.

regards, tom lane

Attachment	Content-Type	Size
plpgsql-return-query-results-directly-1.patch	text/x-diff	19.3 KB

In response to

BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan at 2019-10-04 20:20:32 from PG Bug reporting form

Responses

Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan at 2020-03-22 06:48:14 from Pavel Stehule
Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan at 2020-06-09 00:11:43 from Tom Lane
Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan at 2020-06-12 18:13:11 from Robert Haas

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	Pavel Stehule	2020-03-22 06:48:14	Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan
Previous Message	David G. Johnston	2020-03-20 20:09:23	Re: BUG #16309: Postgres's ISO 8601 date output is not compliant

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Justin Pryzby	2020-03-22 04:53:05	Re: kill_prior_tuple and index scan costing
Previous Message	Andres Freund	2020-03-22 02:33:02	Re: kill_prior_tuple and index scan costing