Re: Enabling parallelism for queries coming from SQL or other PL functions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling parallelism for queries coming from SQL or other PL functions
Date: 2017-03-15 15:25:20
Message-ID: CA+TgmoYx9nP=LERwi+nxb02pe=dEkT_XZQytg6-PAFgH-nAg8g@mail.gmail.com

On Fri, Mar 10, 2017 at 7:08 AM, Rafia Sabih
<rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
> I wanted to clarify a few things here. I noticed that the call to ExecutorRun
> in postquel_getnext() uses !es->lazyEval as execute_once. This is confusing,
> because lazyEval is true even when a simple query like
> "select count(*) from t" is used in a SQL function, which restricts
> parallelism in cases where it shouldn't. It seems to me that es->lazyEval is
> either not set properly or should not be true for simple select statements.
> In the definition of execution_state I found
>     bool lazyEval; /* true if should fetch one row at a time */
> and in init_execution_state there is a comment saying
>     * Mark the last canSetTag query as delivering the function result; then,
>     * if it is a plain SELECT, mark it for lazy evaluation. If it's not a
>     * SELECT we must always run it to completion.
>
> I find these two things contradictory. So, is this point missed, or is
> there some deep reasoning behind it?

I don't understand what you think is contradictory. I think the idea
is that if it's not a SELECT, we have to run it to completion because
it might have side effects, but if it is a SELECT, we assume (granted,
it might be wrong) that there are no side effects, and therefore we
can just run it until it produces the number of rows of output that we
need.

Note this:

    if (completed || !fcache->returnsSet)
        postquel_end(es);

When the SQL function doesn't return a set, then we can allow
parallelism even when lazyEval is set, because we'll only call
ExecutorStart() once. But my impression is that something like this:

    SELECT * FROM blah() LIMIT 3;

...will trigger three separate calls to ExecutorRun(), which is a
problem if the plan is a parallel plan.

I have not verified this; the above thoughts are just based on code-reading.
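
To make the condition concrete, here is a minimal sketch of the rule I'm describing. It is not the actual functions.c code: SketchState, sketch_execute_once() and sketch_self_check() are hypothetical names invented for illustration, and the two booleans merely stand in for es->lazyEval and fcache->returnsSet. The assumption is that a parallel plan is only safe when the executor will be run once, which holds either when the query is not lazily evaluated or when the function does not return a set.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the two flags discussed above.  These are not
 * the real PostgreSQL structs, only a model of the decision.
 */
typedef struct SketchState
{
    bool lazy_eval;    /* models es->lazyEval: fetch one row at a time */
    bool returns_set;  /* models fcache->returnsSet */
} SketchState;

/*
 * Returns true when the query would be run only once, which is the case in
 * which a parallel plan looks safe under the reasoning above.
 */
static bool
sketch_execute_once(const SketchState *st)
{
    return !st->lazy_eval || !st->returns_set;
}

/* Walks through the three interesting combinations. */
static void
sketch_self_check(void)
{
    SketchState scalar_lazy = { .lazy_eval = true, .returns_set = false };
    SketchState set_lazy = { .lazy_eval = true, .returns_set = true };
    SketchState set_eager = { .lazy_eval = false, .returns_set = true };

    assert(sketch_execute_once(&scalar_lazy));  /* non-SRF: run once */
    assert(!sketch_execute_once(&set_lazy));    /* lazy SRF: may be re-run */
    assert(sketch_execute_once(&set_eager));    /* run to completion */
}
```

Only the lazily evaluated, set-returning case is excluded, which is the "SELECT * FROM blah() LIMIT 3" situation above.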

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
