Re: Enabling parallelism for queries coming from SQL or other PL functions

From: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling parallelism for queries coming from SQL or other PL functions
Date: 2017-03-07 14:07:47
Message-ID: CAOGQiiN4dLZOkrjP2Pta6Kw0wE0jYZCFZ5nsXWbRWoUX5gjyPw@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 26, 2017 at 7:09 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> I think I see the problem that you're trying to solve, but I agree
> that this doesn't seem all that elegant. The reason why we have that
> numberTuples check is because we're afraid that we might be in a
> context like the extended-query protocol, where the caller can ask for
> 1 tuple, and then later ask for another tuple. That won't work,
> because once we shut down the workers we can't reliably generate the
> rest of the query results. However, I think it would probably work
> fine to let somebody ask for less than the full number of tuples if
> it's certain that they won't later ask for any more.
>
> So maybe what we ought to do is allow CURSOR_OPT_PARALLEL_OK to be set
> any time we know that ExecutorRun() will be called for the QueryDesc
> at most once rather than (as at present) only where we know it will be
> executed only once with a tuple-count of zero. Then we could change
> things in ExecutePlan so that it doesn't disable parallel query when
> the tuple-count is non-zero, but does take an extra argument "bool
> execute_only_once", and it disables parallel execution if that is not
> true. Also, if ExecutorRun() is called a second time for the same
> QueryDesc when execute_only_once is specified as true, it should
> elog(ERROR, ...). Then exec_execute_message(), for example, can pass
> that argument as false when the tuple-count is non-zero, but other
> places that are going to fetch a limited number of rows could pass it
> as true even though they also pass a row-count.
>
> I'm not sure if that's exactly right, but something along those lines
> seems like it should work.
>

IIUC, this needs an additional bool execute_once in the queryDesc, which is
set to true in standard_ExecutorRun when the query is detected to be coming
from a PL function or when the provided count is zero, i.e. execute till the
end. If execute_once is already true at that point, an error is reported.

>
> I think that a final patch for this functionality should involve
> adding CURSOR_OPT_PARALLEL_OK to appropriate places in each PL, plus
> maybe some infrastructure changes like the ones mentioned above.
> Maybe it can be divided into two patches, one to make the
> infrastructure changes and a second to add CURSOR_OPT_PARALLEL_OK to
> more places.
>

I have split the patch into two. The first
(pl_parallel_opt_support_v1.patch) allows the optimiser to select a
parallel plan for queries in PL functions by passing
CURSOR_OPT_PARALLEL_OK at the required places.

The second (pl_parallel_exec_support_v1.patch) allows such queries to be
executed in parallel mode, and involves infrastructural changes along the
lines mentioned upthread.

--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

Attachment                          Content-Type              Size
pl_parallel_exec_support_v1.patch   application/octet-stream  3.8 KB
pl_parallel_opt_support_v1.patch    application/octet-stream  2.2 KB
