Re: Defer selection of asynchronous subplans until the executor initialization stage

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Defer selection of asynchronous subplans until the executor initialization stage
Date: 2021-08-23 09:18:43
Message-ID: CAPmGK15AshMEtgUqs3yZfD5mzW87wyn=_uAXbBeRLG0nod7vHA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 30, 2021 at 1:50 PM Andrey Lepikhov
<a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> I have completely rewritten this patch.
>
> Main idea:
>
> The async_capable field of a plan node inform us that this node could
> work in async mode. Each node sets this field based on its own logic.
> The actual mode of a node is defined by the async_capable of PlanState
> structure. It is made at the executor initialization stage.
> In this patch, only an append node could define async behaviour for its
> subplans.

I finally reviewed the patch. One thing I noticed about the patch is
that it would break ordered Appends. Here is such an example using
the patch:

create table pt (a int) partition by range (a);
create table loct1 (a int);
create table loct2 (a int);
create foreign table p1 partition of pt for values from (10) to (20)
    server loopback1 options (table_name 'loct1');
create foreign table p2 partition of pt for values from (20) to (30)
    server loopback2 options (table_name 'loct2');

explain verbose select * from pt order by a;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Append  (cost=200.00..440.45 rows=5850 width=4)
   ->  Async Foreign Scan on public.p1 pt_1  (cost=100.00..205.60 rows=2925 width=4)
         Output: pt_1.a
         Remote SQL: SELECT a FROM public.loct1 ORDER BY a ASC NULLS LAST
   ->  Async Foreign Scan on public.p2 pt_2  (cost=100.00..205.60 rows=2925 width=4)
         Output: pt_2.a
         Remote SQL: SELECT a FROM public.loct2 ORDER BY a ASC NULLS LAST
(7 rows)

This would not always provide tuples in the required order, because
async execution returns them from the subplans in whatever order they
happen to arrive. I think doing the planning work at execution time
would be not only too late but also inefficient (consider executing
generic plans!), so I think we should avoid doing so. (The cost of
doing that work for simple foreign scans is small, but if we support
async execution for upper plan nodes such as NestLoop as discussed
before, I think the cost for such plan nodes would not be small
anymore.)
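
To illustrate the ordering problem outside the server, here is a rough
Python sketch (not PostgreSQL code). The row values, the arrival_order
list, and both function names are made up for illustration; the only
assumption carried over from the example above is that each subplan
returns its own rows already sorted and that p1's range sorts before
p2's.

```python
from itertools import chain


def sequential_append(subplans):
    """Drain each subplan completely before starting the next one,
    which is what a synchronous ordered Append relies on."""
    return list(chain.from_iterable(subplans))


def async_append(subplans, arrival_order):
    """Emit tuples in the order the subplans happen to deliver them.

    arrival_order holds one subplan index per delivered tuple; it is a
    hypothetical stand-in for network timing.
    """
    iters = [iter(s) for s in subplans]
    return [next(iters[i]) for i in arrival_order]


# Each partition's rows are already sorted by the remote ORDER BY.
p1 = [10, 12, 15]  # values in [10, 20)
p2 = [20, 21, 29]  # values in [20, 30)

seq = sequential_append([p1, p2])
asy = async_append([p1, p2], arrival_order=[1, 0, 0, 1, 1, 0])

print(seq)
print(asy)
```

With this arrival order the sequential result is globally sorted, while
the async result contains the same rows but starts with a row from p2,
so the ORDER BY is not honored.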

To have the executor just execute what was planned, I think we should
return to the patch in [1]. That patch was created for Horiguchi-san’s
async-execution patch, so I modified it to work with HEAD and added a
simplified version of your test cases. Please find the patch attached.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/7fe10f95-ac6c-c81d-a9d3-227493eb9055@postgrespro.ru

Attachment Content-Type Size
allow-async-in-more-cases.patch application/octet-stream 11.0 KB
