Re: Asynchronous Append on postgres_fdw nodes.

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, "movead(dot)li" <movead(dot)li(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Asynchronous Append on postgres_fdw nodes.
Date: 2021-02-10 10:31:02
Message-ID: CAPmGK15Hvh5x_cOYNjdu3FjY-owxXfv3bdBiL20UPmJs6PAS3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 4, 2021 at 7:21 PM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote:
> On Mon, Feb 1, 2021 at 12:06 PM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote:
> > Rather than doing so, I'd like to propose to allow
> > FDWs to disable async execution of them in problematic cases by
> > themselves during executor startup in the first cut. What I have in
> > mind for that is:
> >
> > 1) For an FDW that has async-capable ForeignScan(s), we allow the FDW
> > to record, for each of the async-capable and non-async-capable
> > ForeignScan(s), the information on a connection to be used for the
> > ForeignScan into EState during BeginForeignScan().
> >
> > 2) After doing ExecProcNode() to each SubPlan and the main query tree
> > in InitPlan(), we give the FDW a chance to a) reconsider, for each of
> > the async-capable ForeignScan(s), whether the ForeignScan can be
> > executed asynchronously as planned, based on the information stored
> > into EState in #1, and then b) disable async execution of the
> > ForeignScan if not.
>
> s/ExecProcNode()/ExecInitNode()/. Sorry for that. I’ll post an
> updated patch for this in a few days.

I created a WIP patch for this. For #2, I added a new callback
routine ReconsiderAsyncForeignScan(). The routine for postgres_fdw
postgresReconsiderAsyncForeignScan() is pretty simple: async execution
of an async-capable ForeignScan is disabled if the connection used for
it is used in other parts of the query plan tree except async subplans
just below the parent Append. Here is a running example:

postgres=# create table t1 (a int, b int, c text);
postgres=# create table t2 (a int, b int, c text);
postgres=# create foreign table p1 (a int, b int, c text) server
server1 options (table_name 't1');
postgres=# create foreign table p2 (a int, b int, c text) server
server2 options (table_name 't2');
postgres=# create table pt (a int, b int, c text) partition by range (a);
postgres=# alter table pt attach partition p1 for values from (10) to (20);
postgres=# alter table pt attach partition p2 for values from (20) to (30);
postgres=# insert into p1 select 10 + i % 10, i, to_char(i, 'FM0000')
from generate_series(0, 99) i;
postgres=# insert into p2 select 20 + i % 10, i, to_char(i, 'FM0000')
from generate_series(0, 99) i;
postgres=# analyze pt;
postgres=# create table loct (a int, b int);
postgres=# create foreign table ft (a int, b int) server server1
options (table_name 'loct');
postgres=# insert into ft select i, i from generate_series(0, 99) i;
postgres=# analyze ft;
postgres=# create view v as select * from ft;

postgres=# explain verbose select * from pt, v where pt.b = v.b and v.b = 99;
QUERY PLAN
-----------------------------------------------------------------------------------------
Nested Loop (cost=200.00..306.84 rows=2 width=21)
Output: pt.a, pt.b, pt.c, ft.a, ft.b
-> Foreign Scan on public.ft (cost=100.00..102.27 rows=1 width=8)
Output: ft.a, ft.b
Remote SQL: SELECT a, b FROM public.loct WHERE ((b = 99))
-> Append (cost=100.00..204.55 rows=2 width=13)
-> Foreign Scan on public.p1 pt_1 (cost=100.00..102.27
rows=1 width=13)
Output: pt_1.a, pt_1.b, pt_1.c
Remote SQL: SELECT a, b, c FROM public.t1 WHERE ((b = 99))
-> Async Foreign Scan on public.p2 pt_2
(cost=100.00..102.27 rows=1 width=13)
Output: pt_2.a, pt_2.b, pt_2.c
Remote SQL: SELECT a, b, c FROM public.t2 WHERE ((b = 99))
(12 rows)

For this query, while p2 is executed asynchronously, p1 isn’t as it
uses the same connection with ft. BUT:

postgres=# create role view_owner SUPERUSER;
postgres=# create user mapping for view_owner server server1;
postgres=# alter view v owner to view_owner;

postgres=# explain verbose select * from pt, v where pt.b = v.b and v.b = 99;
QUERY PLAN
-----------------------------------------------------------------------------------------
Nested Loop (cost=200.00..306.84 rows=2 width=21)
Output: pt.a, pt.b, pt.c, ft.a, ft.b
-> Foreign Scan on public.ft (cost=100.00..102.27 rows=1 width=8)
Output: ft.a, ft.b
Remote SQL: SELECT a, b FROM public.loct WHERE ((b = 99))
-> Append (cost=100.00..204.55 rows=2 width=13)
-> Async Foreign Scan on public.p1 pt_1
(cost=100.00..102.27 rows=1 width=13)
Output: pt_1.a, pt_1.b, pt_1.c
Remote SQL: SELECT a, b, c FROM public.t1 WHERE ((b = 99))
-> Async Foreign Scan on public.p2 pt_2
(cost=100.00..102.27 rows=1 width=13)
Output: pt_2.a, pt_2.b, pt_2.c
Remote SQL: SELECT a, b, c FROM public.t2 WHERE ((b = 99))
(12 rows)

in this setup, p1 is executed asynchronously as ft doesn’t use the
same connection with p1.

I added to postgresReconsiderAsyncForeignScan() this as well: even if
the connection isn’t used in the other parts, async execution of an
async-capable ForeignScan is disabled if the subplans of the Append
are all async-capable, and they use the same connection, because in
that case the subplans won’t be parallelized at all, and the overhead
of async execution may cause a performance degradation.

Attached is an updated version of the patch. Sorry for the delay.

Best regards,
Etsuro Fujita

Attachment Content-Type Size
async-wip-2021-02-10.patch application/octet-stream 85.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hou, Zhijie 2021-02-10 10:46:11 RE: Parallel INSERT (INTO ... SELECT ...)
Previous Message Peter Smith 2021-02-10 10:29:10 Re: [HACKERS] logical decoding of two-phase transactions