Re: Asynchronous execution on FDW

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Cc: hlinnaka(at)iki(dot)fi
Subject: Re: Asynchronous execution on FDW
Date: 2015-07-10 07:32:31
Message-ID: 20150710.163231.255890296.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, This is the new version of this patch.

At Tue, 07 Jul 2015 10:19:35 +0900, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote
> This is rather a makeshift solution for the problem, but
> considering current trend of parallelism, it might the time to
> make the executor to fit parallel execution.
>
> If it is acceptable to reconstruct the executor nodes to have
> additional return state PREP_RUN or such (which means it needs
> one more call for the first tuple) , I'll modify the whole
> executor to handle the state in the next patch to do so.

I made a patchset to do this. The details of it and some examples
are shown after the summary below.

- I provided an infrastructure for asynchronous (simultaneous)
execution of multiple execnodes belonging one node, like joins.

- It (should) have addressed the "parameterized plan" problem.

- The infrastructure is a bit intrusive but simple, and it will
be usable by any nodes that supports asynchronous execution
(none so far except fdw, needs some modification in core). So
the async exec for Postgres-FDW now became an exapmle for the
infrastructure. It might be nice to start backend worker for
promising async resuest for a sort node.

- The postgres_fdw part is almost the same as the previous one.

The detailed explanation of the patchset follows.

============

I made a patchset to do this. It consists of five patches (plus
one for debug message).

1. Add infrastructure for run-state of executor node.

Currently executor nodes have binary run-states, one is
!TupIsNull(slot) which indicates that the next tuple may come
from the node, and the another is TupIsNull(slot) which
indicates that no more tuple will be come.

This patch expands it to four-state and have the value in
PlanState struct.

Inited : it is just after initialized.

Started: it is startd execution but no tuple retrieved. This
could be skipped.

Running: it is returning tuples.

Done : it has no more tuple to return. This is equivalent to
TupIsNull(slot).

The nodes Group, ModifyTable, SetOp and WindowAgg had their own
state flag replaceable by the new states in their own *State
part so they are moved to this new state set in this patch. This
patch does not change the current behavior.

2. Change all tuple-returning execnodes to maintain the new
run-state appropriately.

The rest nodes are modified by this patch to maintain the state
to be consistent with the TupIsNull() state at the ExecProcNode
level. This patch does not change the current behavior, too. (I
feel that the state Done would be no other than an encumbrance
in maintenance. The state is not referred in nowhere)

3. Add a feature to start node asynchronously.

All nodes that have more than one child node can execute the
children asynchronously by this patch. It tries start children
asynchronously if the state is "Inited" when entering Exec*
functions. Async request for nodes which has just one child is
simply propagated to the child, and leaf nodes such as scans
will decide whether to be async or not. Currently no leaf node
can be async except postgres_fdw.

NestLoop may run parameterized plan so it is specially treated
in StartNestLoop so that parameterized plans will not be
asynchronously started.

In StartHashJoin, whether the inner (hash) node is executed or
not is judged by the similar logic with ExecHashJoin.

Even after this patch applied, no leaf node can start
asynchronously so the behavior of the executor still be
unchanged.

4. Add StartForeignScan to FdwRoutine

Add new entry function to accept the asynchronous execution
request from the core.

5. Allow asynchronous remote query of postgres_fdw.

This is almost the same as the previous version. Except that it
runs on the new infrastructure, and added new server/foreign
table option allow_async.

The first foreign scan on the same server will be asynchronously
started execution if requested. And apart from the async start,
every successive fetches for the same foreign scan will be
asynchronously fetched.

Currently there's no means to observe what it is doing from
outside, so the additional sixth patch is to output debug
messages about asynchronous execution.

However, currently it is no test code for that but I'm at a loss
what to do as the test..

FWIW I provided two exaples of running asynchronous exexution.

regards,

===== Example
CREATE SERVER sv1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'postgres');
CREATE SERVER sv2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'postgres');
CREATE USER MAPPING FOR CURRENT_USER SERVER sv1;
CREATE USER MAPPING FOR CURRENT_USER SERVER sv2;
CREATE TABLE lp (a int, b int);
CREATE TABLE lt1 () INHERITS (lp);
CREATE TABLE lt2 () INHERITS (lp);
CREATE TABLE lt3 () INHERITS (lp);
CREATE TABLE lt4 () INHERITS (lp);
CREATE TABLE fp (LIKE lp);
CREATE FOREIGN TABLE ft1 () INHERITS (fp) SERVER sv1 OPTIONS (table_name 'lt1');
CREATE FOREIGN TABLE ft2 () INHERITS (fp) SERVER sv1 OPTIONS (table_name 'lt1');
CREATE FOREIGN TABLE ft3 () INHERITS (fp) SERVER sv2 OPTIONS (table_name 'lt1');
CREATE FOREIGN TABLE ft4 () INHERITS (fp) SERVER sv2 OPTIONS (table_name 'lt1');
INSERT INTO lt1 (SELECT a, a FROM generate_series(0, 999) a);
INSERT INTO lt2 (SELECT a+1000, a FROM generate_series(0, 999) a);
INSERT INTO lt3 (SELECT a+2000, a FROM generate_series(0, 999) a);
INSERT INTO lt4 (SELECT a+3000, a FROM generate_series(0, 999) a);

;; TEST FOR SIMPLE APPEND
=# SELECT * FROM fp;
1 LOG: pg_fdw: [ft1/sv1/0x293a580] Async exec started.
2 LOG: pg_fdw: [ft2/sv1/0x293a580] Async exec denied.
3 LOG: pg_fdw: [ft3/sv2/0x2898c70] Async exec started.
4 LOG: pg_fdw: [ft4/sv2/0x2898c70] Async exec denied.
5 LOG: pg_fdw: [ft1/sv1/0x293a580] Async fetch
....
6 LOG: pg_fdw: [ft1/sv1/0x293a580] Async fetch
7 LOG: pg_fdw: [ft2/sv1/0x293a580] Sync fetch.
8 LOG: pg_fdw: [ft2/sv1/0x293a580] Async fetch
...
9 LOG: pg_fdw: [ft2/sv1/0x293a580] Async fetch
10 LOG: pg_fdw: [ft3/sv2/0x2898c70] Async fetch
....
11 LOG: pg_fdw: [ft3/sv2/0x2898c70] Async fetch
12 LOG: pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.
14 LOG: pg_fdw: [ft4/sv2/0x2898c70] Async fetch
...
15 LOG: pg_fdw: [ft4/sv2/0x2898c70] Async fetch

;; The notation inside the square bracket is
;; <table name>/<server name>/<ponter of connection>.
;;
;; 1-4 foreign servers denied async for the second scan for each (ft2/ft4).
;;
;; At 7, reading different table from 6 made it sync fetch but
;; the successive fetches afterward are async.
;;
;; ft2 and ft3 was on different server so 10 is async fetch for
;; the query executed asynchronously at 3.
;;
;; At 12 the same thing to 7 occurred.

;; TEST FOR PARAMETERIZED NESTLOOP
=# SET enable_hashjoin TO false;
=# SET enable_mergejoin TO false;
=# SET enable_material TO false;
=# ALTER FOREIGN TABLE ft4 OPTIONS (ADD use_remote_estimate 'true');
=# SELECT ft4.a FROM ft1 JOIN ft4 ON ft1.b = ft4.b WHERE ft1.a BETWEEN 800 AND 1000;
1 LOG: pg_fdw: [ft1/sv1/0x293a580] Async exec started.
2 LOG: pg_fdw: [ft1/sv1/0x293a580] Async fetch
3 LOG: pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.
4 LOG: pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.
...
5 LOG: pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.
6 LOG: pg_fdw: [ft1/sv1/0x293a580] Async fetch
7 LOG: pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.
...
8 LOG: pg_fdw: [ft4/sv2/0x2898c70] Sync fetch.
9 LOG: pg_fdw: [ft1/sv1/0x293a580] Async fetch

;; ft4 did not even try to async since the inner(ft4) is parameterized.
;; All fetches for inner(ft4) was executed synchronously.
;;
;; Meanwhile, ft1 was continuously reading asynchronously.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001-Add-infrastructure-for-executor-node-run-state.patch text/x-patch 28.1 KB
0002-Change-all-tuple-returning-execution-nodes-to-mainta.patch text/x-patch 39.1 KB
0003-Add-a-feature-to-start-node-asynchronously.patch text/x-patch 32.7 KB
0004-Add-StartForeignScan-to-FdwRoutine.patch text/x-patch 3.6 KB
0005-Allow-asynchronous-remote-query-of-postgres_fdw.patch text/x-patch 41.1 KB
0006-Debug-message-for-async-execution-of-postgres_fdw.patch text/x-patch 2.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2015-07-10 08:30:31 Re: Asynchronous execution on FDW
Previous Message Sawada Masahiko 2015-07-10 05:41:50 Re: Freeze avoidance of very large table.