Re: Asynchronous Append on postgres_fdw nodes.

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, "movead(dot)li" <movead(dot)li(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Asynchronous Append on postgres_fdw nodes.
Date: 2020-11-17 09:56:02
Message-ID: CAPmGK14xrGe+Xks7+fVLBoUUbKwcDkT9km1oFXhdY+FFhbMjUg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 5, 2020 at 3:35 PM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote:
> Yes, if there are no objections from you or Thomas or Robert or anyone
> else, I'll update Robert's patch as such.

Here is a new version of the patch (as promised at the developer
unconference at PostgresConf.CN & PGConf.Asia 2020):

* In Robert's patch [1] (and Horiguchi-san's, which was based on
Robert's), ExecAppend() was modified to retrieve tuples from
async-aware children *before* the tuples are needed, but I don't
think that's really a good idea, because the query might complete
before those tuples are ever returned. So I modified that function so
that a tuple is retrieved from an async-aware child *when* it is
needed, as in Thomas' patch. I used the FDW callback functions proposed
by Robert, but introduced another FDW callback function,
ForeignAsyncBegin(), which each async-aware child uses to start an
asynchronous data fetch at the first call to ExecAppend() after
ExecInitAppend() or ExecReScanAppend().

* For EvalPlanQual, I modified the patch so that async-aware children
are treated as if they were synchronous when executing EvalPlanQual.

* In Robert's patch, all async-aware children below Append nodes in
the query that were waiting for events to occur were managed by a
single EState, but I modified the patch so that such children are
managed by their parent Append node, as in Horiguchi-san's patch and
Thomas'.

* In Robert's patch, the FDW callback function
ForeignAsyncConfigureWait() allowed multiple events to be configured,
but I limited that function to only allow a single event to be
configured, just for simplicity.

* I haven't yet incorporated some of the planner/resowner changes from
Horiguchi-san's patch.

* I haven't yet done anything about the issue, discussed in [2], with
postgres_fdw's handling of concurrent data fetches by multiple
ForeignScan nodes (below *different* Append nodes in the query) using
the same connection. For now, the patch just disables this feature for
the problematic test cases in the postgres_fdw regression tests, using
a new GUC, enable_async_append.
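
To make the first item above concrete, here is a toy model (plain
Python, not executor code) of the "start the fetch at the first
ExecAppend() call, retrieve a tuple only when it is needed" behavior.
Everything in it is a hypothetical stand-in for illustration: the
AsyncChild and Append classes, begin(), ready(), next_row() and
exec_append() are not names from the patch; begin() only plays the role
that ForeignAsyncBegin() has in the description, and the real patch
waits on events rather than polling children in order.

```python
from collections import deque


class AsyncChild:
    """Toy stand-in for an async-aware child of Append (hypothetical)."""

    def __init__(self, name, rows):
        self.name = name
        self._remote = deque(rows)  # rows still sitting on the "remote" side
        self.requested = False      # has an asynchronous fetch been started?
        self.fetched = 0            # rows actually retrieved so far

    def begin(self):
        # Plays the role of ForeignAsyncBegin(): start a fetch, retrieve nothing.
        self.requested = True

    def ready(self):
        # A child can hand over a row only after its fetch has been started.
        return self.requested and bool(self._remote)

    def next_row(self):
        self.fetched += 1
        return self._remote.popleft()


class Append:
    """Toy Append node that manages its own async-aware children."""

    def __init__(self, children):
        self.children = list(children)
        self.started = False

    def rescan(self):
        # After a rescan, the next exec_append() call starts the fetches again.
        self.started = False

    def exec_append(self):
        """Return one row, or None once every child is exhausted."""
        if not self.started:
            # First call after init/rescan: start every child's fetch.
            for child in self.children:
                child.begin()
            self.started = True
        # Retrieve a row only now, when the caller actually needs one.
        for child in self.children:
            if child.ready():
                return child.next_row()
        return None
```

If the caller stops after the first row, only one row has been
retrieved in total, even though a fetch was started on every child;
that is the difference from retrieving tuples ahead of need.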

Comments welcome! The attached patch is still WIP, though, and I may
be missing something.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAPmGK16E1erFV9STg8yokoewY6E-zEJtLzHUJcQx%2B3dyivCT%3DA%40mail.gmail.com

Attachment Content-Type Size
async-wip-2020-11-17.patch application/octet-stream 78.8 KB
