Re: Asynchronous Append on postgres_fdw nodes.

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, "movead(dot)li" <movead(dot)li(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Asynchronous Append on postgres_fdw nodes.
Date: 2020-12-12 10:06:51
Message-ID: CAPmGK14O0ZwVkKPWRM828-De4KZ_SaZdMBJm9_KdM=T5O7xhMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 20, 2020 at 8:16 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
> I looked through the nodeAppend.c and postgres_fdw.c part and those
> are I think the core of this patch.

Thanks again for the review!

> - * figure out which subplan we are currently processing
> + * try to get a tuple from async subplans
> + */
> + if (!bms_is_empty(node->as_needrequest) ||
> + (node->as_syncdone && !bms_is_empty(node->as_asyncpending)))
> + {
> + if (ExecAppendAsyncGetNext(node, &result))
> + return result;
>
> The function ExecAppendAsyncGetNext() is a function called only here,
> and contains only 31 lines. It doesn't seem to me that the separation
> makes the code more readable.

Considering the original ExecAppend() is about 50 lines long, 31 lines
of code would not be small. So I'd vote for separating it into
another function as proposed.

> - /* choose new subplan; if none, we're done */
> - if (!node->choose_next_subplan(node))
> + /* wait or poll async events */
> + if (!bms_is_empty(node->as_asyncpending))
> + {
> + Assert(!node->as_syncdone);
> + Assert(bms_is_empty(node->as_needrequest));
> + ExecAppendAsyncEventWait(node);
>
> You moved the function to wait for events from execAsync to
> nodeAppend. The former is a generic module that can be used from any
> kind of executor nodes, but the latter is specialized for nodeAppend.
> In other words, the abstraction level is lowered here. What is the
> reason for the change?

The reason is just because that function is only called from
ExecAppend(). I put some functions only called from nodeAppend.c in
execAsync.c, though.

> + /* Perform the actual callback. */
> + ExecAsyncRequest(areq);
> + if (ExecAppendAsyncResponse(areq))
> + {
> + Assert(!TupIsNull(areq->result));
> + *result = areq->result;
>
> Putting aside the name of the functions, the first two function are
> used only this way at only two places. ExecAsyncRequest(areq) tells
> fdw to store the first tuple among the already received ones to areq,
> and ExecAppendAsyncResponse(areq) is checking the result is actually
> set. Finally the result is retrieved directory from areq->result.
> What is the reason that the two functions are separately exists?

I think that when an async-aware node gets a tuple from an
async-capable node, they should use ExecAsyncRequest() /
ExecAyncHogeResponse() rather than ExecProcNode() [1]. I modified the
patch so that ExecAppendAsyncResponse() is called from Append, but to
support bubbling up the plan tree discussed in [2], I think it should
be called from ForeignScans (the sides of async-capable nodes). Am I
right? Anyway, I’ll rename ExecAppendAyncResponse() to the one
proposed in Robert’s original patch.

> + /* Perform the actual callback. */
> + ExecAsyncNotify(areq);
>
> Mmm. The usage of the function (or its name) looks completely reverse
> to me. I think FDW should NOTIFY to exec nodes that the new tuple
> gets available but the reverse is nonsense. What the function is
> actually doing is to REQUEST fdw to fetch tuples that are expected to
> have arrived, which is different from what the name suggests.

As mentioned in a previous email, this is an FDW callback routine
revived from Robert’s patch. I think the naming is reasonable,
because the callback routine notifies the FDW of readiness of a file
descriptor. And actually, the callback routine tells the core whether
the corresponding ForeignScan node is ready for a new request or not,
by setting the callback_pending flag accordingly.

> postgres_fdw.c
>
> > postgresIterateForeignScan(ForeignScanState *node)
> > {
> > PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
> > TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
> >
> > /*
> > * If this is the first call after Begin or ReScan, we need to create the
> > * cursor on the remote side.
> > */
> > if (!fsstate->cursor_exists)
> > create_cursor(node);
>
> With the patch, cursors are also created in another place so at least
> the comment is wrong.

Good catch! Will fix.

> That being said, I think we should unify the
> code except the differences between async and sync. For example, if
> the fetch_more_data_begin() needs to be called only for async
> fetching, the cursor should be created before calling the function, in
> the code path common with sync fetching.

I think that that would make the code easier to understand, but I’m
not 100% sure we really need to do so.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/CA%2BTgmoYrbgTBnLwnr1v%3Dpk%2BC%3DznWg7AgV9%3DM9ehrq6TDexPQNw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CA%2BTgmoZSWnhy%3DSB3ggQcB6EqKxzbNeNn%3DEfwARnCS5tyhhBNcw%40mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Lukas Meisegeier 2020-12-12 11:52:12 Re: Feature Proposal: Add ssltermination parameter for SNI-based LoadBalancing
Previous Message Etsuro Fujita 2020-12-12 09:25:57 Re: Asynchronous Append on postgres_fdw nodes.