Re: Async execution of postgres_fdw.

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Async execution of postgres_fdw.
Date: 2014-12-18 06:10:22
Message-ID: CAFjFpRdgY0gF72k7vPm0s1LU7J8V1W9T8X-aFpEY3WT81dSmQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Horiguchi-san,
Here are my comments for the patches together

Sanity
1. The patch applies cleanly but has trailing white spaces.
[ashutosh(at)ubuntu coderoot]git apply
/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch
/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:41: trailing whitespace.
entry->conn =
/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:44: trailing whitespace.

/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:611: trailing
whitespace.

warning: 3 lines add whitespace errors.

2. The patches compile cleanly.
3. The regression is clean, even in contrib/postgres_fdw and
contrib/file_fdw

Tests
-------
We need tests to make sure that the logic remains intact even after further
changes in this area. Couple of tests which require multiple foreign scans
within the same query fetching rows more than fetch size (100) would be
required. Also, some DMLs, which involve multiple foreign scans would test
the sanity when UPDATE/DELETE interleave such scans. By multiple foreign
scans I mean both multiple scans on a single foreign server and multiple
scans spread across multiple foreign servers.

Code
-------
Because previous "conn" is now replaced by "conn->pgconn", the double
indirection makes the code a bit ugly and prone to segfaults (conn being
NULL or invalid pointer). Can we minimize such code or wrap it under a
macro?

We need some comments about the structure definition of PgFdwConn and its
members explaining the purpose of this structure and its members.

Same is the case with enum PgFdwConnState. In fact, the state diagram of a
connection has become more complicated with the async connections, so it
might be better to explain that state diagram at one place in the code
(through comments). The definition of the enum might be a good place to do
that. Otherwise, the logic of connection maintenance is spread at multiple
places and is difficult to understand by looking at the code.

In function GetConnection(), at line
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
- entry->conn, server->servername);
+ entry->conn->pgconn, server->servername);

entry->conn->pgconn may not necessarily be a new connection and may be NULL
(as the next line check it for being NULL). So, I think this line should be
moved within the following if block after pgconn has been initialised with
the new connection.
+ if (entry->conn->pgconn == NULL)
+ {
+ entry->conn->pgconn = connect_pg_server(server, user);
+ entry->conn->nscans = 0;
+ entry->conn->async_state = PGFDW_CONN_IDLE;
+ entry->conn->async_scan = NULL;
+ }

The if condition if (entry->conn == NULL) in GetConnection(), used to track
whether there is a PGConn active for the given entry, now it tracks whether
it has PgFdwConn for the same.

Please see more comments inline.

On Mon, Dec 15, 2014 at 2:39 PM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

>
>
> * Outline of this patch
>
> From some consideration after the previous discussion and
> comments from others, I judged the original (WIP) patch was
> overdone as the first step. So I reduced the patch to minimal
> function. The new patch does the following,
>
> - Wrapping PGconn by PgFdwConn in order to handle multiple scans
> on one connection.
>
> - The core async logic was added in fetch_more_data().
>

It might help if you can explain this logic in this mail as well as in code
(as per my comment above).

>
> - Invoking remote commands asynchronously in ExecInitForeignScan.
>

> - Canceling async invocation if any other foreign scans,
> modifies, deletes use the same connection.
>

> Cancellation is done by immediately fetching the return of
> already-invoked acync command.
>

> * Where this patch will be effective.
>
> With upcoming inheritance-partition feature, this patch enables
> stating and running foreign scans asynchronously. It will be more
> effective for longer TAT or remote startup times, and larger
> number of foreign servers. No negative performance effect on
> other situations.
>
>
AFAIU, this logic sends only the first query in asynchronous manner not all
of them. Is that right? If yes, I think it's a sever limitation of the
feature. For a query involving multiple foreign scans, only the first one
will be done in async fashion and not the rest. Sorry, if my understanding
is wrong.

I think, we need some data which shows the speed up by this patch. You may
construct a case, where a single query involved multiple foreign scans, and
we can check what is the speed up obtained against the number of foreign
scans.

> * Concerns about this patch.
>
> - This breaks the assumption that scan starts at ExecForeignScan,
> not ExecInitForeignScan, which might cause some problem.
>
>
This should be fine as long as it doesn't have any side effects like
sending query during EXPLAIN (which has been taken care of in this patch.)
Do you think, we need any special handling for PREPAREd statements?

> - error reporting code in do_sql_command is quite ugly..
>
>
>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-12-18 07:46:57 Re: pg_regress writes into source tree
Previous Message Michael Paquier 2014-12-18 05:21:14 Re: [REVIEW] Re: Compression of full-page-writes