Re: Async execution of postgres_fdw.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Async execution of postgres_fdw.
Date: 2015-01-09 08:30:03
Message-ID: 20150109.173003.218333899.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, thank you for the comment.

This is the second version of the patch.

- Refactored to make the code simpler and clearer.
- Added comment about logic outline and struct members.
- Removed trailig white spaces..

- No additional test yet.

======
> warning: 3 lines add whitespace errors.

Whoops. Fixed.

> 2. The patches compile cleanly.
> 3. The regression is clean, even in contrib/postgres_fdw and
> contrib/file_fdw
>
> Tests
> -------
> We need tests to make sure that the logic remains intact even after further
> changes in this area. Couple of tests which require multiple foreign scans
> within the same query fetching rows more than fetch size (100) would be
> required. Also, some DMLs, which involve multiple foreign scans would test
> the sanity when UPDATE/DELETE interleave such scans. By multiple foreign
> scans I mean both multiple scans on a single foreign server and multiple
> scans spread across multiple foreign servers.

Additional tests indeed might be needed. Some of the test related
to this patch are implicitly done in the present regression
tests. But no explicit ones.

fetch_size is currently a bare constant so I think it is not so
necessary to test for other fetch sizes. Even if different size
will potentially cause a problem, it will be found when the
different number is actually applied.

On the current design, async scan is started only on the first
scan on the connection, and if the next scan or modify claims the
same connection, the async state is immediately finished and
behaves as the same as the unpatched version. But since
asynchronous/parallel scan is introduced in any form, such kind
of test seems to be needed.

multi-server tests are not done also in the unpatched version but
there's no difference between multiple foregn servers on the same
remote server and them distributed on multiple remotes. The async
scan of this patch works only on the same foreign server so there
seems to be no need such kind of test. Do you have any specific
concern about this?

After all, I will add some explict tests for async-canceling in
the next patch.

> Code
> -------
> Because previous "conn" is now replaced by "conn->pgconn", the double
> indirection makes the code a bit ugly and prone to segfaults (conn being
> NULL or invalid pointer). Can we minimize such code or wrap it under a
> macro?

Agreed. It was annoyance also for me. I've done the following
things to encapsulate PgFdwConn to some extent in the second
version of this patch. They are described below.

> We need some comments about the structure definition of PgFdwConn and its
> members explaining the purpose of this structure and its members.

Thank you for pointing that. I forgot that. I added simple
comments there.

> Same is the case with enum PgFdwConnState. In fact, the state diagram of a
> connection has become more complicated with the async connections, so it
> might be better to explain that state diagram at one place in the code
> (through comments). The definition of the enum might be a good place to do
> that.

I added a comment describing the and logic and meaning of the
statesjust above the enum declaration.

> Otherwise, the logic of connection maintenance is spread at multiple
> places and is difficult to understand by looking at the code.
>
> In function GetConnection(), at line
> elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
> - entry->conn, server->servername);
> + entry->conn->pgconn, server->servername);

Thank you, I replaced conn's in this form with PFC_PGCONN(conn).

> entry->conn->pgconn may not necessarily be a new connection and may be NULL
> (as the next line check it for being NULL). So, I think this line should be
> moved within the following if block after pgconn has been initialised with
> the new connection.
> + if (entry->conn->pgconn == NULL)
> + {
> + entry->conn->pgconn = connect_pg_server(server, user);
> + entry->conn->nscans = 0;
> + entry->conn->async_state = PGFDW_CONN_IDLE;
> + entry->conn->async_scan = NULL;
> + }
>
> The if condition if (entry->conn == NULL) in GetConnection(), used to track
> whether there is a PGConn active for the given entry, now it tracks whether
> it has PgFdwConn for the same.

After some soncideration, I decided to make PgFdwConn to be
handled more similarly to PGconn. This patch has shrunk as a
result and bacame looks clear.

- Added macros to encapsulate PgFdwConn struct. (One of them is a function)

- Added macros to call PQxxx functions taking PgFdwConn.

- connect_pg_server() returns PgFdwConn.

- connection.c does not touch the inside of PgFdwConn except a
few points. The PgFdwConn's memory is allocated with malloc()
as PGconn and freed by PFCfinish() which is the correspondent
of PQfinish().

As the result of those chagnes, this patch has altered into the
following shape.

- All points where PGconn is used now uses PgFdwConn. They are
seemingly simple replacements.

- The major functional changes are concentrated within
fetch_more_data(), postgreBeginForeignScan(), GetConnection() ,
ReleaseConnection(), and the additional function
finish_async_connection().

> Please see more comments inline.

> > * Outline of this patch
> >
> > From some consideration after the previous discussion and
> > comments from others, I judged the original (WIP) patch was
> > overdone as the first step. So I reduced the patch to minimal
> > function. The new patch does the following,
> >
> > - Wrapping PGconn by PgFdwConn in order to handle multiple scans
> > on one connection.
> >
> > - The core async logic was added in fetch_more_data().
> >
>
> It might help if you can explain this logic in this mail as well as in code
> (as per my comment above).

I wrote the outline of the logic in the comment for enum
PgFdwConnState in postgres_fdw.h. Is it make sense?

> > * Where this patch will be effective.
> >
> > With upcoming inheritance-partition feature, this patch enables
> > stating and running foreign scans asynchronously. It will be more
> > effective for longer TAT or remote startup times, and larger
> > number of foreign servers. No negative performance effect on
> > other situations.
> >
> >
> AFAIU, this logic sends only the first query in asynchronous manner not all
> of them. Is that right? If yes, I think it's a sever limitation of the
> feature. For a query involving multiple foreign scans, only the first one
> will be done in async fashion and not the rest. Sorry, if my understanding
> is wrong.

You're right for the first point. So the domain I think this is
effective is the case of sharding. Each remote server can have
dedicate PGconn connection in the case. Addition to it, the
ongoing FDW Join pushdown should increase the chance for async
execution in this manner. I found that It is difficult to find
the appropriate policy for managing the load on the remote server
when multiple PGconn connection for single remote server, so it
would be the next issue.

> I think, we need some data which shows the speed up by this patch. You may
> construct a case, where a single query involved multiple foreign scans, and
> we can check what is the speed up obtained against the number of foreign
> scans.

Agreed, I'll show you some such figures afterwards.

> > * Concerns about this patch.
> >
> > - This breaks the assumption that scan starts at ExecForeignScan,
> > not ExecInitForeignScan, which might cause some problem.
> >
> >
> This should be fine as long as it doesn't have any side effects like
> sending query during EXPLAIN (which has been taken care of in this patch.)
> Do you think, we need any special handling for PREPAREd statements?

I suppose there's no difference between PREAPREd and
not-PREPAREDd at the level of FDW.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001-Asynchronous-execution-of-postgres_fdw-v2.patch text/x-patch 31.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Abhijit Menon-Sen 2015-01-09 08:32:03 Re: What exactly is our CRC algorithm?
Previous Message Michael Paquier 2015-01-09 08:27:49 Re: TABLESAMPLE patch