Re: Async execution of postgres_fdw.

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Async execution of postgres_fdw.
Date: 2015-01-12 09:36:20
Message-ID: CAFjFpRf7iXYENyUfMFJXfPB4v=yTw2J3ea2=yTXfG4=+E-QS5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 9, 2015 at 2:00 PM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hello, thank you for the comment.
>
> This is the second version of the patch.
>
> - Refactored to make the code simpler and clearer.
> - Added comment about logic outline and struct members.
> - Removed trailig white spaces..
>
> - No additional test yet.
>
>
> ======
> > warning: 3 lines add whitespace errors.
>
> Whoops. Fixed.
>
> > 2. The patches compile cleanly.
> > 3. The regression is clean, even in contrib/postgres_fdw and
> > contrib/file_fdw
> >
> > Tests
> > -------
> > We need tests to make sure that the logic remains intact even after
> further
> > changes in this area. Couple of tests which require multiple foreign
> scans
> > within the same query fetching rows more than fetch size (100) would be
> > required. Also, some DMLs, which involve multiple foreign scans would
> test
> > the sanity when UPDATE/DELETE interleave such scans. By multiple foreign
> > scans I mean both multiple scans on a single foreign server and multiple
> > scans spread across multiple foreign servers.
>
> Additional tests indeed might be needed. Some of the test related
> to this patch are implicitly done in the present regression
> tests. But no explicit ones.
>
> fetch_size is currently a bare constant so I think it is not so
> necessary to test for other fetch sizes. Even if different size
> will potentially cause a problem, it will be found when the
> different number is actually applied.
>
> On the current design, async scan is started only on the first
> scan on the connection, and if the next scan or modify claims the
> same connection, the async state is immediately finished and
> behaves as the same as the unpatched version. But since
> asynchronous/parallel scan is introduced in any form, such kind
> of test seems to be needed.
>
> multi-server tests are not done also in the unpatched version but
> there's no difference between multiple foregn servers on the same
> remote server and them distributed on multiple remotes. The async
> scan of this patch works only on the same foreign server so there
> seems to be no need such kind of test. Do you have any specific
> concern about this?
>
> After all, I will add some explict tests for async-canceling in
> the next patch.
>
> > Code
> > -------
> > Because previous "conn" is now replaced by "conn->pgconn", the double
> > indirection makes the code a bit ugly and prone to segfaults (conn being
> > NULL or invalid pointer). Can we minimize such code or wrap it under a
> > macro?
>
> Agreed. It was annoyance also for me. I've done the following
> things to encapsulate PgFdwConn to some extent in the second
> version of this patch. They are described below.
>

Looks better.

>
> > We need some comments about the structure definition of PgFdwConn and its
> > members explaining the purpose of this structure and its members.
>
> Thank you for pointing that. I forgot that. I added simple
> comments there.
>
> > Same is the case with enum PgFdwConnState. In fact, the state diagram of
> a
> > connection has become more complicated with the async connections, so it
> > might be better to explain that state diagram at one place in the code
> > (through comments). The definition of the enum might be a good place to
> do
> > that.
>
> I added a comment describing the and logic and meaning of the
> statesjust above the enum declaration.
>

This needs to be clarified further. But that can wait till we finalise the
approach and the patch. Esp. comment below is confusing
1487 + * PGFDW_CONN_SYNC_RUNNING is rather an internal state in
1488 + * fetch_more_data(). It indicates that the function shouldn't send
the next
1489 + * fetch requst after getting the result.

I couldn't get the meaning of the second sentence, esp. it's connection
with synchronous-ness.

> Otherwise, the logic of connection maintenance is spread at multiple
> > places and is difficult to understand by looking at the code.
> >
> > In function GetConnection(), at line
> > elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
> > - entry->conn, server->servername);
> > + entry->conn->pgconn, server->servername);
>
> Thank you, I replaced conn's in this form with PFC_PGCONN(conn).
>

This looks better.

>
> > entry->conn->pgconn may not necessarily be a new connection and may be
> NULL
> > (as the next line check it for being NULL). So, I think this line should
> be
> > moved within the following if block after pgconn has been initialised
> with
> > the new connection.
> > + if (entry->conn->pgconn == NULL)
> > + {
> > + entry->conn->pgconn = connect_pg_server(server, user);
> > + entry->conn->nscans = 0;
> > + entry->conn->async_state = PGFDW_CONN_IDLE;
> > + entry->conn->async_scan = NULL;
> > + }
> >
> > The if condition if (entry->conn == NULL) in GetConnection(), used to
> track
> > whether there is a PGConn active for the given entry, now it tracks
> whether
> > it has PgFdwConn for the same.
>
> After some soncideration, I decided to make PgFdwConn to be
> handled more similarly to PGconn. This patch has shrunk as a
> result and bacame looks clear.
>

I think it's still prone to segfaults considering two pointer indirections.

>
> - Added macros to encapsulate PgFdwConn struct. (One of them is a function)
>
> - Added macros to call PQxxx functions taking PgFdwConn.
>
> - connect_pg_server() returns PgFdwConn.
>
> - connection.c does not touch the inside of PgFdwConn except a
> few points. The PgFdwConn's memory is allocated with malloc()
> as PGconn and freed by PFCfinish() which is the correspondent
> of PQfinish().
>
> As the result of those chagnes, this patch has altered into the
> following shape.
>
> - All points where PGconn is used now uses PgFdwConn. They are
> seemingly simple replacements.
>
> - The major functional changes are concentrated within
> fetch_more_data(), postgreBeginForeignScan(), GetConnection() ,
> ReleaseConnection(), and the additional function
> finish_async_connection().
>
> > Please see more comments inline.
>
> > > * Outline of this patch
> > >
> > > From some consideration after the previous discussion and
> > > comments from others, I judged the original (WIP) patch was
> > > overdone as the first step. So I reduced the patch to minimal
> > > function. The new patch does the following,
> > >
> > > - Wrapping PGconn by PgFdwConn in order to handle multiple scans
> > > on one connection.
> > >
> > > - The core async logic was added in fetch_more_data().
> > >
> >
> > It might help if you can explain this logic in this mail as well as in
> code
> > (as per my comment above).
>
> I wrote the outline of the logic in the comment for enum
> PgFdwConnState in postgres_fdw.h. Is it make sense?
>

The point about two different ForeignScan nodes using the same connection
needs some clarification, I guess. It's not very clear, why would there be
more queries run on the same connection. I know why this happens, but it's
important to mention it somewhere. If it's already mentioned somewhere in
the file, sorry for not paying attention to that.

>
> > > * Where this patch will be effective.
> > >
> > > With upcoming inheritance-partition feature, this patch enables
> > > stating and running foreign scans asynchronously. It will be more
> > > effective for longer TAT or remote startup times, and larger
> > > number of foreign servers. No negative performance effect on
> > > other situations.
> > >
> > >
> > AFAIU, this logic sends only the first query in asynchronous manner not
> all
> > of them. Is that right? If yes, I think it's a sever limitation of the
> > feature. For a query involving multiple foreign scans, only the first one
> > will be done in async fashion and not the rest. Sorry, if my
> understanding
> > is wrong.
>
> You're right for the first point. So the domain I think this is
> effective is the case of sharding. Each remote server can have
> dedicate PGconn connection in the case. Addition to it, the
> ongoing FDW Join pushdown should increase the chance for async
> execution in this manner. I found that It is difficult to find
> the appropriate policy for managing the load on the remote server
> when multiple PGconn connection for single remote server, so it
> would be the next issue.
>

I think there is more chance that there will more than one ForeignScan
nodes trying interact with a foreign server, even after the push-down work.
The current solution doesn't address that. We actually need parallel
querying in two cases
1. Querying multiple servers in parallel
2. Querying same server (by two querists) in parallel within the same query
e.g. an un-pushable join.

We need a solution which is work in both the cases.

Is it possible to use the parallel query infrastructure being built by
Robert or to do something like parallel seq scan? That will work, not just
for Postgres FDW but all the FDWs.

>
> > I think, we need some data which shows the speed up by this patch. You
> may
> > construct a case, where a single query involved multiple foreign scans,
> and
> > we can check what is the speed up obtained against the number of foreign
> > scans.
>
> Agreed, I'll show you some such figures afterwards.
>
> > > * Concerns about this patch.
> > >
> > > - This breaks the assumption that scan starts at ExecForeignScan,
> > > not ExecInitForeignScan, which might cause some problem.
> > >
> > >
> > This should be fine as long as it doesn't have any side effects like
> > sending query during EXPLAIN (which has been taken care of in this
> patch.)
> > Do you think, we need any special handling for PREPAREd statements?
>
> I suppose there's no difference between PREAPREd and
> not-PREPAREDd at the level of FDW.
>

In case of Prepared statements, ExecInit is called at the end of planning,
without subsequent execution like the case of EXPLAIN. I see that the patch
handles EXPLAIN well, but I didn't see any specific code for PREPARE.

>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Voronin 2015-01-12 11:27:29 ereport bug
Previous Message Jim Nasby 2015-01-12 08:17:12 Re: Parallel Seq Scan