Re: Introducing coarse grain parallelism by postgres_fdw.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-07-28 09:15:45
Message-ID: 20140728.181545.225059735.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello, thank you for the comment.

> Hi Kyotaro,
> fetch_more_rows() always runs "FETCH 100 FROM <cursor_name>" on the foreign
> server to get the next set of rows. The changes you have made seem to run
> only the first FETCHes from all the nodes but not the subsequent ones. The
> optimization will be helpful only when there are less than 100 rows per
> postgres connection in the query. If there are more than 100 rows from a
> single foreign server, the second onwards FETCHes will be serialized.
>
> Is my understanding correct?

Yes, you're right. That is why I wrote the following.

Me> it almost halves the response time because the remote queries
Me> take far longer startup time than running time.

Parallelizing all FETCHes would be effective if the connection
transfers bytes at a speed near the row fetch speed, but I
excluded that case, both because I assumed it comes up too rarely
to be worth the gain and to keep the PoC simple. If there are no
objections to this approach, I will work on a more complete
implementation, including cost estimation.
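
A rough sketch of this trade-off, as a timing model rather than code
from the patch. The function names, the cost parameters and the
millisecond values are all hypothetical; the model only assumes that
the first FETCH on each server pays the remote startup cost and that
every later FETCH is serialized.

```python
# Hypothetical timing model, not code from the patch.
# All times are in milliseconds and are made-up illustration values.

FETCH_SIZE = 100  # rows per "FETCH 100 FROM <cursor_name>"


def fetches_needed(rows, fetch_size=FETCH_SIZE):
    # Assumption: a scan stops at the first FETCH that returns fewer
    # than fetch_size rows, so an exact multiple costs one extra FETCH.
    return rows // fetch_size + 1


def estimate_ms(rows_per_server, startup_ms, later_fetch_ms, parallel_first_fetch):
    # The first FETCH on each server pays the remote query's startup cost.
    if parallel_first_fetch:
        # Startups overlap, so only one startup period is waited for.
        first = startup_ms if rows_per_server else 0
    else:
        first = startup_ms * len(rows_per_server)
    # Second and later FETCHes are serialized in both cases.
    later = sum((fetches_needed(r) - 1) * later_fetch_ms for r in rows_per_server)
    return first + later


small = [50, 50]        # fewer than 100 rows per server
large = [10000, 10000]  # many FETCHes per server
print(estimate_ms(small, 1000, 10, False), estimate_ms(small, 1000, 10, True))
print(estimate_ms(large, 1000, 10, False), estimate_ms(large, 1000, 10, True))
```

With these made-up numbers the small case drops from 2000 to 1000,
which is the "almost halves" situation, while the large case only
drops from 4000 to 3000 because the later FETCHes still run one
after another.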

> On Fri, Jul 25, 2014 at 2:05 PM, Kyotaro HORIGUCHI <
> horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
> > Hello,
> >
> > I noticed that postgres_fdw can run in parallel with a very small
> > change. The attached patch lets scans by postgres_fdw on
> > different foreign servers run simultaneously. This seems a
> > convenient entry point to parallel execution.
> >
> > For the testing configuration which the attached sql script makes,
> > it almost halves the response time because the remote queries
> > take far longer startup time than running time. The three foreign
> > tables fvs1, fvs2 and fvs1_2 are defined on the same table, but
> > fvs1 and fvs1_2 are on the same foreign server pgs1 and fvs2 is
> > on another foreign server pgs2.

--
Kyotaro Horiguchi
NTT Open Source Software Center
