Re: FDW and parallel execution

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: k(dot)knizhnik(at)postgrespro(dot)ru
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW and parallel execution
Date: 2017-04-13 07:49:49
Message-ID: 20170413.164949.16305379.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Sorry for the too-brief reply.

At Tue, 11 Apr 2017 20:08:46 +0300, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> wrote in <94c8692a-f299-b72b-6227-270b8a9ed7ad(at)postgrespro(dot)ru>
>
> On 04.04.2017 13:29, Kyotaro HORIGUCHI wrote:
> > Hi,
> >
> > At Sun, 02 Apr 2017 16:30:24 +0300, Konstantin Knizhnik
> > <k(dot)knizhnik(at)postgrespro(dot)ru> wrote in <58E0FCF0(dot)2070603(at)postgrespro(dot)ru>
> >> My FDW provides implementation for IsForeignScanParallelSafe which
> >> returns true.
> >> I wonder what can prevent optimizer from using parallel plan in this
> >> case?
> > Parallel execution requires partial paths. It's the work for
> > GetForeignPaths of your FDW.
>
> Thank you very much for explanation.
> But unfortunately I still do not completely understand what kind of
> queries allow parallel execution with FDW.

At Tue, 11 Apr 2017 19:20:04 +0200, PostgreSQL - Hans-Jürgen Schönig <postgres(at)cybertec(dot)at> wrote in <0c9c101d-0fbb-1e19-f04c-7a6ec577d960(at)cybertec(dot)at>
> did you check out antonin houska's patches?
> we basically got code, which can do that.

Parallel aggregation is already available. Antonin's patch is
partition-wise aggregation, which boosts the case where the
partition key is the aggregation key, I suppose. Parallel
aggregation seems to be considered whenever an appropriate
partial path is available. (I haven't tried anything, though.)

set_plain_rel_pathlist() does the work for plain relations, so
what we should do in GetForeignPaths would be as follows.

- Check rel->consider_parallel (this won't be required, since the
FDW already knows that) and rel->lateral_relids.

- If parallel is OK, create a path with create_foreignscan_path
in the ordinary way, then change some parallel-related members
as necessary.

- Like create_plain_partial_paths(), check certain conditions and
finally add_partial_path() the created partial foreign scan path.

I haven't really done this, so I might be wrong.
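
To make the flow concrete, here is a rough, self-contained model of
those three steps. It is only a sketch: the struct and function names
echo the planner's (RelOptInfo, Path, partial_pathlist), but they are
simplified stand-ins I made up for illustration, not the real
definitions from the PostgreSQL headers. has_lateral_refs stands in
for a non-empty rel->lateral_relids, and dividing rows and cost by the
worker count is a crude assumption, not the planner's actual costing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PATHS 8

/* Stand-in for the planner's Path; only the fields discussed here. */
typedef struct Path
{
    double rows;
    double total_cost;
    bool   parallel_aware;
    bool   parallel_safe;
    int    parallel_workers;
} Path;

/* Stand-in for RelOptInfo with fixed-size arrays instead of Lists. */
typedef struct RelOptInfo
{
    bool   consider_parallel;
    bool   has_lateral_refs;    /* models lateral_relids being non-empty */
    double rows;
    Path   pathlist[MAX_PATHS];
    int    npaths;
    Path   partial_pathlist[MAX_PATHS];
    int    npartial;
} RelOptInfo;

/* Models building an ordinary (serial) foreign scan path. */
Path
model_create_foreignscan_path(const RelOptInfo *rel, double rows,
                              double total_cost)
{
    Path path;

    path.rows = rows;
    path.total_cost = total_cost;
    path.parallel_aware = false;
    path.parallel_safe = rel->consider_parallel;
    path.parallel_workers = 0;
    return path;
}

/*
 * Models the GetForeignPaths steps: always add the serial path, then add
 * a partial path only if the relation allows it. Returns true when a
 * partial path was added.
 */
bool
model_get_foreign_paths(RelOptInfo *rel, double total_cost, int workers)
{
    Path partial;

    assert(rel != NULL);

    /* The ordinary path is needed regardless of parallelism. */
    if (rel->npaths < MAX_PATHS)
        rel->pathlist[rel->npaths++] =
            model_create_foreignscan_path(rel, rel->rows, total_cost);

    /* Step 1: give up on parallelism if the relation does not allow it. */
    if (!rel->consider_parallel || rel->has_lateral_refs || workers <= 0)
        return false;
    if (rel->npartial >= MAX_PATHS)
        return false;

    /* Step 2: build a path the ordinary way, then adjust parallel members. */
    partial = model_create_foreignscan_path(rel, rel->rows / workers,
                                            total_cost / workers);
    partial.parallel_aware = true;
    partial.parallel_workers = workers;

    /* Step 3: register it as a partial path. */
    rel->partial_pathlist[rel->npartial++] = partial;
    return true;
}
```

In a real FDW the last step would be a call to add_partial_path() on
the path returned by create_foreignscan_path(); the point of the model
is only that the partial path goes into a separate list from the
ordinary one, and that upper planning steps look at that list.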

> Section "FDW Routines for Parallel Execution" of FDW specification
> says:
> > A ForeignScan node can, optionally, support parallel execution. A
> > parallel ForeignScan will be executed in multiple processes and should
> > return each row only once across all cooperating processes. To do
> > this, processes can coordinate through fixed size chunks of dynamic
> > shared memory. This shared memory is not guaranteed to be mapped at
> > the same address in every process, so pointers may not be used. The
> > following callbacks are all optional in general, but required if
> > parallel execution is to be supported.
>
> I provided IsForeignScanParallelSafe, EstimateDSMForeignScan,
> InitializeDSMForeignScan and InitializeWorkerForeignScan in my FDW.
> IsForeignScanParallelSafe returns true.
> Also in GetForeignPaths function I created path with
> baserel->consider_parallel == true.
> Is it enough or I should do something else?

You need to create partial paths, I think. create_grouping_paths()
requires a non-empty partial_pathlist in input_rel before it
considers parallel aggregation.

That section explains the FDW routines provided specifically for
parallel execution, but it doesn't seem to describe how to get a
parallel plan as a whole.

> But unfortunately I failed to find any query: sequential scan, grand
> aggregation, aggregation with group by, joins... when parallel
> execution plan is used for this FDW.
> Also there are no examples of using this functions in Postgres
> distributive and I failed to find any such examples in Internet.

Maybe you're the pioneer in this area.

> Can somebody please clarify my situation with parallel execution and
> FDW and may be point at some examples?
> Thank in advance.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
