Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Date: 2015-03-04 02:41:58
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8010BB365@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Obviously FDW can add multiple paths at a time, like GetForeignPaths,
> so IMO it should be renamed to GetForeignJoinPaths, with plural form.
>
> In addition to that, new member of RelOptInfo, fdw_handler, should be
> initialized explicitly in build_simple_rel.
>
> Please see attached a patch for these changes.
>
Thanks for your checks. Yep, the name of FDW handler should be ...Paths(),
instead of Path().

The attached one integrates Hanada-san's updates.
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

> -----Original Message-----
> From: Shigeru Hanada [mailto:shigeru(dot)hanada(at)gmail(dot)com]
> Sent: Tuesday, March 03, 2015 9:26 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Robert Haas; Tom Lane; pgsql-hackers(at)postgreSQL(dot)org
> Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom
> Plan API)
>
> Kaigai-san,
>
> The v6 patch was cleanly applied on master branch. I'll rebase my
> patch onto it, but before that I have a comment about name of the new
> FDW API handler GetForeignJoinPath.
>
> Obviously FDW can add multiple paths at a time, like GetForeignPaths,
> so IMO it should be renamed to GetForeignJoinPaths, with plural form.
>
> In addition to that, new member of RelOptInfo, fdw_handler, should be
> initialized explicitly in build_simple_rel.
>
> Please see attached a patch for these changes.
>
> I'll review the v6 path afterwards.
>
>
> 2015-03-03 20:20 GMT+09:00 Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>:
> > Sorry, I misoperated on patch creation.
> > Attached one is the correct version.
> > --
> > NEC OSS Promotion Center / PG-Strom Project
> > KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> >
> >
> >> -----Original Message-----
> >> From: pgsql-hackers-owner(at)postgresql(dot)org
> >> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> >> Sent: Tuesday, March 03, 2015 6:31 PM
> >> To: Kaigai Kouhei(海外 浩平); Robert Haas
> >> Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> >> Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
> >>
> >> The attached version of custom/foreign-join interface patch
> >> fixes up the problem reported on the join-pushdown support
> >> thread.
> >>
> >> The previous version referenced *_ps_tlist on setrefs.c, to
> >> check whether the Custom/ForeignScan node is associated with
> >> a particular base relation, or not.
> >> This logic considered above nodes performs base relation scan,
> >> if *_ps_tlist is valid. However, it was incorrect in case when
> >> underlying pseudo-scan relation has empty targetlist.
> >> Instead of the previous logic, it shall be revised to check
> >> scanrelid itself. If zero, it means Custom/ForeignScan node is
> >> not associated with a particular base relation, thus, its slot
> >> descriptor for scan shall be constructed based on *_ps_tlist.
> >>
> >>
> >> Also, I noticed a potential problem if CSP/FDW driver want to
> >> displays expression nodes using deparse_expression() but
> >> varnode within this expression does not appear in the *_ps_tlist.
> >> For example, a remote query below shall return rows with two
> >> columns.
> >>
> >> SELECT atext, btext FROM tbl_a, tbl_b WHERE aid = bid;
> >>
> >> Thus, ForeignScan will perform like as a scan on relation with
> >> two columns, and FDW driver will set two TargetEntry on the
> >> fdw_ps_tlist. If FDW is designed to keep the join condition
> >> (aid = bid) using expression node form, it is expected to be
> >> saved on custom/fdw_expr variable, then setrefs.c rewrites the
> >> varnode according to *_ps_tlist.
> >> It means, we also have to add *_ps_tlist both of "aid" and "bid"
> >> to avoid failure on variable lookup. However, these additional
> >> entries changes the definition of the slot descriptor.
> >> So, I adjusted ExecInitForeignScan and ExecInitCustomScan to
> >> use ExecCleanTypeFromTL(), not ExecTypeFromTL(), when it construct
> >> the slot descriptor based on the *_ps_tlist.
> >> It expects CSP/FDW drivers to add target-entries with resjunk=true,
> >> if it wants to have additional entries for variable lookups on
> >> EXPLAIN command.
> >>
> >> Fortunately or unfortunately, postgres_fdw keeps its remote query
> >> in cstring form, so it does not need to add junk entries on the
> >> fdw_ps_tlist.
> >>
> >> Thanks,
> >> --
> >> NEC OSS Promotion Center / PG-Strom Project
> >> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> >>
> >>
> >> > -----Original Message-----
> >> > From: pgsql-hackers-owner(at)postgresql(dot)org
> >> > [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> >> > Sent: Sunday, February 15, 2015 11:01 PM
> >> > To: Kaigai Kouhei(海外 浩平); Robert Haas
> >> > Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> >> > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
> >> >
> >> > The attached patch is a rebased version of join replacement with
> >> > foreign-/custom-scan. Here is no feature updates at this moment
> >> > but SGML documentation is added (according to Michael's comment).
> >> >
> >> > This infrastructure allows foreign-data-wrapper and custom-scan-
> >> > provider to add alternative scan paths towards relations join.
> >> > From viewpoint of the executor, it looks like a scan on a pseudo-
> >> > relation that is materialized from multiple relations, even though
> >> > FDW/CSP internally processes relations join with their own logic.
> >> >
> >> > Its basic idea is, (1) scanrelid==0 indicates this foreign/custom
> >> > scan node runs on a pseudo relation and (2) fdw_ps_tlist and
> >> > custom_ps_tlist introduce the definition of the pseudo relation,
> >> > because it is not associated with a tangible relation unlike
> >> > simple scan case, thus planner cannot know the expected record
> >> > type to be returned without these additional information.
> >> > These two enhancement enables extensions to process relations
> >> > join internally, and to perform as like existing scan node from
> >> > viewpoint of the core backend.
> >> >
> >> > Also, as an aside. I had a discussion with Hanada-san about this
> >> > interface off-list. He had an idea to keep create_plan_recurse()
> >> > static, using a special list field in CustomPath structure to
> >> > chain underlying Path node. If core backend translate the Path
> >> > node to Plan node if valid list given, extension does not need to
> >> > call create_plan_recurse() by itself.
> >> > I have no preference about this. Does anybody have opinion?
> >> >
> >> > Thanks,
> >> > --
> >> > NEC OSS Promotion Center / PG-Strom Project
> >> > KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> >> >
> >> >
> >> > > -----Original Message-----
> >> > > From: pgsql-hackers-owner(at)postgresql(dot)org
> >> > > [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> >> > > Sent: Thursday, January 15, 2015 8:03 AM
> >> > > To: Robert Haas
> >> > > Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> >> > > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan
> >> > > API)
> >> > >
> >> > > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> >> > > wrote:
> >> > > > > When custom-scan node replaced a join-plan, it shall have at least
> >> > > > > two child plan-nodes. The callback handler of PlanCustomPath needs
> >> > > > > to be able to call create_plan_recurse() to transform the underlying
> >> > > > > path-nodes to plan-nodes, because this custom-scan node may take
> >> > > > > other built-in scan or sub-join nodes as its inner/outer input.
> >> > > > > In case of FDW, it shall kick any underlying scan relations to
> >> > > > > remote side, thus we may not expect ForeignScan has underlying plans...
> >> > > >
> >> > > > Do you have an example of this?
> >> > > >
> >> > > Yes, even though full code set is too large for patch submission...
> >> > >
> >> > > https://github.com/pg-strom/devel/blob/master/src/gpuhashjoin.c#L1880
> >> > >
> >> > > This create_gpuhashjoin_plan() is PlanCustomPath callback of GpuHashJoin.
> >> > > It takes GpuHashJoinPath inherited from CustomPath that has multiple
> >> > > underlying scan/join paths.
> >> > > Once it is called back from the backend, it also calls create_plan_recurse()
> >> > > to make inner/outer plan nodes according to the paths.
> >> > >
> >> > > In the result, we can see the following query execution plan that CustomScan
> >> > > takes underlying scan plans.
> >> > >
> >> > > postgres=# EXPLAIN SELECT * FROM t0 NATURAL JOIN t1 NATURAL JOIN t2;
> >> > > QUERY PLAN
> >> > > ----------------------------------------------------------------------
> >> > > ------------
> >> > > Custom Scan (GpuHashJoin) (cost=2968.00..140120.31 rows=3970922
> >> > > width=143)
> >> > > Hash clause 1: (aid = aid)
> >> > > Hash clause 2: (bid = bid)
> >> > > Bulkload: On
> >> > > -> Custom Scan (GpuScan) on t0 (cost=500.00..57643.00 rows=4000009
> >> > > width=77)
> >> > > -> Custom Scan (MultiHash) (cost=734.00..734.00 rows=40000
> >> > > width=37)
> >> > > hash keys: aid
> >> > > nBatches: 1 Buckets: 46000 Memory Usage: 99.99%
> >> > > -> Seq Scan on t1 (cost=0.00..734.00 rows=40000 width=37)
> >> > > -> Custom Scan (MultiHash) (cost=734.00..734.00 rows=40000
> >> > > width=37)
> >> > > hash keys: bid
> >> > > nBatches: 1 Buckets: 46000 Memory Usage: 49.99%
> >> > > -> Seq Scan on t2 (cost=0.00..734.00 rows=40000
> >> > > width=37)
> >> > > (13 rows)
> >> > >
> >> > > Thanks,
> >> > > --
> >> > > NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei
> >> > > <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> >> > >
> >> > >
> >> > > > -----Original Message-----
> >> > > > From: Robert Haas [mailto:robertmhaas(at)gmail(dot)com]
> >> > > > Sent: Thursday, January 15, 2015 2:07 AM
> >> > > > To: Kaigai Kouhei(海外 浩平)
> >> > > > Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> >> > > > Subject: ##freemail## Re: Custom/Foreign-Join-APIs (Re: [HACKERS]
> >> > > > [v9.5] Custom Plan API)
> >> > > >
> >> > > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> >> > > wrote:
> >> > > > > When custom-scan node replaced a join-plan, it shall have at least
> >> > > > > two child plan-nodes. The callback handler of PlanCustomPath needs
> >> > > > > to be able to call create_plan_recurse() to transform the underlying
> >> > > > > path-nodes to plan-nodes, because this custom-scan node may take
> >> > > > > other built-in scan or sub-join nodes as its inner/outer input.
> >> > > > > In case of FDW, it shall kick any underlying scan relations to
> >> > > > > remote side, thus we may not expect ForeignScan has underlying plans...
> >> > > >
> >> > > > Do you have an example of this?
> >> > > >
> >> > > > --
> >> > > > Robert Haas
> >> > > > EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
> >> > > > Company
> >> > >
> >> > > --
> >> > > Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org) To make
> >> > > changes to your subscription:
> >> > > http://www.postgresql.org/mailpref/pgsql-hackers
>
>
>
> --
> Shigeru HANADA

Attachment Content-Type Size
pgsql-v9.5-custom-join.v7.patch application/octet-stream 44.3 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2015-03-04 02:48:15 Re: INSERT ... ON CONFLICT UPDATE and logical decoding
Previous Message Jan de Visser 2015-03-04 02:37:04 Re: Idea: closing the loop for "pg_ctl reload"