Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Date: 2015-03-10 03:18:41
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8010BEE85@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The attached patch integrates a suggestion from Ashutosh Bapat.
It allows to track set of relations involved in a join, but
replaced by foreign-/custom-scan. It enables to make correct
EXPLAIN output, if FDW/CSP driver makes human readable symbols
using deparse_expression() or others.

Differences from v7 is identical with what I posted on the
join push-down support thread.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> Sent: Wednesday, March 04, 2015 11:42 AM
> To: Shigeru Hanada
> Cc: Robert Haas; Tom Lane; pgsql-hackers(at)postgreSQL(dot)org
> Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
>
> > Obviously FDW can add multiple paths at a time, like GetForeignPaths,
> > so IMO it should be renamed to GetForeignJoinPaths, with plural form.
> >
> > In addition to that, new member of RelOptInfo, fdw_handler, should be
> > initialized explicitly in build_simple_rel.
> >
> > Please see attached a patch for these changes.
> >
> Thanks for your checks. Yep, the name of FDW handler should be ...Paths(),
> instead of Path().
>
> The attached one integrates Hanada-san's updates.
> --
> NEC OSS Promotion Center / PG-Strom Project
> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
>
>
> > -----Original Message-----
> > From: Shigeru Hanada [mailto:shigeru(dot)hanada(at)gmail(dot)com]
> > Sent: Tuesday, March 03, 2015 9:26 PM
> > To: Kaigai Kouhei(海外 浩平)
> > Cc: Robert Haas; Tom Lane; pgsql-hackers(at)postgreSQL(dot)org
> > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom
> > Plan API)
> >
> > Kaigai-san,
> >
> > The v6 patch was cleanly applied on master branch. I'll rebase my
> > patch onto it, but before that I have a comment about name of the new
> > FDW API handler GetForeignJoinPath.
> >
> > Obviously FDW can add multiple paths at a time, like GetForeignPaths,
> > so IMO it should be renamed to GetForeignJoinPaths, with plural form.
> >
> > In addition to that, new member of RelOptInfo, fdw_handler, should be
> > initialized explicitly in build_simple_rel.
> >
> > Please see attached a patch for these changes.
> >
> > I'll review the v6 path afterwards.
> >
> >
> > 2015-03-03 20:20 GMT+09:00 Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>:
> > > Sorry, I misoperated on patch creation.
> > > Attached one is the correct version.
> > > --
> > > NEC OSS Promotion Center / PG-Strom Project
> > > KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> > >
> > >
> > >> -----Original Message-----
> > >> From: pgsql-hackers-owner(at)postgresql(dot)org
> > >> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> > >> Sent: Tuesday, March 03, 2015 6:31 PM
> > >> To: Kaigai Kouhei(海外 浩平); Robert Haas
> > >> Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> > >> Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
> > >>
> > >> The attached version of custom/foreign-join interface patch
> > >> fixes up the problem reported on the join-pushdown support
> > >> thread.
> > >>
> > >> The previous version referenced *_ps_tlist on setrefs.c, to
> > >> check whether the Custom/ForeignScan node is associated with
> > >> a particular base relation, or not.
> > >> This logic considered above nodes performs base relation scan,
> > >> if *_ps_tlist is valid. However, it was incorrect in case when
> > >> underlying pseudo-scan relation has empty targetlist.
> > >> Instead of the previous logic, it shall be revised to check
> > >> scanrelid itself. If zero, it means Custom/ForeignScan node is
> > >> not associated with a particular base relation, thus, its slot
> > >> descriptor for scan shall be constructed based on *_ps_tlist.
> > >>
> > >>
> > >> Also, I noticed a potential problem if CSP/FDW driver want to
> > >> displays expression nodes using deparse_expression() but
> > >> varnode within this expression does not appear in the *_ps_tlist.
> > >> For example, a remote query below shall return rows with two
> > >> columns.
> > >>
> > >> SELECT atext, btext FROM tbl_a, tbl_b WHERE aid = bid;
> > >>
> > >> Thus, ForeignScan will perform like as a scan on relation with
> > >> two columns, and FDW driver will set two TargetEntry on the
> > >> fdw_ps_tlist. If FDW is designed to keep the join condition
> > >> (aid = bid) using expression node form, it is expected to be
> > >> saved on custom/fdw_expr variable, then setrefs.c rewrites the
> > >> varnode according to *_ps_tlist.
> > >> It means, we also have to add *_ps_tlist both of "aid" and "bid"
> > >> to avoid failure on variable lookup. However, these additional
> > >> entries changes the definition of the slot descriptor.
> > >> So, I adjusted ExecInitForeignScan and ExecInitCustomScan to
> > >> use ExecCleanTypeFromTL(), not ExecTypeFromTL(), when it construct
> > >> the slot descriptor based on the *_ps_tlist.
> > >> It expects CSP/FDW drivers to add target-entries with resjunk=true,
> > >> if it wants to have additional entries for variable lookups on
> > >> EXPLAIN command.
> > >>
> > >> Fortunately or unfortunately, postgres_fdw keeps its remote query
> > >> in cstring form, so it does not need to add junk entries on the
> > >> fdw_ps_tlist.
> > >>
> > >> Thanks,
> > >> --
> > >> NEC OSS Promotion Center / PG-Strom Project
> > >> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> > >>
> > >>
> > >> > -----Original Message-----
> > >> > From: pgsql-hackers-owner(at)postgresql(dot)org
> > >> > [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> > >> > Sent: Sunday, February 15, 2015 11:01 PM
> > >> > To: Kaigai Kouhei(海外 浩平); Robert Haas
> > >> > Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> > >> > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan
> API)
> > >> >
> > >> > The attached patch is a rebased version of join replacement with
> > >> > foreign-/custom-scan. Here is no feature updates at this moment
> > >> > but SGML documentation is added (according to Michael's comment).
> > >> >
> > >> > This infrastructure allows foreign-data-wrapper and custom-scan-
> > >> > provider to add alternative scan paths towards relations join.
> > >> > From viewpoint of the executor, it looks like a scan on a pseudo-
> > >> > relation that is materialized from multiple relations, even though
> > >> > FDW/CSP internally processes relations join with their own logic.
> > >> >
> > >> > Its basic idea is, (1) scanrelid==0 indicates this foreign/custom
> > >> > scan node runs on a pseudo relation and (2) fdw_ps_tlist and
> > >> > custom_ps_tlist introduce the definition of the pseudo relation,
> > >> > because it is not associated with a tangible relation unlike
> > >> > simple scan case, thus planner cannot know the expected record
> > >> > type to be returned without these additional information.
> > >> > These two enhancement enables extensions to process relations
> > >> > join internally, and to perform as like existing scan node from
> > >> > viewpoint of the core backend.
> > >> >
> > >> > Also, as an aside. I had a discussion with Hanada-san about this
> > >> > interface off-list. He had an idea to keep create_plan_recurse()
> > >> > static, using a special list field in CustomPath structure to
> > >> > chain underlying Path node. If core backend translate the Path
> > >> > node to Plan node if valid list given, extension does not need to
> > >> > call create_plan_recurse() by itself.
> > >> > I have no preference about this. Does anybody have opinion?
> > >> >
> > >> > Thanks,
> > >> > --
> > >> > NEC OSS Promotion Center / PG-Strom Project
> > >> > KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> > >> >
> > >> >
> > >> > > -----Original Message-----
> > >> > > From: pgsql-hackers-owner(at)postgresql(dot)org
> > >> > > [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> > >> > > Sent: Thursday, January 15, 2015 8:03 AM
> > >> > > To: Robert Haas
> > >> > > Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> > >> > > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan
> > >> > > API)
> > >> > >
> > >> > > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> > >> > > wrote:
> > >> > > > > When custom-scan node replaced a join-plan, it shall have at least
> > >> > > > > two child plan-nodes. The callback handler of PlanCustomPath needs
> > >> > > > > to be able to call create_plan_recurse() to transform the underlying
> > >> > > > > path-nodes to plan-nodes, because this custom-scan node may take
> > >> > > > > other built-in scan or sub-join nodes as its inner/outer input.
> > >> > > > > In case of FDW, it shall kick any underlying scan relations to
> > >> > > > > remote side, thus we may not expect ForeignScan has underlying plans...
> > >> > > >
> > >> > > > Do you have an example of this?
> > >> > > >
> > >> > > Yes, even though full code set is too large for patch submission...
> > >> > >
> > >> > > https://github.com/pg-strom/devel/blob/master/src/gpuhashjoin.c#L1880
> > >> > >
> > >> > > This create_gpuhashjoin_plan() is PlanCustomPath callback of
> GpuHashJoin.
> > >> > > It takes GpuHashJoinPath inherited from CustomPath that has multiple
> > >> > > underlying scan/join paths.
> > >> > > Once it is called back from the backend, it also calls
> create_plan_recurse()
> > >> > > to make inner/outer plan nodes according to the paths.
> > >> > >
> > >> > > In the result, we can see the following query execution plan that CustomScan
> > >> > > takes underlying scan plans.
> > >> > >
> > >> > > postgres=# EXPLAIN SELECT * FROM t0 NATURAL JOIN t1 NATURAL JOIN t2;
> > >> > > QUERY PLAN
> > >> > >
> ----------------------------------------------------------------------
> > >> > > ------------
> > >> > > Custom Scan (GpuHashJoin) (cost=2968.00..140120.31 rows=3970922
> > >> > > width=143)
> > >> > > Hash clause 1: (aid = aid)
> > >> > > Hash clause 2: (bid = bid)
> > >> > > Bulkload: On
> > >> > > -> Custom Scan (GpuScan) on t0 (cost=500.00..57643.00 rows=4000009
> > >> > > width=77)
> > >> > > -> Custom Scan (MultiHash) (cost=734.00..734.00 rows=40000
> > >> > > width=37)
> > >> > > hash keys: aid
> > >> > > nBatches: 1 Buckets: 46000 Memory Usage: 99.99%
> > >> > > -> Seq Scan on t1 (cost=0.00..734.00 rows=40000 width=37)
> > >> > > -> Custom Scan (MultiHash) (cost=734.00..734.00 rows=40000
> > >> > > width=37)
> > >> > > hash keys: bid
> > >> > > nBatches: 1 Buckets: 46000 Memory Usage: 49.99%
> > >> > > -> Seq Scan on t2 (cost=0.00..734.00 rows=40000
> > >> > > width=37)
> > >> > > (13 rows)
> > >> > >
> > >> > > Thanks,
> > >> > > --
> > >> > > NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei
> > >> > > <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> > >> > >
> > >> > >
> > >> > > > -----Original Message-----
> > >> > > > From: Robert Haas [mailto:robertmhaas(at)gmail(dot)com]
> > >> > > > Sent: Thursday, January 15, 2015 2:07 AM
> > >> > > > To: Kaigai Kouhei(海外 浩平)
> > >> > > > Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> > >> > > > Subject: ##freemail## Re: Custom/Foreign-Join-APIs (Re: [HACKERS]
> > >> > > > [v9.5] Custom Plan API)
> > >> > > >
> > >> > > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> > >> > > wrote:
> > >> > > > > When custom-scan node replaced a join-plan, it shall have at least
> > >> > > > > two child plan-nodes. The callback handler of PlanCustomPath needs
> > >> > > > > to be able to call create_plan_recurse() to transform the underlying
> > >> > > > > path-nodes to plan-nodes, because this custom-scan node may take
> > >> > > > > other built-in scan or sub-join nodes as its inner/outer input.
> > >> > > > > In case of FDW, it shall kick any underlying scan relations to
> > >> > > > > remote side, thus we may not expect ForeignScan has underlying plans...
> > >> > > >
> > >> > > > Do you have an example of this?
> > >> > > >
> > >> > > > --
> > >> > > > Robert Haas
> > >> > > > EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
> > >> > > > Company
> > >> > >
> > >> > > --
> > >> > > Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org) To
> make
> > >> > > changes to your subscription:
> > >> > > http://www.postgresql.org/mailpref/pgsql-hackers
> >
> >
> >
> > --
> > Shigeru HANADA

Attachment Content-Type Size
pgsql-v9.5-custom-join.v8.patch application/octet-stream 48.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2015-03-10 03:36:42 Re: BRIN page type identifier
Previous Message Stephen Frost 2015-03-10 03:08:46 Re: Proposal: knowing detail of config files via SQL