Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Date: 2015-03-03 11:20:43
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8010BACFD@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sorry, I misoperated on patch creation.
Attached one is the correct version.
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> Sent: Tuesday, March 03, 2015 6:31 PM
> To: Kaigai Kouhei(海外 浩平); Robert Haas
> Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
>
> The attached version of custom/foreign-join interface patch
> fixes up the problem reported on the join-pushdown support
> thread.
>
> The previous version referenced *_ps_tlist on setrefs.c, to
> check whether the Custom/ForeignScan node is associated with
> a particular base relation, or not.
> This logic considered above nodes performs base relation scan,
> if *_ps_tlist is valid. However, it was incorrect in case when
> underlying pseudo-scan relation has empty targetlist.
> Instead of the previous logic, it shall be revised to check
> scanrelid itself. If zero, it means Custom/ForeignScan node is
> not associated with a particular base relation, thus, its slot
> descriptor for scan shall be constructed based on *_ps_tlist.
>
>
> Also, I noticed a potential problem if CSP/FDW driver want to
> displays expression nodes using deparse_expression() but
> varnode within this expression does not appear in the *_ps_tlist.
> For example, a remote query below shall return rows with two
> columns.
>
> SELECT atext, btext FROM tbl_a, tbl_b WHERE aid = bid;
>
> Thus, ForeignScan will perform like as a scan on relation with
> two columns, and FDW driver will set two TargetEntry on the
> fdw_ps_tlist. If FDW is designed to keep the join condition
> (aid = bid) using expression node form, it is expected to be
> saved on custom/fdw_expr variable, then setrefs.c rewrites the
> varnode according to *_ps_tlist.
> It means, we also have to add *_ps_tlist both of "aid" and "bid"
> to avoid failure on variable lookup. However, these additional
> entries changes the definition of the slot descriptor.
> So, I adjusted ExecInitForeignScan and ExecInitCustomScan to
> use ExecCleanTypeFromTL(), not ExecTypeFromTL(), when it construct
> the slot descriptor based on the *_ps_tlist.
> It expects CSP/FDW drivers to add target-entries with resjunk=true,
> if it wants to have additional entries for variable lookups on
> EXPLAIN command.
>
> Fortunately or unfortunately, postgres_fdw keeps its remote query
> in cstring form, so it does not need to add junk entries on the
> fdw_ps_tlist.
>
> Thanks,
> --
> NEC OSS Promotion Center / PG-Strom Project
> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
>
>
> > -----Original Message-----
> > From: pgsql-hackers-owner(at)postgresql(dot)org
> > [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> > Sent: Sunday, February 15, 2015 11:01 PM
> > To: Kaigai Kouhei(海外 浩平); Robert Haas
> > Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
> >
> > The attached patch is a rebased version of join replacement with
> > foreign-/custom-scan. Here is no feature updates at this moment
> > but SGML documentation is added (according to Michael's comment).
> >
> > This infrastructure allows foreign-data-wrapper and custom-scan-
> > provider to add alternative scan paths towards relations join.
> > From viewpoint of the executor, it looks like a scan on a pseudo-
> > relation that is materialized from multiple relations, even though
> > FDW/CSP internally processes relations join with their own logic.
> >
> > Its basic idea is, (1) scanrelid==0 indicates this foreign/custom
> > scan node runs on a pseudo relation and (2) fdw_ps_tlist and
> > custom_ps_tlist introduce the definition of the pseudo relation,
> > because it is not associated with a tangible relation unlike
> > simple scan case, thus planner cannot know the expected record
> > type to be returned without these additional information.
> > These two enhancement enables extensions to process relations
> > join internally, and to perform as like existing scan node from
> > viewpoint of the core backend.
> >
> > Also, as an aside. I had a discussion with Hanada-san about this
> > interface off-list. He had an idea to keep create_plan_recurse()
> > static, using a special list field in CustomPath structure to
> > chain underlying Path node. If core backend translate the Path
> > node to Plan node if valid list given, extension does not need to
> > call create_plan_recurse() by itself.
> > I have no preference about this. Does anybody have opinion?
> >
> > Thanks,
> > --
> > NEC OSS Promotion Center / PG-Strom Project
> > KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> >
> >
> > > -----Original Message-----
> > > From: pgsql-hackers-owner(at)postgresql(dot)org
> > > [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> > > Sent: Thursday, January 15, 2015 8:03 AM
> > > To: Robert Haas
> > > Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> > > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan
> > > API)
> > >
> > > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> > > wrote:
> > > > > When custom-scan node replaced a join-plan, it shall have at least
> > > > > two child plan-nodes. The callback handler of PlanCustomPath needs
> > > > > to be able to call create_plan_recurse() to transform the underlying
> > > > > path-nodes to plan-nodes, because this custom-scan node may take
> > > > > other built-in scan or sub-join nodes as its inner/outer input.
> > > > > In case of FDW, it shall kick any underlying scan relations to
> > > > > remote side, thus we may not expect ForeignScan has underlying plans...
> > > >
> > > > Do you have an example of this?
> > > >
> > > Yes, even though full code set is too large for patch submission...
> > >
> > > https://github.com/pg-strom/devel/blob/master/src/gpuhashjoin.c#L1880
> > >
> > > This create_gpuhashjoin_plan() is PlanCustomPath callback of GpuHashJoin.
> > > It takes GpuHashJoinPath inherited from CustomPath that has multiple
> > > underlying scan/join paths.
> > > Once it is called back from the backend, it also calls create_plan_recurse()
> > > to make inner/outer plan nodes according to the paths.
> > >
> > > In the result, we can see the following query execution plan that CustomScan
> > > takes underlying scan plans.
> > >
> > > postgres=# EXPLAIN SELECT * FROM t0 NATURAL JOIN t1 NATURAL JOIN t2;
> > > QUERY PLAN
> > > ----------------------------------------------------------------------
> > > ------------
> > > Custom Scan (GpuHashJoin) (cost=2968.00..140120.31 rows=3970922
> > > width=143)
> > > Hash clause 1: (aid = aid)
> > > Hash clause 2: (bid = bid)
> > > Bulkload: On
> > > -> Custom Scan (GpuScan) on t0 (cost=500.00..57643.00 rows=4000009
> > > width=77)
> > > -> Custom Scan (MultiHash) (cost=734.00..734.00 rows=40000
> > > width=37)
> > > hash keys: aid
> > > nBatches: 1 Buckets: 46000 Memory Usage: 99.99%
> > > -> Seq Scan on t1 (cost=0.00..734.00 rows=40000 width=37)
> > > -> Custom Scan (MultiHash) (cost=734.00..734.00 rows=40000
> > > width=37)
> > > hash keys: bid
> > > nBatches: 1 Buckets: 46000 Memory Usage: 49.99%
> > > -> Seq Scan on t2 (cost=0.00..734.00 rows=40000
> > > width=37)
> > > (13 rows)
> > >
> > > Thanks,
> > > --
> > > NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei
> > > <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> > >
> > >
> > > > -----Original Message-----
> > > > From: Robert Haas [mailto:robertmhaas(at)gmail(dot)com]
> > > > Sent: Thursday, January 15, 2015 2:07 AM
> > > > To: Kaigai Kouhei(海外 浩平)
> > > > Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> > > > Subject: ##freemail## Re: Custom/Foreign-Join-APIs (Re: [HACKERS]
> > > > [v9.5] Custom Plan API)
> > > >
> > > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> > > wrote:
> > > > > When custom-scan node replaced a join-plan, it shall have at least
> > > > > two child plan-nodes. The callback handler of PlanCustomPath needs
> > > > > to be able to call create_plan_recurse() to transform the underlying
> > > > > path-nodes to plan-nodes, because this custom-scan node may take
> > > > > other built-in scan or sub-join nodes as its inner/outer input.
> > > > > In case of FDW, it shall kick any underlying scan relations to
> > > > > remote side, thus we may not expect ForeignScan has underlying plans...
> > > >
> > > > Do you have an example of this?
> > > >
> > > > --
> > > > Robert Haas
> > > > EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
> > > > Company
> > >
> > > --
> > > Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org) To make
> > > changes to your subscription:
> > > http://www.postgresql.org/mailpref/pgsql-hackers

Attachment Content-Type Size
pgsql-v9.5-custom-join.v6.patch application/octet-stream 44.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Asif Naeem 2015-03-03 11:36:06 Re: Install shared libs in lib/ and bin/ with MSVC (Was: install libpq.dll in bin directory on Windows / Cygwin)
Previous Message Andres Freund 2015-03-03 11:13:42 Re: INSERT ... ON CONFLICT UPDATE and logical decoding