Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Date: 2015-02-15 14:01:13
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8010B1680@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The attached patch is a rebased version of join replacement with
foreign-/custom-scan. Here is no feature updates at this moment
but SGML documentation is added (according to Michael's comment).

This infrastructure allows foreign-data-wrapper and custom-scan-
provider to add alternative scan paths towards relations join.
From viewpoint of the executor, it looks like a scan on a pseudo-
relation that is materialized from multiple relations, even though
FDW/CSP internally processes relations join with their own logic.

Its basic idea is, (1) scanrelid==0 indicates this foreign/custom
scan node runs on a pseudo relation and (2) fdw_ps_tlist and
custom_ps_tlist introduce the definition of the pseudo relation,
because it is not associated with a tangible relation unlike
simple scan case, thus planner cannot know the expected record
type to be returned without these additional information.
These two enhancement enables extensions to process relations
join internally, and to perform as like existing scan node from
viewpoint of the core backend.

Also, as an aside. I had a discussion with Hanada-san about this
interface off-list. He had an idea to keep create_plan_recurse()
static, using a special list field in CustomPath structure to
chain underlying Path node. If core backend translate the Path
node to Plan node if valid list given, extension does not need to
call create_plan_recurse() by itself.
I have no preference about this. Does anybody have opinion?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> Sent: Thursday, January 15, 2015 8:03 AM
> To: Robert Haas
> Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan
> API)
>
> > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> wrote:
> > > When custom-scan node replaced a join-plan, it shall have at least
> > > two child plan-nodes. The callback handler of PlanCustomPath needs
> > > to be able to call create_plan_recurse() to transform the underlying
> > > path-nodes to plan-nodes, because this custom-scan node may take
> > > other built-in scan or sub-join nodes as its inner/outer input.
> > > In case of FDW, it shall kick any underlying scan relations to
> > > remote side, thus we may not expect ForeignScan has underlying plans...
> >
> > Do you have an example of this?
> >
> Yes, even though full code set is too large for patch submission...
>
> https://github.com/pg-strom/devel/blob/master/src/gpuhashjoin.c#L1880
>
> This create_gpuhashjoin_plan() is PlanCustomPath callback of GpuHashJoin.
> It takes GpuHashJoinPath inherited from CustomPath that has multiple
> underlying scan/join paths.
> Once it is called back from the backend, it also calls create_plan_recurse()
> to make inner/outer plan nodes according to the paths.
>
> In the result, we can see the following query execution plan that CustomScan
> takes underlying scan plans.
>
> postgres=# EXPLAIN SELECT * FROM t0 NATURAL JOIN t1 NATURAL JOIN t2;
> QUERY PLAN
> ----------------------------------------------------------------------
> ------------
> Custom Scan (GpuHashJoin) (cost=2968.00..140120.31 rows=3970922
> width=143)
> Hash clause 1: (aid = aid)
> Hash clause 2: (bid = bid)
> Bulkload: On
> -> Custom Scan (GpuScan) on t0 (cost=500.00..57643.00 rows=4000009
> width=77)
> -> Custom Scan (MultiHash) (cost=734.00..734.00 rows=40000
> width=37)
> hash keys: aid
> nBatches: 1 Buckets: 46000 Memory Usage: 99.99%
> -> Seq Scan on t1 (cost=0.00..734.00 rows=40000 width=37)
> -> Custom Scan (MultiHash) (cost=734.00..734.00 rows=40000
> width=37)
> hash keys: bid
> nBatches: 1 Buckets: 46000 Memory Usage: 49.99%
> -> Seq Scan on t2 (cost=0.00..734.00 rows=40000
> width=37)
> (13 rows)
>
> Thanks,
> --
> NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei
> <kaigai(at)ak(dot)jp(dot)nec(dot)com>
>
>
> > -----Original Message-----
> > From: Robert Haas [mailto:robertmhaas(at)gmail(dot)com]
> > Sent: Thursday, January 15, 2015 2:07 AM
> > To: Kaigai Kouhei(海外 浩平)
> > Cc: Tom Lane; pgsql-hackers(at)postgreSQL(dot)org; Shigeru Hanada
> > Subject: ##freemail## Re: Custom/Foreign-Join-APIs (Re: [HACKERS]
> > [v9.5] Custom Plan API)
> >
> > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> wrote:
> > > When custom-scan node replaced a join-plan, it shall have at least
> > > two child plan-nodes. The callback handler of PlanCustomPath needs
> > > to be able to call create_plan_recurse() to transform the underlying
> > > path-nodes to plan-nodes, because this custom-scan node may take
> > > other built-in scan or sub-join nodes as its inner/outer input.
> > > In case of FDW, it shall kick any underlying scan relations to
> > > remote side, thus we may not expect ForeignScan has underlying plans...
> >
> > Do you have an example of this?
> >
> > --
> > Robert Haas
> > EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
> > Company
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org) To make
> changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

Attachment Content-Type Size
pgsql-v9.5-custom-join.v4.patch application/octet-stream 38.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2015-02-15 14:25:59 Re: Logical Replication Helpers WIP for discussion
Previous Message Michael Paquier 2015-02-15 11:55:28 Re: forward vs backward slashes in msvc build code