Re: [v9.5] Custom Plan API

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Stephen Frost" <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-09-12 00:40:01
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8FDAA9B@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Thu, Sep 11, 2014 at 11:24 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> wrote:
> >> Don't the changes to src/backend/optimizer/plan/createplan.c belong
> >> in patch #2?
> >>
> > The borderline between #1 and #2 is little bit bogus. So, I moved most
> > of portion into #1, however, invocation of InitCustomScan (that is a
> > callback in CustomPlanMethod) in create_custom_plan() is still in #2.
>
> Eh, create_custom_scan() certainly looks like it is in #1 from here, or
> at least part of it is. It calculates tlist and clauses and then does
> nothing with them. That clearly can't be the right division.
>
> I think it would make sense to have create_custom_scan() compute tlist and
> clauses first, and then pass those to CreateCustomPlan(). Then you don't
> need a separate InitCustomScan() - which is misnamed anyway, since it has
> nothing to do with ExecInitCustomScan().
>
The only reason why I put separate hooks here is, create_custom_scan() needs
to know exact size of the CustomScan node (including private fields), however,
it is helpful for extensions to kick its callback to initialize the node
next to the common initialization stuff.

If we have a static field to inform exact size of the data-type inherited
from the CustomScan in CustomPathMethod, it may be able to eliminate the
CreateCustomPlan(). One downside is, extension needs to register multiple
CustomPath table for each custom-plan node to be populated later.
So, my preference is the current design rather than static approach.

Regarding to the naming, how about GetCustomScan() instead of InitCustomScan()?
It follows the manner in create_foreignscan_plan().

> > OK, I revised. Now custom-scan assumes it has a particular valid
> > relation to be scanned, so no code path with scanrelid == 0 at this moment.
> >
> > Let us revisit this scenario when custom-scan replaces relation-joins.
> > In this case, custom-scan will not be associated with a particular
> > base- relation, thus it needs to admit a custom-scan node with scanrelid
> == 0.
>
> Yeah, I guess the question there is whether we'll want let CustomScan have
> scanrelid == 0 or require that CustomJoin be used there instead.
>
Right now, I cannot imagine a use case that requires individual CustomJoin
node because CustomScan with scanrelid==0 (that performs like custom-plan
rather than custom-scan in actually) is sufficient.

If a CustomScan gets chosen instead of built-in join logics, it shall looks
like a relation scan on the virtual one that is consists of two underlying
relation. Callbacks of the CustomScan has a responsibility to join underlying
relations; that is invisible from the core executor.

It seems to me CustomScan with scanrelid==0 is sufficient to implement
an alternative logic on relation joins, don't need an individual node
from the standpoint of executor.

> >> Why can't the Custom(GpuHashJoin) node build the hash table
> >> internally instead of using a separate node?
> >>
> > It's possible, however, it prevents to check sub-plans using EXPLAIN
> > if we manage inner-plans internally. So, I'd like to have a separate
> > node being connected to the inner-plan.
>
> Isn't that just a matter of letting the EXPLAIN code print more stuff?
> Why can't it?
>
My GpuHashJoin takes multiple relations to load them a hash-table.
On the other hand, Plan node can have two underlying relations at most
(inner/outer). Outer-side is occupied by the larger relation, so it
needs to make multiple relations visible using inner-branch.
If CustomScan can has a list of multiple underlying plan-nodes, like
Append node, it can represent the structure above in straightforward
way, but I'm uncertain which is the better design.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2014-09-12 00:48:53 Re: Suspicious check (src/backend/access/gin/gindatapage.c)
Previous Message Peter Geoghegan 2014-09-12 00:34:45 Re: B-Tree support function number 3 (strxfrm() optimization)