Hooking at standard_join_search (Was: Re: Foreign join pushdown vs EvalPlanQual)

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, 花田茂 <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Hooking at standard_join_search (Was: Re: Foreign join pushdown vs EvalPlanQual)
Date: 2015-09-02 12:55:42
Message-ID: 55E6F1CE.6050205@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015/08/01 23:25, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> The problem that was bothering us (or at least what was bothering me)
>> is that the PlannerInfo provides only a list of SpecialJoinInfo
>> structures, which don't directly give you the original join order. In
>> fact, min_righthand and min_lefthand are intended to constraint the
>> *possible* join orders, and are deliberately designed *not* to specify
>> a single join order. If you're sending a query to a remote PostgreSQL
>> node, you don't want to know what all the possible join orders are;
>> it's the remote side's job to plan the query. You do, however, need
>> an easy way to identify one join order that you can use to construct a
>> query. It didn't seem easy to do that without duplicating
>> make_join_rel(), which seemed like a bad idea.

> In principle it seems like you could traverse root->parse->jointree
> as a guide to reconstructing the original syntactic structure; though
> I'm not sure how hard it would be to ignore the parts of that tree
> that correspond to relations you're not shipping.

I'll investigate this.

>> But maybe there's a good way to do it. Tom wasn't crazy about this
>> hook both because of the frequency of calls and also because of the
>> long argument list. I think those concerns are legitimate; I just
>> couldn't see how to make the other way work.

> In my vision you probably really only want one call per build_join_rel
> event (that is, per construction of a new RelOptInfo), not per
> make_join_rel event.
>
> It's possible that an FDW that wants to handle joins but is not talking to
> a remote query planner would need to grovel through all the join ordering
> possibilities individually, and then maybe hooking at make_join_rel is
> sensible rather than having to reinvent that logic. But I'd want to see a
> concrete use-case first, and I certainly don't think that that's the main
> case to design the API around.

I'd vote for hooking at standard_join_search. Here is a use-case:

* When the callback routine is hooked at that funcition (right after
allpaths.c:1817), an FDW would collect lists of all the available
local-join-path orderings and parameterizations by looking at each path
in rel->pathlist (if the join rel only contains foreign tables that all
belong to the same foreign server).

* Then the FDW would use these as a heuristic to indcate which sort
orderings and parameterizations we should build foreign-join paths for.
(These would be also used as alternative paths for EvalPlanQual
handling, as discussed upthread.) It seems reasonable to me to consider
pushed-down versions of these paths as first candidates, but
foreign-join paths to build are not limited to such ones. The FDW is
allowed to consider any foreign-join paths as long as their alternative
paths are provided.

IMO one thing to consider for the postgres_fdw case would be the
use_remote_estimate option. In the case when the option is true, I
think we should perform remote EXPLAINs for pushed-down-join queries to
obtain cost estimates. But it would require too much time to do that
for each of the possible join rel. So, I think it would be better to
put off the callback routine's work as long as possible. I think that
that could probably be done by looking at rel->joininfo,
root->join_info_list and/or something like that. (When considering a
join rel A JOIN B both on the same foreign server, for example, we can
skip the routine's work if the join rel proved to be joined with C on
the same foreign server by looking at rel->joininfo, for example.)
Maybe I'm missing something, though.

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2015-09-02 12:56:03 Re: On-demand running query plans using auto_explain and signals
Previous Message Etsuro Fujita 2015-09-02 12:55:04 Re: Horizontal scalability/sharding