Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Date: 2015-03-14 00:02:03
Message-ID: 28596.1426291323@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Mar 13, 2015 at 2:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I don't object to the concept, but I think that is a pretty bad place
>> to put the hook call: add_paths_to_joinrel is typically called multiple
>> (perhaps *many*) times per joinrel and thus this placement would force
>> any user of the hook to do a lot of repetitive work.

> Interesting point. I guess the question is whether a some or all
> callers are going to actually *want* a separate call for each
> invocation of add_paths_to_joinrel(), or whether they'll be happy to
> operate on the otherwise-complete path list.

Hmm. You're right, it's certainly possible that some users would like to
operate on each possible pair of input relations, rather than considering
the joinrel "as a whole". Maybe we need two hooks, one like your patch
and one like I suggested.

> ... But for joinrels, it's not so
> easy to rule out, say, a hash-join here. Neither hook placement is
> much good for that; the path you want to get rid of may have already
> dominated paths you want to keep.

I don't particularly buy that line of argument. If a path has been
deleted because it was dominated by another, and you are unhappy about
that decision, then a hook of this sort is not the appropriate solution;
you need to be going and fixing the cost estimates that you think are
wrong. (This gets back to the point I keep making that I don't actually
believe you can do anything very useful with these hooks; anything
interesting is probably going to involve more fundamental changes to the
planner.)

> I think the foreign data wrapper join pushdown case, which also aims
> to substitute a scan for a join, is interesting to think about, even
> though it's likely to be handled by a new FDW method instead of via
> the hook. Where should the FDW method get called from? Currently,
> the FDW method in KaiGai's patch is GetForeignJoinPaths, and that gets
> called from add_paths_to_joinrel(). The patch at
> http://www.postgresql.org/message-id/CAEZqfEfy7p=uRpwN-Q-NNgzb8kwHbfqF82YSb9ztFZG7zN64Xw@mail.gmail.com
> uses that to implement join pushdown in postgres_fdw; if you have A
> JOIN B JOIN C all on server X, we'll notice that the join with A and B
> can be turned into a foreign scan on A JOIN B, and similarly for A-C
> and B-C. Then, if it turns out that the cheapest path for A-B is the
> foreign join, and the cheapest path for C is a foreign scan, we'll
> arrive at the idea of a foreign scan on A-B-C, and we'll realize the
> same thing in each of the other combinations as well. So, eventually
> the foreign join gets pushed down.

But this is in fact exactly the sort of case where you should not
rediscover all that over again for each pair of input rels. "Do all
these baserels belong to the same foreign server" is a question that need
only be considered once per joinrel. Not that that matters for this
hook, because as you say we're not doing foreign-server support through
this hook, but I think it's a fine example of why you'd want a single
call per joinrel.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2015-03-14 00:54:46 Re: Re: Abbreviated keys for Datum tuplesort
Previous Message Tomas Vondra 2015-03-13 23:39:30 xloginsert.c hole_length warning on gcc 4.8.3