Re: join removal

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: join removal
Date: 2010-03-26 22:10:38
Message-ID: 22266.1269641438@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sun, Jul 19, 2009 at 10:56 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Yeah. Ideally this sort of thing would happen in prepjointree.c, but
>> we don't have nearly enough information at that stage.

> You've mentioned this point a couple of times - what is ideal about
> prepjointree?

Well, it's the place where we like to rearrange the join tree, and
dropping a join altogether certainly counts as that. We can't do it
there, at the moment anyway, for lack of supporting data --- but if
it were possible to do it at that time I think it'd be the cleanest
approach.

> Reading through the "optimizer functions" section of
> src/backend/optimizer/README, it seems like the earliest point at
> which we could do this would be just before the call to
> make_one_rel(). I think that would eliminate some redundant
> computation.

Maybe. It would also add a new pass over the join tree that, in
99% of cases, would make no useful contribution whatever. It's
possible that this would still be cheaper than a lot of failed calls
to join_is_removable, but I'm unconvinced --- we were able to make
the failure path through that pretty durn cheap for most simple cases.
The approach you're sketching still involves a combinatorial search
so I doubt it's going to be cheap.

I should maybe pause here a moment to say that my approach to
considering the cost of new planner optimizations is to focus on how
much time the added code will eat on queries where it fails to make any
useful contribution. If the optimization wins, then presumably you will
make back at execution time whatever it might have cost you to plan
(if this is debatable, you probably shouldn't be bothering with the idea
at all). So the pain will come from adding planning time on queries
where there isn't any runtime payoff; especially for something like join
removal, which only applies to a small minority of queries anyway.
Therefore I'm suspicious of adding new passes over the query structure
if they are only going to be used for low-probability wins.

> The problem with moving it back any further seems to be that we have
> to know which clauses are mergejoinable before we can do the necessary
> computations; I think flattening of the query tree has to already be
> done too.

Yeah. I had been thinking that we could build the RelOptInfo and
IndexOptInfo structs earlier, but really all of the
clause-classification work done by deconstruct_jointree is important
as well for this function's purposes, so it's not that easy to push
back to prepjointree :-(. I suspect that any such attempt would end
up requiring a massive rethinking of the order of operations in the
planner. Maybe we should do that sometime but I'm not eager for it.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joseph Adams 2010-03-27 02:19:14 Re: Proposal: access control jails (and introduction as aspiring GSoC student)
Previous Message Robert Haas 2010-03-26 21:36:03 Re: join removal