Re: Improving our clauseless-join heuristics

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving our clauseless-join heuristics
Date: 2012-04-16 20:35:27
Message-ID: 4920.1334608527@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila(at)huawei(dot)com> writes:
> For this kind of query, currently (referring 9.0.3 code) also it considers
> join of b,c and b,d.
> As there is no join clause between b,c,d so it will go in path of
> make_rels_by_clauseless_joins() where it considers join of b,c and b,d.

> In this kind of query, if the suggestion by me in below mail is followed,
> then it will consider joining a,b a,c a,d at level-2 in function
> make_rels_by_clause_joins() which it currently doesn't do which may generate
> useless join paths.
> However in real-world scenario's this kind of queries where 2 cols of
> different tables are
> used in one side expression (b.y + c.z) of where clause may be less.

> On the other hand, when we come to consider d, it will have no join
> clauses so we will consider joining it to each other rel in turn.

> When it come to consider d, as at level -2 it only consider later rels. So
> it should not consider joining with each other rel.

I might still be misunderstanding, but I think what you are suggesting
is that in the loop in make_rels_by_clause_joins, if we find that the
old_rel doesn't have a join clause/restriction with the current
other_rel, we check to see whether other_rel has any join clauses at
all, and force the join to occur anyway if it doesn't.

I can't really get excited about doing it that way instead of the
current way. In the first place, it seems to miss the need to
clauseless-join two relations when neither of them have any join
clauses, for instance plain old "SELECT * FROM a, b". So you still need
something like the make_rels_by_clauseless_joins code path, and it's
not entirely clear how to avoid duplicated work there. In the second
place, instead of N tests to see whether a rel lacks any join clauses,
we'd now have something approaching O(N^2) such tests, in the typical
case where most base rels are directly joined to only a few other rels.
So it seems to make things slower for little obvious benefit.

In general, queries with join-clause-less rels are pretty uncommon,
so I don't want to introduce extra work into make_rels_by_clause_joins
to handle the case.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2012-04-16 20:50:36 Re: 9.3 Pre-proposal: Range Merge Join
Previous Message Greg Smith 2012-04-16 20:23:09 Re: Last gasp