Re: 9.3 Pre-proposal: Range Merge Join

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 9.3 Pre-proposal: Range Merge Join
Date: 2012-04-19 06:25:45
Message-ID: 1334816745.5487.22.camel@jdavis
Lists: pgsql-hackers

On Tue, 2012-04-17 at 14:24 -0400, Robert Haas wrote:
> I thought Jeff was parenthetically complaining about cases like A LEFT
> JOIN (B INNER JOIN C ON b.y = c.y) ON a.x && b.x. That presumably
> would require the parameterized-path stuff to have any chance of doing
> partial index scans over B. However, I understand that's not the main
> issue here.

To take the mystery out of it, I was talking about any case where an
index scan is impossible or impractical. For instance, let's say the
ranges are computed values. Just to make it really impossible, let's say
the ranges are computed from columns in two different tables joined in a
subquery.
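
A rough illustration of that case, as a sketch only: the table names,
columns, and data below are made up, and Python stands in for the executor.
The ranges exist only as values computed from a join of two tables, so there
is nothing to index, and an overlap join without a specialized method falls
back to comparing every pair.

```python
# Hypothetical data: neither "table" stores a range. The range is computed
# from columns of two different tables joined on id, as in a subquery.
starts = {1: 10, 2: 20, 3: 35}    # id -> start   (hypothetical table s)
lengths = {1: 15, 2: 5, 3: 10}    # id -> length  (hypothetical table l)

def computed_ranges(starts, lengths):
    """Join the two tables on id and compute a half-open range [lo, hi)."""
    ids = sorted(starts.keys() & lengths.keys())
    return [(starts[i], starts[i] + lengths[i]) for i in ids]

def overlaps(a, b):
    """Stand-in for the && operator on non-empty half-open ranges."""
    return a[0] < b[1] and b[0] < a[1]

def naive_overlap_join(outer, inner):
    """No index available: every outer row is checked against every inner row."""
    return [(o, i) for o in outer for i in inner if overlaps(o, i)]

inner = computed_ranges(starts, lengths)
outer = [(0, 12), (22, 40)]
result = naive_overlap_join(outer, inner)
```

The cost here is proportional to len(outer) * len(inner) regardless of how
few pairs actually overlap.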

But yes, the ability of the planner to find the plan is also an issue
(hopefully less of one with the recent improvements).

> One thing that I think needs some analysis is when the range join idea
> is better or worse than a nested loop with inner index-scan, because
> potentially those are the options the planner has to choose between,
> and the costing model had better know enough to make the right thing
> happen. It strikes me that the nested loop with inner index-scan is
> likely to be a win when there are large chunks of the indexed relation
> that the nestloop never needs to visit at all - imagine small JOIN big
> ON small.a && big.a, for example. I suppose the really interesting
> question is how much we can save when the entirety of both relations
> has to be visited anyway - it seems promising, but I guess we won't
> know for sure without testing it.

Right, I will need to come up with a prototype that can at least test
the executor piece. I suspect that the plan choice won't be all that
different from an ordinary index nestloop versus mergejoin case, but
with much worse cardinality estimates to work with.

Regards,
Jeff Davis
