Re: 9.3 Pre-proposal: Range Merge Join

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.3 Pre-proposal: Range Merge Join
Date: 2012-04-17 18:03:12
Message-ID: CA+TgmoYSKvnGDB7ZRHaus-AEHq_dLUKkGxOcgALRpMr=it1PHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 16, 2012 at 1:43 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Mon, 2012-04-16 at 02:52 -0400, Tom Lane wrote:
>> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>> >  1. Order the ranges on both sides by the lower bound, then upper bound.
>> > Empty ranges can be excluded entirely.
>> >  2. Left := first range on left, Right := first range on right
>> >  3. If Left or Right is empty, terminate.
>> >  4. If lower(Left) > upper(Right), discard Right, goto 2
>> >  5. If lower(Right) > upper(Left), discard Left, goto 2
>> >  6. return (Left, Right) as joined tuple
>> >  7. Right := next range on right
>> >  8. goto 3
>>
>> This is surely not correct in detail.  As written, it will be impossible
>> for any tuple on the right side to be joined to more than one left-side
>> tuple.  You will need something analogous to the mark-and-restore
>> rewinding logic in standard mergejoin to make this work.
>
> Every time you discard a tuple on the left, you go to step 2, which
> rewinds the right side back to the first non-discarded tuple.
>
> So, implemented using mark and restore:
>
>  * start off with the marks at the first tuple on each side
>  * "discard" means move the mark down a tuple
>  * setting it back to the first range means restoring to the mark
>  * going to the next range (step 7) just means getting another
>    tuple, without changing the mark
>
> Only one side really needs the mark and restore logic, but it was easier
> to write the pseudocode in a symmetrical way (except step 7).

I'm actually not sure these are equivalent formulations. Suppose one
side has [i,i] where i ranges from 1 to 10000000 and the other side
the exact same thing plus [1,10000000]. That one really big range
will come up second on the right side, and you will not be able to
discard it until you reach the end of the join. If you just keep
rewinding the right side, you're going to end up with O(n^2) behavior,
whereas if you can discard tuples "from the middle" on the right side,
then you will get O(n) behavior, which is a big difference. In other
words, in your original algorithm the tuples that you discard in step
4 or 5 need not be the first remaining tuple on whichever side of the
join we're talking about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-04-17 18:24:02 Re: 9.3 Pre-proposal: Range Merge Join
Previous Message Greg Smith 2012-04-17 17:59:09 Re: Bug tracker tool we need