Re: Merge Join with an Index Optimization

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Michael Malis <michaelmalis2(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Merge Join with an Index Optimization
Date: 2016-09-11 23:20:20
Message-ID: CAEepm=3LYEfwPnqe6B0-bKS6vJ8MEBB8ZRLFCSL=qdfq-qb7dw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Sep 11, 2016 at 11:07 PM, Michael Malis <michaelmalis2(at)gmail(dot)com> wrote:
> On Sun, Sep 11, 2016 at 9:20 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Michael Malis <michaelmalis2(at)gmail(dot)com> writes:
>> >> As I understand it, a merge join will currently read all tuples from
>> >> both
>> >> subqueries (besides early termination). I believe it should be possible
>> >> to
>> >> take advantages of the indexes on one or both of the tables being read
>> >> from
>> >> to skip a large number of tuples that would currently be read.
>> >>
>> > IIUC, what you're proposing is to replace "read past some tuples in the
>> > index" with "re-descend the index btree". Yes, that would win if it
>
>
> Roughly yes, although is it necessary to rescan the btree from the top? Is
> it not possible to "resume" the scan from the previous location?
>
>>
>> > You might want to troll the archives for past discussions of index skip
>> > scan, which is more or less the same idea (could possibly be exactly the
>> > same idea, depending on how it's implemented).
>
>
> After searching a bit, all I was able to find was this. It looks like
> someone had a rough patch of applying a similar idea to DISTINCT. I can't
> tell what happened to it, but in the patch they mention that it should be
> possible to do a "local traversal ie from the current page".

That was me. Yeah, reserving the option for traversing other than
from the root was my reason for wanting to introduce a skip operation
to the IM interface as described in the later email in that thread[2],
rather than doing it via rescanning (though maybe it's not strictly
necessary, I don't know). There are a whole lot of interesting
execution tricks that could be enabled by btree skipping (see Oracle,
DB2, MySQL, SQLite). DISTINCT was the simplest thing I could think of
where literally every other RDBMS beats us because we don't have it.
But that was nowhere near a useful patch, just an experiment. I hope
I can get back to looking at it for 10...

[2] https://www.postgresql.org/message-id/CADLWmXWALK8NPZqdnRQiPnrzAnic7NxYKynrkzO_vxYr8enWww%40mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-09-11 23:39:54 Re: Tuplesort merge pre-reading
Previous Message Vitaly Burovoy 2016-09-11 23:11:25 Re: [REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...