| From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> | 
|---|---|
| To: | Michael Malis <michaelmalis2(at)gmail(dot)com> | 
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Merge Join with an Index Optimization | 
| Date: | 2016-09-11 23:20:20 | 
| Message-ID: | CAEepm=3LYEfwPnqe6B0-bKS6vJ8MEBB8ZRLFCSL=qdfq-qb7dw@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Sun, Sep 11, 2016 at 11:07 PM, Michael Malis <michaelmalis2(at)gmail(dot)com> wrote:
> On Sun, Sep 11, 2016 at 9:20 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Michael Malis <michaelmalis2(at)gmail(dot)com> writes:
>> >> As I understand it, a merge join will currently read all tuples from
>> >> both
>> >> subqueries (besides early termination). I believe it should be possible
>> >> to
>> >> take advantages of the indexes on one or both of the tables being read
>> >> from
>> >> to skip a large number of tuples that would currently be read.
>> >>
>> > IIUC, what you're proposing is to replace "read past some tuples in the
>> > index" with "re-descend the index btree".  Yes, that would win if it
>
>
> Roughly yes, although is it necessary to rescan the btree from the top? Is
> it not possible to "resume" the scan from the previous location?
>
>>
>> > You might want to troll the archives for past discussions of index skip
>> > scan, which is more or less the same idea (could possibly be exactly the
>> > same idea, depending on how it's implemented).
>
>
> After searching a bit, all I was able to find was this. It looks like
> someone had a rough patch of applying a similar idea to DISTINCT. I can't
> tell what happened to it, but in the patch they mention that it should be
> possible to do a "local traversal ie from the current page".
That was me.  Yeah, reserving the option for traversing other than
from the root was my reason for wanting to introduce a skip operation
to the IM interface as described in the later email in that thread[2],
rather than doing it via rescanning (though maybe it's not strictly
necessary, I don't know).  There are a whole lot of interesting
execution tricks that could be enabled by btree skipping (see Oracle,
DB2, MySQL, SQLite).  DISTINCT was the simplest thing I could think of
where literally every other RDBMS beats us because we don't have it.
But that was nowhere near a useful patch, just an experiment.  I hope
I can get back to looking at it for 10...
-- 
Thomas Munro
http://www.enterprisedb.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Geoghegan | 2016-09-11 23:39:54 | Re: Tuplesort merge pre-reading | 
| Previous Message | Vitaly Burovoy | 2016-09-11 23:11:25 | Re: [REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ... |