Re: Proposal: scan key push down to heap [WIP]

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: scan key push down to heap [WIP]
Date: 2016-11-19 13:18:12
Message-ID: CAFiTN-t5=A-9AHQsh6Wvg9_f8LwNFNVn0kP4K=v7XLR9X=8rFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 14, 2016 at 9:44 PM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>> Also, what if we abandoned the idea of pushing qual evaluation all the
>> way down into the heap and just tried to do HeapKeyTest in SeqNext
>> itself? Would that be almost as fast, or would it give up most of the
>> benefits?
> This we can definitely test. I will test and post the data.
>
> Thanks for the suggestion.

Here are some results with this approach...

create table test(a int, b varchar, c varchar, d int);
insert into test values(generate_series(1,10000000), repeat('x', 30),
repeat('y', 30), generate_series(1,10000000));
analyze test;

query: explain analyze select * from test where a < $1;

selectivity head patch1 patch2 patch3
10.00% 840 460 582 689
20.00% 885 563 665 752
50.00% 1076 786 871 910
80.00% 1247 988 1055 1099
100.00% 1386 1123 1193 1237

patch1: Original patch (heap_scankey_pushdown_v1.patch), only
supported for fixed length datatype and use heap_getattr.

patch2: Switches memory context in HeapKeyTest + Store tuple in slot
and use slot_getattr instead of heap_getattr.

patch3: call HeapKeyTest in SeqNext after storing slot, and also
switch memory context before calling HeapKeyTest.

Summary: (performance data)
----------------------------------------
1. At 10% selectivity patch1 shows > 40% gain which reduced to 30% in
patch2 and finally it drops to 17% in patch3.
2. I think patch1 wins over patch2, because patch1 avoid call to ExecStoreTuple.
3. Patch2 wins over patch3 because patch2 can reject tuple in
heapgettup_pagemode whereas patch3 can do it in SeqNext, so patch2 can
avoid some function calls instructions.

Summary (various patches and problems)
-----------------------------------------------------
Patch1:
problem1: This approach has performance problem in some cases (quals
on many columns of the table + high selectivity (>50%)).
problem2: HeapKeyTest uses ExecutorContext so we need to block any
datatype which needs memory allocation during eval functions.

Patch2: Patch2 solves both the problems of patch1, but exposes
executor items to heap and it's not a good design.

Patch3: This solves all the problems exists in patch1+patch2, but
performance is significantly less.

I haven't yet tested patch3 with TPCH, I will do that once machine is available.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Karl O. Pinc 2016-11-19 13:51:51 Re: Patch to implement pg_current_logfile() function
Previous Message Andreas Seltenreich 2016-11-19 12:46:35 [sqlsmith] Crash on GUC serialization