Re: Proposal: scan key push down to heap [WIP]

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: scan key push down to heap [WIP]
Date: 2016-11-02 00:31:29
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8012514CB@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Dilip Kumar
> Sent: Saturday, October 29, 2016 3:48 PM
> To: Andres Freund
> Cc: Tom Lane; Alvaro Herrera; pgsql-hackers
> Subject: Re: [HACKERS] Proposal: scan key push down to heap [WIP]
>
> On Wed, Oct 26, 2016 at 12:01 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > The gains are quite noticeable in some cases. So if we can make it work
> > without noticeable downsides...
> >
> > What I'm worried about though is that this, afaics, will quite
> > noticeably *increase* total cost in cases with a noticeable number of
> > columns and a not that selective qual. The reason for that being that
> > HeapKeyTest() uses heap_getattr(), whereas upper layers use
> > slot_getattr(). The latter "caches" repeated deforms, the former
> > doesn't... That'll lead to deforming being essentially done twice, and
> > it's quite often already a major cost of query processing.
>
> What about putting slot reference inside HeapScanDesc ?. I know it
> will make ,heap layer use executor structure but just a thought.
>
> I have quickly hacked this way where we use slot reference in
> HeapScanDesc and directly use
> slot_getattr inside HeapKeyTest (only if we have valid slot otherwise
> use _heap_getattr) and measure the worst case performance (what you
> have mentioned above.)
>
> My Test: (21 column table with varchar in beginning + qual is on last
> few column + varying selectivity )
>
> postgres=# \d test
> Table "public.test"
> Column | Type | Modifiers
> --------+-------------------+-----------
> f1 | integer |
> f2 | character varying |
> f3 | integer |
> f4 | integer |
> f5 | integer |
> f6 | integer |
> f7 | integer |
> f8 | integer |
> f9 | integer |
> f10 | integer |
> f11 | integer |
> f12 | integer |
> f13 | integer |
> f14 | integer |
> f15 | integer |
> f16 | integer |
> f17 | integer |
> f18 | integer |
> f19 | integer |
> f20 | integer |
> f21 | integer |
>
> tuple count : 10000000 (10 Million)
> explain analyze select * from test where f21< $1 and f20 < $1 and f19
> < $1 and f15 < $1 and f10 < $1; ($1 vary from 1Million to 1Million).
>
> Target code base:
> -----------------------
> 1. Head
> 2. Heap_scankey_pushdown_v1
> 3. My hack for keeping slot reference in HeapScanDesc
> (v1+use_slot_in_HeapKeyTest)
>
> Result:
> Selectivity Head scan_key_pushdown_v1 v1+use_slot_in_HeapKeyTest
> 0.1 3880 2980 2747
> 0.2 4041 3187 2914
> 0.5 5051 4921 3626
> 0.8 5378 7296 3879
> 1.0 6161 8525 4575
>
> Performance graph is attached in the mail..
>
> Observation:
> ----------------
> 1. Heap_scankey_pushdown_v1, start degrading after very high
> selectivity (this behaviour is only visible if table have 20 or more
> columns, I tested with 10 columns but with that I did not see any
> regression in v1).
>
> 2. (v1+use_slot_in_HeapKeyTest) is always winner, even at very high selectivity.
>
Prior to this interface change, it may be a starting point to restrict scan key
pushdown only when OpExpr references the column with static attcacheoff.
This type of column does not need walks on tuples from the head, thus, tuple
deforming cost will not be a downside.

By the way, I'm a bit skeptical whether this enhancement is really beneficial
than works for this enhancement, because we can now easily increase the number
of processor cores to run seq-scan with qualifier, especially, when it has high
selectivity.
How about your thought?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-11-02 01:09:18 Re: Hash Indexes
Previous Message Vik Fearing 2016-11-02 00:01:16 Re: Patch to implement pg_current_logfile() function