Skip site navigation (1) Skip section navigation (2)

POC, WIP: OR-clause support for indexes

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: POC, WIP: OR-clause support for indexes
Date: 2015-12-26 18:04:58
Message-ID: 567ED6CA.2040504@sigaev.ru (view raw, whole thread or download thread mbox)
Thread:
Lists: pgsql-hackers
I'd like to present OR-clause support for indexes. Although OR-clauses could be 
supported by bitmapOR index scan it isn't very effective and such scan lost any 
order existing in index. We (with Alexander Korotkov) presented results on 
Vienna's conference this year. In short, it provides performance improvement:

EXPLAIN ANALYZE
SELECT count(*) FROM tst WHERE id = 5 OR id = 500 OR id = 5000;
me=0.080..0.267 rows=173 loops=1)
          Recheck Cond: ((id = 5) OR (id = 500) OR (id = 5000))
          Heap Blocks: exact=172
          ->  Bitmap Index Scan on idx_gin  (cost=0.00..57.50 rows=15000 
width=0) (actual time=0.059..0.059 rows=147 loops=1)
                Index Cond: ((id = 5) OR (id = 500) OR (id = 5000))
  Planning time: 0.077 ms
  Execution time: 0.308 ms   <-------
                                                             QUERY PLAN 

-----------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=51180.53..51180.54 rows=1 width=0) (actual 
time=796.766..796.766 rows=1 loops=1)
    ->  Index Only Scan using idx_btree on tst  (cost=0.42..51180.40 rows=55 
width=0) (actual time=0.444..796.736 rows=173 loops=1)
          Filter: ((id = 5) OR (id = 500) OR (id = 5000))
          Rows Removed by Filter: 999829
          Heap Fetches: 1000002
  Planning time: 0.087 ms
  Execution time: 796.798 ms  <------
                                                 QUERY PLAN 

-------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=21925.63..21925.64 rows=1 width=0) (actual 
time=160.412..160.412 rows=1 loops=1)
    ->  Seq Scan on tst  (cost=0.00..21925.03 rows=237 width=0) (actual 
time=0.535..160.362 rows=175 loops=1)
          Filter: ((id = 5) OR (id = 500) OR (id = 5000))
          Rows Removed by Filter: 999827
  Planning time: 0.459 ms
  Execution time: 160.451 ms


It also could work together with KNN feature of GiST and in this case 
performance improvement could be up to several orders of magnitude, in 
artificial example it was 37000 times faster.

Not all  indexes can support oR-clause, patch adds support to GIN, GiST and BRIN 
indexes. pg_am table is extended for adding amcanorclause column which indicates 
possibility of executing of OR-clause by index.

  indexqual and indexqualorig doesn't contain implicitly-ANDed list of index 
qual expressions, now that lists could contain OR RestrictionInfo. Actually, the 
patch just tries to convert BitmapOr node to IndexScan or IndexOnlyScan. Thats 
significantly simplifies logic to find possible clause's list for index.
Index always gets a array of ScanKey but for indexes which support OR-clauses
array  of ScanKey is actually exection tree in reversed polish notation form. 
Transformation is done in ExecInitIndexScan().

The problems on the way which I see for now:
1 Calculating cost. Right now it's just a simple transformation of costs 
computed for BitmapOr path. I'd like to hope that's possible and so index's 
estimation function could be non-touched. So, they could believe that all 
clauses are implicitly-ANDed
2 I'd like to add such support to btree but it seems that it should be a 
separated patch. Btree search algorithm doesn't use any kind of stack of pages 
and algorithm to walk over btree doesn't clear for me for now.
3 I could miss some places which still assumes  implicitly-ANDed list of clauses 
although regression tests passes fine.

Hope, hackers will not have an strong objections to do that. But obviously patch
requires further work and I'd like to see comments, suggestions and 
recommendations. Thank you.


-- 
Teodor Sigaev                                   E-mail: teodor(at)sigaev(dot)ru
                                                    WWW: http://www.sigaev.ru/

Attachment: index_or-1.patch.gz
Description: application/x-gzip (19.3 KB)

Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2015-12-26 18:11:52
Subject: Re: MergeAttributes type (mod) conflict error detail
Previous:From: Tom LaneDate: 2015-12-26 16:27:20
Subject: Re: 9.5rc1 brin_summarize_new_values

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group