Re: POC, WIP: OR-clause support for indexes

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2016-01-11 03:16:23
Message-ID: CAKJS1f_NuyEbHfUeSqBFw0G7jmLM5RC5PKRqcAxO0vnJWCgRcg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27 December 2015 at 07:04, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:

> I'd like to present OR-clause support for indexes. Although OR-clauses
> could be supported by bitmapOR index scan it isn't very effective and such
> scan lost any order existing in index. We (with Alexander Korotkov)
> presented results on Vienna's conference this year. In short, it provides
> performance improvement:
>
> EXPLAIN ANALYZE
> SELECT count(*) FROM tst WHERE id = 5 OR id = 500 OR id = 5000;
> me=0.080..0.267 rows=173 loops=1)
> Recheck Cond: ((id = 5) OR (id = 500) OR (id = 5000))
> Heap Blocks: exact=172
> -> Bitmap Index Scan on idx_gin (cost=0.00..57.50 rows=15000
> width=0) (actual time=0.059..0.059 rows=147 loops=1)
> Index Cond: ((id = 5) OR (id = 500) OR (id = 5000))
> Planning time: 0.077 ms
> Execution time: 0.308 ms <-------
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=51180.53..51180.54 rows=1 width=0) (actual
> time=796.766..796.766 rows=1 loops=1)
> -> Index Only Scan using idx_btree on tst (cost=0.42..51180.40
> rows=55 width=0) (actual time=0.444..796.736 rows=173 loops=1)
> Filter: ((id = 5) OR (id = 500) OR (id = 5000))
> Rows Removed by Filter: 999829
> Heap Fetches: 1000002
> Planning time: 0.087 ms
> Execution time: 796.798 ms <------
> QUERY PLAN
>
> -------------------------------------------------------------------------------------------------------------
> Aggregate (cost=21925.63..21925.64 rows=1 width=0) (actual
> time=160.412..160.412 rows=1 loops=1)
> -> Seq Scan on tst (cost=0.00..21925.03 rows=237 width=0) (actual
> time=0.535..160.362 rows=175 loops=1)
> Filter: ((id = 5) OR (id = 500) OR (id = 5000))
> Rows Removed by Filter: 999827
> Planning time: 0.459 ms
> Execution time: 160.451 ms
>
>
> It also could work together with KNN feature of GiST and in this case
> performance improvement could be up to several orders of magnitude, in
> artificial example it was 37000 times faster.
>
> Not all indexes can support oR-clause, patch adds support to GIN, GiST
> and BRIN indexes. pg_am table is extended for adding amcanorclause column
> which indicates possibility of executing of OR-clause by index.
>
> indexqual and indexqualorig doesn't contain implicitly-ANDed list of
> index qual expressions, now that lists could contain OR RestrictionInfo.
> Actually, the patch just tries to convert BitmapOr node to IndexScan or
> IndexOnlyScan. Thats significantly simplifies logic to find possible
> clause's list for index.
> Index always gets a array of ScanKey but for indexes which support
> OR-clauses
> array of ScanKey is actually exection tree in reversed polish notation
> form. Transformation is done in ExecInitIndexScan().
>
> The problems on the way which I see for now:
> 1 Calculating cost. Right now it's just a simple transformation of costs
> computed for BitmapOr path. I'd like to hope that's possible and so index's
> estimation function could be non-touched. So, they could believe that all
> clauses are implicitly-ANDed
> 2 I'd like to add such support to btree but it seems that it should be a
> separated patch. Btree search algorithm doesn't use any kind of stack of
> pages and algorithm to walk over btree doesn't clear for me for now.
> 3 I could miss some places which still assumes implicitly-ANDed list of
> clauses although regression tests passes fine.
>
> Hope, hackers will not have an strong objections to do that. But obviously
> patch
> requires further work and I'd like to see comments, suggestions and
> recommendations. Thank you.

Hi,

I'd like to see comments too! but more so in the code. :) I've had a look
over this, and it seems like a great area in which we could improve on, and
your reported performance improvements are certainly very interesting too.
However I'm finding the code rather hard to follow, which might be a
combination of my lack of familiarity with the index code, but more likely
it's the lack of comments to explain what's going on. Let's just take 1
function as an example:

Here there's not a single comment, so I'm just going to try to work out
what's going on based on the code.

+static void
+compileScanKeys(IndexScanDesc scan)
+{
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+ int *stack,
+ stackPos = -1,
+ i;
+
+ if (scan->numberOfKeys <= 1 || so->useExec == false)
+ return;
+
+ Assert(scan->numberOfKeys >=3);

Why can numberOfKeys never be 2? I looked at what calls this and I can't
really work it out. I'm really also not sure what useExec means as there's
no comment in that struct member, and what if numberOfKeys == 1 and useExec
== false, won't this Assert() fail? If that's not a possible situation then
why not?

+
+ if (so->leftArgs != NULL)
+ return;
+
+ so->leftArgs = MemoryContextAlloc(so->giststate->scanCxt,
+ sizeof(*so->leftArgs) * scan->numberOfKeys);
+ so->rightArgs = MemoryContextAlloc(so->giststate->scanCxt,
+ sizeof(*so->rightArgs) * scan->numberOfKeys);
+
+ stack = palloc(sizeof(*stack) * scan->numberOfKeys);
+
+ for(i=0; i<scan->numberOfKeys; i++)
+ {
+ ScanKey key = scan->keyData + i;

Is there a reason not to use keyData[i]; ?

+
+ if (stackPos >= 0 && (key->sk_flags & (SK_OR | SK_AND)))
+ {
+ Assert(stackPos >= 1 && stackPos < scan->numberOfKeys);

stackPos >= 1? This seems unnecessary and confusing as the if test surely
makes that impossible.
+
+ so->leftArgs[i] = stack[stackPos - 1];

Something is broken here as stackPos can be 0 (going by the if() not the
Assert()), therefore that's stack[-1].

+ so->rightArgs[i] = stack[stackPos];
+ stackPos--;
+ }
+ else
+ {
+ stackPos++;
+ }
+

stackPos is initialised to -1, so this appears to always skip the first
element of the keyData array. If that's really the intention, then wouldn't
it be better to just make the initial condition of the for() look i = 1 ?

+ stack[stackPos] = i;
+ }
+
+ Assert(stackPos == 0);
+ pfree(stack);
+}

I'd like to review more, but it feels like a job that's more difficult than
it needs to be due to lack of comments.

Would it be possible to update the patch to try and explain things a little
better?

Many thanks

David

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2016-01-11 03:31:57 Re: No Issue Tracker - Say it Ain't So!
Previous Message Robert Haas 2016-01-11 02:32:02 Re: Apparently deprecated code in planner.c