Re: POC, WIP: OR-clause support for indexes

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2024-02-13 00:00:00
Message-ID: CACJufxFSfSGZpmVt_98SRojdugW2s9H-qYm2sYr0A3o-xPea0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 8, 2024 at 1:34 PM Andrei Lepikhov
<a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> A couple of questions:
> 1. As I see, transformAExprIn uses the same logic as we invented but
> allows composite and domain types. Could you add a comment explaining
> why we forbid row types in general, in contrast to the transformAExprIn
> routine?
> 2. Could you provide the tests to check issues covered by the recent (in
> v.15) changes?
>
> Patch 0001-* in the attachment incorporates changes induced by Jian's
> notes from [1].
> Patch 0002-* contains a transformation of the SAOP clause, which allows
> the optimizer to utilize partial indexes if they cover all values in
> this array. Also, it is an answer to Alexander's note [2] on performance
> degradation. This first version may be a bit raw, but I need your
> opinion: Does it resolve the issue?
>

+ newa = makeNode(ArrayExpr);
+ /* array_collid will be set by parse_collate.c */
+ newa->element_typeid = scalar_type;
+ newa->array_typeid = array_type;
+ newa->multidims = false;
+ newa->elements = aexprs;
+ newa->location = -1;

I am confused by the comments `array_collid will be set by
parse_collate.c`, can you further explain it?
if OR expression right arm is not plain Const, but with collation
specification, eg.
`where a = 'a' collate "C" or a = 'b' collate "C";`

then the rightop is not Const, it will be CollateExpr, it will not be
used in transformation.
---------------------------------------------------------------------------------------------------------------------
Maybe the previous thread mentioned it, but this thread is very long.
after apply
v16-0001-Transform-OR-clause-to-ANY-expressions.patch
and 0002-Teach-generate_bitmap_or_paths-to-build-BitmapOr-pat-20240212.patch
I found a performance degradation case:

drop table if exists test;
create table test as (select (random()*100)::int x, (random()*1000) y
from generate_series(1,1000000) i);
vacuum analyze test;

set enable_or_transformation to off;
explain(timing off, analyze, costs off)
select * from test where (x = 1 or x = 2 or x = 3 or x = 4 or x = 5 or
x = 6 or x = 7 or x = 8 or x = 9 ) \watch i=0.1 c=10
50.887 ms

set enable_or_transformation to on;
explain(timing off, analyze, costs off)
select * from test where (x = 1 or x = 2 or x = 3 or x = 4 or x = 5 or
x = 6 or x = 7 or x = 8 or x = 9 ) \watch i=0.1 c=10
92.001 ms

---------------------------------------------------------------------------------------------------------------------
but for aggregate count(*), it indeed increased the performance:

set enable_or_transformation to off;
explain(timing off, analyze, costs off)
select count(*) from test where (x = 1 or x = 2 or x = 3 or x = 4 or x
= 5 or x = 6 or x = 7 or x = 8 or x = 9 ) \watch i=0.1 c=10
46.818 ms

set enable_or_transformation to on;
explain(timing off, analyze, costs off)
select count(*) from test where (x = 1 or x = 2 or x = 3 or x = 4 or x
= 5 or x = 6 or x = 7 or x = 8 or x = 9 ) \watch i=0.1 c=10
35.376 ms

The time is the last result of the 10 iterations.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-02-13 00:11:50 Re: Improve WALRead() to suck data directly from WAL buffers when possible
Previous Message Jeff Davis 2024-02-12 23:56:19 Re: Improve WALRead() to suck data directly from WAL buffers when possible