Re: POC, WIP: OR-clause support for indexes

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, jian he <jian(dot)universality(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2024-03-07 20:28:59
Message-ID: 90b67871-0263-484f-9fc0-606bdcdd84c5@postgrespro.ru
Lists: pgsql-hackers

Hi!

On 07.03.2024 17:51, Alexander Korotkov wrote:
> Hi!
>
> On Tue, Mar 5, 2024 at 9:59 AM Andrei Lepikhov
> <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> > On 5/3/2024 12:30, Andrei Lepikhov wrote:
> > > On 4/3/2024 09:26, jian he wrote:
> > ... and the new version of the patchset is attached.
>
> I made some revisions for the patchset.
> 1) Use hash_combine() to combine hash values.
> 2) Upper limit the number of array elements by MAX_SAOP_ARRAY_SIZE.
> 3) Better save the original order of clauses by putting hash entries
> and untransformable clauses to the same list.  A lot of differences in
> regression tests output have gone.
Thank you for your changes. I agree with them.
>
> One important issue I found.
>
> # create table t as (select i::int%100 i from generate_series(1,10000) i);
> # analyze t;
> # explain select * from t where i = 1 or i = 1;
>                      QUERY PLAN
> -----------------------------------------------------
>  Seq Scan on t  (cost=0.00..189.00 rows=200 width=4)
>    Filter: (i = ANY ('{1,1}'::integer[]))
> (2 rows)
>
> # set enable_or_transformation = false;
> SET
> # explain select * from t where i = 1 or i = 1;
>                      QUERY PLAN
> -----------------------------------------------------
>  Seq Scan on t  (cost=0.00..189.00 rows=100 width=4)
>    Filter: (i = 1)
> (2 rows)
>
> We don't make array values unique.  That might make query execution
> performance somewhat worse, and also makes selectivity estimation
> worse.  I suggest Andrei and/or Alena should implement making array
> values unique.
>
>
I have fixed this, along with some spelling mistakes. The changes are in
the attached unique_any_elements_change.no-cfbot file.
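
To make the intended behavior concrete, here is a minimal sketch in Python
of dropping duplicate constants while grouping "col = const" OR arms. It is
only a toy model, not the code from the patch: the function names
or_arms_to_any() and render() are hypothetical, and collapsing a
single-element group back to a plain equality is my assumption about the
desired result.

```python
from typing import Hashable, Iterable, List, Tuple


def or_arms_to_any(arms: Iterable[Tuple[str, Hashable]]) -> List[Tuple[str, list]]:
    """Group `col = const` OR arms by column, keeping each constant once.

    Columns and constants keep their first-seen order, so the rewritten
    clause stays close to the original clause order.
    """
    grouped: dict = {}
    for column, const in arms:
        consts = grouped.setdefault(column, [])
        if const not in consts:  # skip constants already collected
            consts.append(const)
    return list(grouped.items())


def render(groups: List[Tuple[str, list]]) -> str:
    """Render the groups as text (hypothetical helper, illustration only)."""
    parts = []
    for column, consts in groups:
        if len(consts) == 1:
            # Assumption: a single remaining constant becomes plain equality.
            parts.append(f"{column} = {consts[0]}")
        else:
            elements = ",".join(str(c) for c in consts)
            parts.append(f"{column} = ANY ('{{{elements}}}')")
    return " OR ".join(parts)
```

With this model, render(or_arms_to_any([("i", 1), ("i", 1)])) gives
"i = 1" rather than an array containing 1 twice.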

While updating the regression test output affected by these changes, I
noticed that converting an IN expression shows the same behavior: duplicate
array elements are kept. This can be seen in the following regression test
result:

 EXPLAIN (COSTS OFF)
 SELECT unique2 FROM onek2
 WHERE stringu1 IN ('A', 'A') AND (stringu1 = 'A' OR stringu1 = 'A');
                                QUERY PLAN
---------------------------------------------------------------------------
  Bitmap Heap Scan on onek2
    Recheck Cond: (stringu1 < 'B'::name)
    Filter: ((stringu1 = ANY ('{A,A}'::name[])) AND (stringu1 = 'A'::name))
    ->  Bitmap Index Scan on onek2_u2_prtl
 (4 rows)

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
unique_any_elements_change.no-cfbot text/plain 6.0 KB
v21-0002-Teach-generate_bitmap_or_paths-to-build-BitmapOr-pat.patch text/x-patch 34.5 KB
v21-0001-Transform-OR-clauses-to-ANY-expression.patch text/x-patch 56.0 KB
