Re: POC, WIP: OR-clause support for indexes

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2024-02-01 05:00:00
Message-ID: CACJufxFS-xcjaWq2Du2OyJUjRAyqCk12Q_zGOPxv61sgrdpw9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 31, 2024 at 7:10 PM Alena Rybakina
<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> Hi, thank you for your review and interest in this subject.
>
> On 31.01.2024 13:15, jian he wrote:
>
> On Wed, Jan 31, 2024 at 10:55 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> based on my understanding of
> https://www.postgresql.org/docs/current/xoper-optimization.html#XOPER-COMMUTATOR
> I think you need move commutator check right after the `if
> (get_op_rettype(opno) != BOOLOID)` branch
>
> I was wrong about this part. sorry for the noise.
>
>
> I have made some changes (attachment).
> * if the operator expression left or right side type category is
> {array | domain | composite}, then don't do the transformation.
> (i am not 10% sure with composite)
>
> To be honest, I'm not sure about this check, because we check the type of variable there:
>
> if (!IsA(orqual, OpExpr))
> {
> or_list = lappend(or_list, orqual);
> continue;
> }
> And below:
> if (IsA(leftop, Const))
> {
> opno = get_commutator(opno);
>
> if (!OidIsValid(opno))
> {
> /* Commuter doesn't exist, we can't reverse the order */
> or_list = lappend(or_list, orqual);
> continue;
> }
>
> nconst_expr = get_rightop(orqual);
> const_expr = get_leftop(orqual);
> }
> else if (IsA(rightop, Const))
> {
> const_expr = get_rightop(orqual);
> nconst_expr = get_leftop(orqual);
> }
> else
> {
> or_list = lappend(or_list, orqual);
> continue;
> }
>
> Isn't that enough?

alter table tenk1 add column arr int[];
set enable_or_transformation to on;
EXPLAIN (COSTS OFF)
SELECT count(*) FROM tenk1
WHERE arr = '{1,2,3}' or arr = '{1,2}';

the above query will not do the OR transformation. because array type
doesn't have array type.
`
scalar_type = entry->key.exprtype;
if (scalar_type != RECORDOID && OidIsValid(scalar_type))
array_type = get_array_type(scalar_type);
else
array_type = InvalidOid;
`

If either side of the operator expression is array or array related type,
we can be sure it cannot do the transformation
(get_array_type will return InvalidOid for anyarray type).
we can check it earlier, so hash related code will not be invoked for
array related types.

> Besides, some of examples (with ARRAY) works fine:
>
> postgres=# CREATE TABLE sal_emp (
> pay_by_quarter integer[],
> pay_by_quater1 integer[]
> );
> CREATE TABLE
> postgres=# INSERT INTO sal_emp
> VALUES (
> '{10000, 10000, 10000, 10000}',
> '{1,2,3,4}');
> INSERT 0 1
> postgres=# select * from sal_emp where pay_by_quarter[1] = 10000 or pay_by_quarter[1]=2;
> pay_by_quarter | pay_by_quater1
> ---------------------------+----------------
> {10000,10000,10000,10000} | {1,2,3,4}
> (1 row)
>
> postgres=# explain select * from sal_emp where pay_by_quarter[1] = 10000 or pay_by_quarter[1]=2;
> QUERY PLAN
> --------------------------------------------------------------
> Seq Scan on sal_emp (cost=0.00..21.00 rows=9 width=64)
> Filter: (pay_by_quarter[1] = ANY ('{10000,2}'::integer[]))
> (2 rows)
>
> * if the left side of the operator expression node contains volatile
> functions, then don't do the transformation.
>
> I'm also not sure about the volatility check function, because we perform such a conversion at the parsing stage, and at this stage we don't have a RelOptInfo variable and especially a RestictInfo such as PathTarget.
>
see the example in here:
https://www.postgresql.org/message-id/CACJufxGXhJ823cdAdp2Ho7qC-HZ3_-dtdj-myaAi_u9RQLn45g%40mail.gmail.com

set enable_or_transformation to on;
create or replace function retint(int) returns int as
$func$
begin raise notice 'hello';
return $1 + round(10 * random()); end
$func$ LANGUAGE plpgsql;

SELECT count(*) FROM tenk1 WHERE thousand = 42;
will return 10 rows.

SELECT count(*) FROM tenk1 WHERE thousand = 42 AND (retint(1) = 4 OR
retint(1) = 3);
this query I should return 20 notices 'hello', but now only 10.

EXPLAIN (COSTS OFF)
SELECT count(*) FROM tenk1
WHERE thousand = 42 AND (retint(1) = 4 OR retint(1) = 3);
QUERY PLAN
------------------------------------------------------------------------------
Aggregate
-> Seq Scan on tenk1
Filter: ((thousand = 42) AND (retint(1) = ANY ('{4,3}'::integer[])))
(3 rows)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ajin Cherian 2024-02-01 05:38:13 Re: Synchronizing slots from primary to standby
Previous Message vignesh C 2024-02-01 04:59:14 Re: Race condition in FetchTableStates() breaks synchronization of subscription tables