Re: POC, WIP: OR-clause support for indexes

From: "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Cc: "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, teodor(at)sigaev(dot)ru, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2023-09-29 17:35:13
Message-ID: 89fb769c-7d45-f0c5-a9b3-b706c9661c57@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I'm sorry I didn't write for a long time, but I really had a very
> difficult month, now I'm fully back to work.
>
> *I was able to implement the patches to the end and moved the
> transformation of "OR" expressions to ANY.* I haven't seen a big
> difference between them yet, one has a transformation before
> calculating selectivity (v7.1-Replace-OR-clause-to-ANY.patch), the
> other after (v7.2-Replace-OR-clause-to-ANY.patch). Regression tests
> are passing, I don't see any problems with selectivity, nothing has
> fallen into the coredump, but I found some incorrect transformations.
> What is the reason for these inaccuracies, I have not found, but, to
> be honest, they look unusual). Gave the error below.
>
> In the patch, I don't like that I had to drag three libraries from
> parsing until I found a way around it.The advantage of this approach
> compared to the other ([1]) is that at this stage all possible or
> transformations are performed, compared to the patch, where the
> transformation was done at the parsing stage. That is, here, for
> example, there are such optimizations in the transformation:
>
>
> I took the common element out of the bracket and the rest is converted
> to ANY, while, as noted by Peter Geoghegan, we did not have several
> bitmapscans, but only one scan through the array.
>
> postgres=# explain analyze SELECT p1.oid, p1.proname
> FROM pg_proc as p1
> WHERE prolang = 13 AND prolang=1 OR prolang = 13 AND prolang = 2 OR
> prolang = 13 AND prolang = 3;
>                                               QUERY PLAN
> -------------------------------------------------------------------------------------------------------
>  Seq Scan on pg_proc p1  (cost=0.00..151.66 rows=1 width=68) (actual
> time=1.167..1.168 rows=0 loops=1)
>    Filter: ((prolang = '13'::oid) AND (prolang = ANY (ARRAY['1'::oid,
> '2'::oid, '3'::oid])))
>    Rows Removed by Filter: 3302
>  Planning Time: 0.146 ms
>  Execution Time: 1.191 ms
> (5 rows)
> *Falls into coredump at me:*
> explain analyze SELECT p1.oid, p1.proname
> FROM pg_proc as p1
> WHERE prolang = 13 OR prolang = 2 AND prolang = 2 OR prolang = 13;
>
I continue to try to move transformations of "OR" expressions at the
optimization stage, unfortunately I have not been able to figure out
coredump yet, but I saw an important thing that it is already necessary
to process RestrictInfo expressions here. I corrected it.

To be honest, despite some significant advantages in the fact that we
are already processing pre-converted "or" expressions (logical
transformations have been performed and duplicates have been removed), I
have big doubts about this approach. We already have quite a lot of
objects at this stage that can refer to the RestrictInfo variable in
ReplOptInfo, and updating these links can be costly for us. By the way,
right now I suspect that the current coredump appeared precisely because
there is a link somewhere that refers to an un-updated RestrictInfo, but
so far I can't find this place. coredump occurs at the request execution
stage, looks like this:

Core was generated by `postgres: alena regression [local]
SELECT                                     '.
--Type <RET> for more, q to quit, c to continue without paging--
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00005565f3ec4947 in ExecInitExprRec (node=0x5565f530b290,
state=0x5565f53383d8, resv=0x5565f53383e0, resnull=0x5565f53383dd) at
execExpr.c:1331
1331                                            Expr       *arg = (Expr
*) lfirst(lc);
(gdb) bt
#0  0x00005565f3ec4947 in ExecInitExprRec (node=0x5565f530b290,
state=0x5565f53383d8, resv=0x5565f53383e0, resnull=0x5565f53383dd) at
execExpr.c:1331
#1  0x00005565f3ec2708 in ExecInitQual (qual=0x5565f531d950,
parent=0x5565f5337948) at execExpr.c:258
#2  0x00005565f3f2f080 in ExecInitSeqScan (node=0x5565f5309700,
estate=0x5565f5337700, eflags=32) at nodeSeqscan.c:172
#3  0x00005565f3ee70c9 in ExecInitNode (node=0x5565f5309700,
estate=0x5565f5337700, eflags=32) at execProcnode.c:210
#4  0x00005565f3edbe3a in InitPlan (queryDesc=0x5565f53372f0, eflags=32)
at execMain.c:968
#5  0x00005565f3edabe3 in standard_ExecutorStart
(queryDesc=0x5565f53372f0, eflags=32) at execMain.c:266
#6  0x00005565f3eda927 in ExecutorStart (queryDesc=0x5565f53372f0,
eflags=0) at execMain.c:145
#7  0x00005565f419921e in PortalStart (portal=0x5565f52ace90,
params=0x0, eflags=0, snapshot=0x0) at pquery.c:517
#8  0x00005565f4192635 in exec_simple_query (
    query_string=0x5565f5233af0 "SELECT p1.oid, p1.proname\nFROM
pg_proc as p1\nWHERE prolang = 13 AND (probin IS NULL OR probin = '' OR
probin = '-');") at postgres.c:1233
#9  0x00005565f41976ef in PostgresMain (dbname=0x5565f526ad10
"regression", username=0x5565f526acf8 "alena") at postgres.c:4652
#10 0x00005565f40b8417 in BackendRun (port=0x5565f525f830) at
postmaster.c:4439
#11 0x00005565f40b7ca3 in BackendStartup (port=0x5565f525f830) at
postmaster.c:4167
#12 0x00005565f40b40f1 in ServerLoop () at postmaster.c:1781
#13 0x00005565f40b399b in PostmasterMain (argc=8, argv=0x5565f522c110)
at postmaster.c:1465
#14 0x00005565f3f6560e in main (argc=8, argv=0x5565f522c110) at main.c:198

I have saved my experimental version of the "or" transfer in the diff
file, I am attaching the main patch in the ".patch" format so that the
tests are checked against this version. Let me remind you that the main
patch contains the code for converting "OR" expressions to "ANY" at the
parsing stage.

Attachment Content-Type Size
experimantal_version.diff text/x-patch 9.6 KB
v7.0-Replace-OR-clause-to-ANY.patch text/x-patch 32.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message James Coleman 2023-09-29 17:39:07 [DOCS] HOT - correct claim about indexes not referencing old line pointers
Previous Message Justin Pryzby 2023-09-29 16:36:42 Re: [HACKERS] pg_upgrade failed with error - ERROR: column "a" in child table must be marked NOT NULL