| From: | Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> | 
|---|---|
| To: | jian he <jian(dot)universality(at)gmail(dot)com> | 
| Cc: | Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org> | 
| Subject: | Re: POC, WIP: OR-clause support for indexes | 
| Date: | 2024-02-13 10:03:20 | 
| Message-ID: | e3338e82-a28d-4631-9eec-b9c0984b32d5@postgrespro.ru | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On 13/2/2024 07:00, jian he wrote:
> + newa = makeNode(ArrayExpr);
> + /* array_collid will be set by parse_collate.c */
> + newa->element_typeid = scalar_type;
> + newa->array_typeid = array_type;
> + newa->multidims = false;
> + newa->elements = aexprs;
> + newa->location = -1;
> 
> I am confused by the comments `array_collid will be set by
> parse_collate.c`, can you further explain it?
I wonder if the second paragraph of comments on commit b310b6e will be 
enough to dive into details.
> if OR expression right arm is not plain Const, but with collation
> specification, eg.
> `where a  = 'a' collate "C" or a = 'b' collate "C";`
> 
> then the rightop is not Const, it will be CollateExpr, it will not be
> used in transformation.
Yes, it is done for simplicity right now. I'm not sure about corner 
cases of merging such expressions.
> 
> set enable_or_transformation to on;
> explain(timing off, analyze, costs off)
> select count(*) from test where (x = 1 or x = 2 or x = 3 or x = 4 or x
> = 5 or x = 6 or x = 7 or x = 8 or x = 9 ) \watch i=0.1 c=10
> 35.376 ms
> 
> The time is the last result of the 10 iterations.
The reason here - parallel workers.
If you see into the plan you will find parallel workers without 
optimization and absence of them in the case of optimization:
Gather  (cost=1000.00..28685.37 rows=87037 width=12)
	(actual rows=90363 loops=1)
    Workers Planned: 2
    Workers Launched: 2
    ->  Parallel Seq Scan on test
          Filter: ((x = 1) OR (x = 2) OR (x = 3) OR (x = 4) OR (x = 5) 
OR (x = 6) OR (x = 7) OR (x = 8) OR (x = 9))
Seq Scan on test  (cost=0.02..20440.02 rows=90600 width=12)
		  (actual rows=90363 loops=1)
    Filter: (x = ANY ('{1,2,3,4,5,6,7,8,9}'::integer[]))
Having 90600 tuples returned we estimate it into 87000 (less precisely) 
without transformation and 90363 (more precisely) with the transformation.
But if you play with parallel_tuple_cost and parallel_setup_cost, you 
will end up having these parallel workers:
  Gather  (cost=0.12..11691.03 rows=90600 width=12)
	 (actual rows=90363 loops=1)
    Workers Planned: 2
    Workers Launched: 2
    ->  Parallel Seq Scan on test
          Filter: (x = ANY ('{1,2,3,4,5,6,7,8,9}'::integer[]))
          Rows Removed by Filter: 303212
And some profit about 25%, on my laptop.
I'm not sure about the origins of such behavior, but it seems to be an 
issue of parallel workers, not this specific optimization.
-- 
regards,
Andrei Lepikhov
Postgres Professional
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrei Lepikhov | 2024-02-13 10:43:21 | Re: POC, WIP: OR-clause support for indexes | 
| Previous Message | shveta malik | 2024-02-13 09:59:45 | Re: Synchronizing slots from primary to standby |