GIN INdex is not used with && operator for a text array index

From: balasubramanian c r <crbs(dot)siebel(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: GIN INdex is not used with && operator for a text array index
Date: 2023-12-06 14:32:52
Message-ID: CANnzXMPHcGyv1v5J+jcJyP6hUwd2O7ZLA6yjQJxx332Ou+m63w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

HI Team

Sorry for the spam.

We have Postgres DB where the list of addresses are stored and for a given
complete address
trigram of addresses are stored in a column which is a text array.

after looking at the list of operators that are available for gin index I
decided to use array_ops operator.
select amop.amopopr::regoperator, amop.amopstrategy, opc.opcname from
pg_opclass opc, pg_opfamily opf, pg_am am, pg_amop amop where opf.oid =
opc.opcfamily and am.oid = opf.opfmethod and amop.amopfamily =
opc.opcfamily and am.amname = 'gin' and amop.amoplefttype = opc.opcintype;
amopopr | amopstrategy | opcname
-------------------------+--------------+----------------
&&(anyarray,anyarray) | 1 | array_ops
@>(anyarray,anyarray) | 2 | array_ops
<@(anyarray,anyarray) | 3 | array_ops
=(anyarray,anyarray) | 4 | array_ops

CREATE INDEX pentgram_idx ON address18 USING GIN(pentgram array_ops);

when Operator '@>' is used the index is used and the execution time is 60ms.

db=# EXPLAIN (costs, buffers, verbose, analyze) select similarity('jattan
kangra riyali 144 hp 176058', complete_address) from address18 where
pentgram @> show_trgm('jattan kangra riyali 144 hp 176058');

Bitmap Heap Scan on public.address18 (cost=261.25..262.52 rows=1 width=4)
(actual time=58.992..58.994 rows=1 loops=1)
Output: similarity('*****'::text, complete_address)
Recheck Cond: (address18.pentgram @> '{******}'::text[])
Heap Blocks: exact=1
Buffers: shared hit=1483
-> Bitmap Index Scan on pentgram_idx (cost=0.00..261.25 rows=1
width=0) (actual time=58.960..58.960 rows=1 loops=1)
Index Cond: (address18.pentgram @> '{***}'::text[])
Buffers: shared hit=1482
Query Identifier: -126591413296272164
Planning:
Buffers: shared hit=1
Planning Time: 0.679 ms
Execution Time: 60.373 ms
(13 rows)

when Operator '&&' is used the index is used and the execution time is 60ms.
It is performing sequential scan which is not expected.

db=# EXPLAIN (costs, buffers, verbose, analyze) select similarity('jattan
kangra riyali 144 hp 176058', complete_address) from address18 where
pentgram && show_trgm('jattan kangra riyali 144 hp 176058');

Seq Scan on public.address18 (cost=0.00..77215.11 rows=247741 width=4)
(actual time=0.063..1880.467 rows=247741 loops=1)
Output: similarity('****'::text, complete_address)
Filter: (address18.pentgram && '{"****}'::text[])
Buffers: shared hit=3592 read=69907
I/O Timings: shared/local read=267.274
Query Identifier: 2367846469053211383
Planning:
Buffers: shared hit=1
Planning Time: 0.386 ms
Execution Time: 1886.125 ms
(10 rows)

When i disable the sequential scan the execution time is increased
significantly.

db=# EXPLAIN (costs, buffers, verbose, analyze) select similarity('jattan
kangra riyali 144 hp 176058', complete_address) from address18 where
pentgram && show_trgm('jattan kangra riyali 144 hp 176058');

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on public.address18 (cost=2266.24..79481.36 rows=247741
width=4) (actual time=186.402..3285.090 rows=247741 loops=1)
Output: similarity('****'::text, complete_address)
Recheck Cond: (address18.pentgram && '{***********}'::text[])
Heap Blocks: exact=39632
Buffers: shared hit=1361 read=39155 written=6260
I/O Timings: shared/local read=981.127 write=544.491
-> Bitmap Index Scan on pentgram_idx (cost=0.00..2204.31 rows=247741
width=0) (actual time=182.462..182.463 rows=247741 loops=1)
Index Cond: (address18.pentgram && '{*****}'::text[])
Buffers: shared hit=884
Query Identifier: 2367846469053211383
Planning:
Buffers: shared hit=1
Planning Time: 6.707 ms
Execution Time: 3292.339 ms
(14 rows)

Not expecting this behavior currently.

Few parameters in my configuration
postgresql version is 15
OS is RHEL 8.8
effective_cache_size | 131072
huge_pages | off
maintenance_work_mem | 327680
max_parallel_maintenance_workers | 1

max_parallel_workers | 2
max_parallel_workers_per_gather | 1
shared_buffers
| 32768

Thanks
C.R.Bala

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ron Johnson 2023-12-06 14:43:27 Re: Store PDF files in PostgreDB
Previous Message Joe Conway 2023-12-06 13:49:13 Re: Emitting JSON to file using COPY TO