| From: | Kirill Reshke <reshkekirill(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | GIN index fast list search may become un-interruptible for long time. |
| Date: | 2026-07-04 17:38:54 |
| Message-ID: | CALdSSPisYC-zHiWJU36AP-zrPN-im-gEZrK-QD8TPn3Vb8xMvw@mail.gmail.com |
| Lists: | pgsql-hackers |
Recently I found a case in one of our production clusters where a
query does not respond to pg_*_backend(pid).
I have a small synthetic reproducer for it:
reshke=# create table tt(i bigint[]);
CREATE TABLE
reshke=# create index on tt using gin(i);
CREATE INDEX
reshke=# create extension pgstattuple ;
CREATE EXTENSION
reshke=# insert into tt select array(select generate_series(i, i + 2)) from generate_series(1,1e4) i;
INSERT 0 10000
reshke=# select * from pgstatginindex('tt_i_idx');
 version | pending_pages | pending_tuples
---------+---------------+----------------
       2 |            74 |          10000
(1 row)
reshke=# select * from tt where i && ARRAY(select generate_series(1e5, 1e6))::bigint[];
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
The backend is effectively stuck in the pending-list tuple scan:
(gdb) bt
#0 0x000055774bdbf393 in collectMatchesForHeapRow
(scan=scan@entry=0x55774d15d6c8, pos=pos@entry=0x7ffd3bcf6b70) at
ginget.c:1778
#1 0x000055774bdc03ba in scanPendingInsert (ntids=<synthetic
pointer>, tbm=0x55774d15de18, scan=0x55774d15d6c8) at ginget.c:1880
#2 gingetbitmap (scan=0x55774d15d6c8, tbm=0x55774d15de18) at ginget.c:1949
#3 0x000055774be0997d in index_getbitmap
(scan=scan@entry=0x55774d15d6c8, bitmap=bitmap@entry=0x55774d15de18)
at indexam.c:731
#4 0x000055774bf95673 in MultiExecBitmapIndexScan
(node=0x55774d153de0) at nodeBitmapIndexscan.c:104
#5 0x000055774bf82f61 in MultiExecProcNode (node=<optimized out>) at
execProcnode.c:524
#6 0x000055774bf94d15 in BitmapHeapNext (node=0x55774d153b40) at
nodeBitmapHeapscan.c:110
#7 0x000055774bf7b2cb in ExecProcNode (node=0x55774d153b40) at
../../../src/include/executor/executor.h:278
#8 ExecutePlan (dest=0x55774d146218, direction=<optimized out>,
numberTuples=0, sendTuples=true, operation=CMD_SELECT,
queryDesc=0x55774d151e50) at execMain.c:1689
#9 standard_ExecutorRun (queryDesc=0x55774d151e50,
direction=<optimized out>, count=0, execute_once=<optimized out>) at
execMain.c:360
#10 0x000055774c15139a in PortalRunSelect
(portal=portal@entry=0x55774d0e5890, forward=forward@entry=true,
count=0, count@entry=9223372036854775807,
dest=dest@entry=0x55774d146218) at pquery.c:922
#11 0x000055774c152b1e in PortalRun
(portal=portal@entry=0x55774d0e5890,
count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true,
dest=dest@entry=0x55774d146218,
altdest=altdest@entry=0x55774d146218, qc=0x7ffd3bcf6f80) at
pquery.c:766
#12 0x000055774c14e8f4 in exec_simple_query
(query_string=0x55774d065950 "select * from tt where i && ARRAY(select
generate_series(1e5, 1e6))::bigint[];") at postgres.c:1283
#13 0x000055774c150368 in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4777
#14 0x000055774c14abe3 in BackendMain (startup_data=<optimized out>,
startup_data_len=<optimized out>) at backend_startup.c:105
#15 0x000055774c0a3f39 in postmaster_child_launch
(child_type=child_type@entry=B_BACKEND,
startup_data=startup_data@entry=0x7ffd3bcf7420 "",
startup_data_len=startup_data_len@entry=4,
client_sock=client_sock@entry=0x7ffd3bcf7440) at launch_backend.c:277
#16 0x000055774c0a7e27 in BackendStartup (client_sock=0x7ffd3bcf7440)
at postmaster.c:3582
#17 ServerLoop () at postmaster.c:1679
#18 0x000055774c0a9b68 in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x55774d05f1a0) at postmaster.c:1377
#19 0x000055774bd97942 in main (argc=3, argv=0x55774d05f1a0) at main.c:199
(gdb) p InterruptHoldoffCount
$3 = 1
The problem is that we hold the fast-list buffer lock while scanning
it, which can take a long time, and a backend cannot be interrupted
(via a Postgres cancel request) while it holds a buffer lock; note
InterruptHoldoffCount = 1 above. This strikes me as not a very good
concurrency design. For now, I am posting v1-0001, where we simply
check whether an interrupt is pending and, if it is, unlock the
fast-list buffer and do CFI(), which should cancel the query at that
point. This is mainly to showcase where we are stuck, and also to
trigger the problems which I fix in 0002.
I understand that my v1 is not committable as is because of possible
performance implications. I also think this
INTERRUPTS_PENDING_CONDITION check is too clumsy. But right now I am
looking for a back-patchable fix, so v1 may be bad for master but
fine for the back branches.
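To make the mechanism above concrete, here is a minimal toy model in C. It is a sketch under stated assumptions, not PostgreSQL code and not the contents of v1-0001: every `toy_*` name is hypothetical, taking the "buffer lock" only bumps a holdoff counter (standing in for InterruptHoldoffCount), and the toy CFI does nothing while that counter is non-zero. A real cancel would raise an error rather than return.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the backend's interrupt state (hypothetical names). */
static int  toy_interrupt_pending = 0;
static int  toy_holdoff_count = 0;
static bool toy_cancelled = false;

/* Assumption: taking the buffer lock holds off interrupts. */
static void toy_lock_buffer(void) { toy_holdoff_count++; }

static void toy_unlock_buffer(void)
{
    assert(toy_holdoff_count > 0);
    toy_holdoff_count--;
}

/* Toy CFI: a pending cancel is only acted on when nothing holds it off. */
static void toy_cfi(void)
{
    if (toy_interrupt_pending && toy_holdoff_count == 0)
    {
        toy_interrupt_pending = 0;
        toy_cancelled = true;
    }
}

/*
 * Scan ntuples pending-list tuples under the lock. A cancel request
 * arrives when the scan reaches tuple cancel_at. Returns how many
 * tuples were processed before the scan stopped.
 */
static int toy_scan_pending_list(int ntuples, int cancel_at,
                                 bool release_on_pending)
{
    int processed = 0;

    toy_interrupt_pending = 0;
    toy_holdoff_count = 0;
    toy_cancelled = false;

    toy_lock_buffer();
    for (int i = 0; i < ntuples; i++)
    {
        if (i == cancel_at)
            toy_interrupt_pending = 1;      /* the user hits ^C */

        if (release_on_pending && toy_interrupt_pending)
        {
            /* v1-0001-style idea: drop the lock, then check. */
            toy_unlock_buffer();
            toy_cfi();
            if (toy_cancelled)
                return processed;
            toy_lock_buffer();
        }
        else
            toy_cfi();                      /* no-op while the lock is held */

        processed++;
    }
    toy_unlock_buffer();
    toy_cfi();                              /* cancel is finally noticed here */
    return processed;
}
```

In this model, toy_scan_pending_list(1000, 10, false) walks all 1000 tuples before the cancel takes effect, while toy_scan_pending_list(1000, 10, true) stops after 10.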
With v1-0001 applied, I discovered that the query in my reproducer
can still be non-cancellable: it still does not respond to cancel
inside the startScanKey function.
The thing is, we are doing CFI() in an O(key->nentries) busy loop if
triConsistent functions always return GIN_FALSE. I can see that the
CFI() was added in 0f21db36d663, so I just moved it before the break;
see v1-0002.
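As a rough illustration of why the position of the check relative to the break matters, here is a small toy model in C. It is not the startScanKey code and not the patch itself: all `Toy*`/`toy_*` names are hypothetical, and the loop shape (consistent check, break on false, CFI) is only an assumption drawn from the description above. It counts how many times the interrupt check is reached in each placement.

```c
#include <stdbool.h>

/* Hypothetical ternary result, standing in for the GIN ternary values. */
typedef enum { TOY_FALSE, TOY_TRUE, TOY_MAYBE } ToyTernary;
typedef ToyTernary (*ToyConsistentFn)(int entry);

static ToyTernary toy_always_false(int entry) { (void) entry; return TOY_FALSE; }
static ToyTernary toy_always_maybe(int entry) { (void) entry; return TOY_MAYBE; }

/*
 * Walk entries 1 .. nentries-1, breaking as soon as the consistent
 * function reports TOY_FALSE. Returns how many times the (toy)
 * interrupt check was reached.
 */
static int toy_count_cfi(int nentries, ToyConsistentFn consistent,
                         bool cfi_before_break)
{
    int cfi_calls = 0;

    for (int i = 1; i < nentries; i++)
    {
        if (cfi_before_break)
            cfi_calls++;            /* check placed before the break */

        if (consistent(i) == TOY_FALSE)
            break;

        if (!cfi_before_break)
            cfi_calls++;            /* check placed after the break */
    }
    return cfi_calls;
}
```

With toy_always_false, the check placed after the break is never reached, while the check placed before it is reached once; with toy_always_maybe both placements reach it on every iteration.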
--
Best regards,
Kirill Reshke
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Make-GIN-scan-more-cancellable.patch | application/octet-stream | 881 bytes |
| v1-0002-Move-CFI-check-before-break-condition.patch | application/octet-stream | 988 bytes |