From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Non-reproducible AIO failure |
Date: | 2025-06-02 06:00:01 |
Message-ID: | 94f9bf81-e8ff-434a-aa9f-ec6591f527d2@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
31.05.2025 06:00, Alexander Lakhin wrote:
> Hello Thomas,
>
> It looks like I managed to restore all the conditions needed to reproduce
> that Assert more or less reliably (within a couple of hours), so I can
> continue experiments.
>
> I've added the following debugging:
> ...
With additional logging (the patch is attached), I can see the following:
!!!pgaio_io_reclaim [63817]| ioh: 0x1046b5660, ioh->op: 1, ioh->state: 6, ioh->result: 8192, ioh->num_callbacks: 2
!!!AsyncReadBuffers [63817] (1)| blocknum: 15, ioh: 0x1046b5660, ioh->op: 0, ioh->state: 1, ioh->result: 0,
ioh->num_callbacks: 0
!!!pgaio_io_start_readv [63817]| ioh: 0x1046b5660, ioh->op: 1
!!!pgaio_io_reclaim [63817]| ioh: 0x1046b5660, ioh->op: 1, ioh->state: 6, ioh->result: 8192, ioh->num_callbacks: 2
!!!AsyncReadBuffers [63817] (1)| blocknum: 16, ioh: 0x1046b5660, ioh->op: 0, ioh->state: 1, ioh->result: 0,
ioh->num_callbacks: 0
!!!pgaio_io_start_readv [63817]| ioh: 0x1046b5660, ioh->op: 1
!!!pgaio_io_reclaim [63817]| ioh: 0x1046b5660, ioh->op: 1, ioh->state: 6, ioh->result: 8192, ioh->num_callbacks: 2
!!!AsyncReadBuffers [63817] (1)| blocknum: 17, ioh: 0x1046b5660, ioh->op: 0, ioh->state: 1, ioh->result: 0,
ioh->num_callbacks: 0
!!!pgaio_io_start_readv [63817]| ioh: 0x1046b5660, ioh->op: 1
!!!pgaio_io_reclaim [63817]| ioh: 0x1046b5660, ioh->op: 1, ioh->state: 6, ioh->result: 8192, ioh->num_callbacks: 2
!!!AsyncReadBuffers [63817] (1)| blocknum: 18, ioh: 0x1046b5660, ioh->op: 0, ioh->state: 1, ioh->result: 0,
ioh->num_callbacks: 0
2025-06-02 00:45:11.391 EDT [63817:14] pg_regress/brin LOG: !!!pgaio_io_before_start| ioh: 0x1046b5660, ioh->op: 1,
ioh->state: 1, ioh->result: 0, ioh->num_callbacks: 2
TRAP: failed Assert("ioh->op == PGAIO_OP_INVALID"), File: "aio_io.c", Line: 164, PID: 63817
0 postgres 0x0000000102fdb60c ExceptionalCondition + 236
1 postgres 0x0000000102ce233c pgaio_io_before_start + 504
2 postgres 0x0000000102ce20a8 pgaio_io_start_readv + 36
3 postgres 0x0000000102d01200 FileStartReadV + 252
4 postgres 0x0000000102d544e0 mdstartreadv + 668
5 postgres 0x0000000102d57ac8 smgrstartreadv + 116
6 postgres 0x0000000102cea474 AsyncReadBuffers + 1996
7 postgres 0x0000000102ce9430 StartReadBuffersImpl + 1196
8 postgres 0x0000000102ce8f74 StartReadBuffers + 64
9 postgres 0x0000000102ce657c read_stream_start_pending_read + 1204
10 postgres 0x0000000102ce5c60 read_stream_look_ahead + 812
11 postgres 0x0000000102ce57f8 read_stream_next_buffer + 2356
12 postgres 0x00000001026da1b8 heap_fetch_next_buffer + 284
13 postgres 0x00000001026caf54 heapgettup_pagemode + 192
14 postgres 0x00000001026cb524 heap_getnextslot + 84
15 postgres 0x0000000102a4d984 table_scan_getnextslot + 340
16 postgres 0x0000000102a4d790 SeqNext + 148
17 postgres 0x0000000102a4de64 ExecScanFetch + 772
18 postgres 0x0000000102a4da38 ExecScanExtended + 164
19 postgres 0x0000000102a4d374 ExecSeqScanWithQualProject + 248
20 postgres 0x0000000102a13248 ExecProcNode + 68
21 postgres 0x0000000102a12e70 fetch_input_tuple + 180
22 postgres 0x0000000102a12bb4 agg_retrieve_direct + 1664
23 postgres 0x0000000102a0f914 ExecAgg + 240
...
2025-06-02 00:45:11.409 EDT [57192:4] LOG: client backend (PID 63817) was terminated by signal 6: Abort trap: 6
2025-06-02 00:45:11.409 EDT [57192:5] DETAIL: Failed process was running: DO $x$
DECLARE
r record;
r2 record;
cond text;
idx_ctids tid[];
ss_ctids tid[];
count int;
plan_ok bool;
plan_line text;
BEGIN
...
Best regards,
Alexander Lakhin
Neon (https://neon.tech)
Attachment | Content-Type | Size |
---|---|---|
aio-debugging-0602.patch | text/x-patch | 3.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | John Naylor | 2025-06-02 06:01:30 | Re: Batch TIDs lookup in ambulkdelete |
Previous Message | Peter Eisentraut | 2025-06-02 05:42:07 | Re: Add “FOR UPDATE NOWAIT” lock details to the log. |