Re: Non-reproducible AIO failure

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Non-reproducible AIO failure
Date: 2025-06-04 20:02:32
Message-ID: d5cheojbap2frawwyi66csyuj5k7eevblsmxywbgfhrqtijpkc@hyvfisha3fyy
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks for working on investigating this.

On 2025-06-03 08:00:01 +0300, Alexander Lakhin wrote:
> 02.06.2025 09:00, Alexander Lakhin wrote:
> > With additional logging (the patch is attached), I can see the following:
> > ...
> > !!!pgaio_io_reclaim [63817]| ioh: 0x1046b5660, ioh->op: 1, ioh->state: 6, ioh->result: 8192, ioh->num_callbacks: 2
> > !!!AsyncReadBuffers [63817] (1)| blocknum: 18, ioh: 0x1046b5660,
> > ioh->op: 0, ioh->state: 1, ioh->result: 0, ioh->num_callbacks: 0
> > 2025-06-02 00:45:11.391 EDT [63817:14] pg_regress/brin LOG:
> > !!!pgaio_io_before_start| ioh: 0x1046b5660, ioh->op: 1, ioh->state: 1,
> > ioh->result: 0, ioh->num_callbacks: 2
> >
> > TRAP: failed Assert("ioh->op == PGAIO_OP_INVALID"), File: "aio_io.c", Line: 164, PID: 63817
>
> I've added logging of the ioh->generation field at the beginning and at
> the end of pgaio_io_reclaim() (the complete patch is attached) and got
> even more interesting output:
> ...
> !!!pgaio_io_reclaim [25175]| ioh: 0x104c3e1a0, ioh->op: 1, ioh->state: 6,
> ioh->result: 8192, ioh->num_callbacks: 2, ioh->generation: 21692
> !!!pgaio_io_reclaim [25175]| ioh: 0x104c3e1a0, ioh->op: 0, ioh->generation: 21693
> !!!AsyncReadBuffers [25175] (1)| blocknum: 40, ioh: 0x104c3e1a0, ioh->op: 0,
> ioh->state: 1, ioh->result: 0, ioh->num_callbacks: 0, ioh->generation: 21693
> !!!pgaio_io_start_readv [25175]| ioh: 0x104c3e1a0, ioh->op: 1, ioh->generation: 21693

IO starts.

> !!!pgaio_io_reclaim [25176]| ioh: 0x104c3e080, ioh->op: 1, ioh->state: 6,
> ioh->result: 8192, ioh->num_callbacks: 2, ioh->generation: 21408
> !!!pgaio_io_reclaim [25176]| ioh: 0x104c3e080, ioh->op: 0, ioh->generation: 21409
> !!!AsyncReadBuffers [25176] (1)| blocknum: 49, ioh: 0x104c3e080, ioh->op: 0,
> ioh->state: 1, ioh->result: 0, ioh->num_callbacks: 0, ioh->generation: 21409
> !!!pgaio_io_start_readv [25176]| ioh: 0x104c3e080, ioh->op: 1, ioh->generation: 21409

(other backend / other IO)

> !!!pgaio_io_reclaim [25175]| ioh: 0x104c3e1a0, ioh->op: 1, ioh->state: 6,
> ioh->result: 8192, ioh->num_callbacks: 2, ioh->generation: 21693
> !!!pgaio_io_reclaim [25175]| ioh: 0x104c3e1a0, ioh->op: 0, ioh->generation: 21694

I completes and is marked as idle, as we'd expect.

> !!!AsyncReadBuffers [25175] (1)| blocknum: 41, ioh: 0x104c3e1a0, ioh->op: 0,
> ioh->state: 1, ioh->result: 0, ioh->num_callbacks: 0, ioh->generation: 21694

And here, before starting a new IO, the op is 0 (i.e. PGAIO_OP_INVALID), as we
would expect.

> 2025-06-03 00:19:09.282 EDT [25175:1] LOG: !!!pgaio_io_before_start| ioh:
> 0x104c3e1a0, ioh->op: 1, ioh->state: 1, ioh->result: 0, ioh->num_callbacks:
> 2, ioh->generation: 21694

But here it somehow turned to 1 (PGAIO_OP_READV), despite there not being any
"normal" reason for that. We know that the IO wasn't actually started as
otherwise pgaio_io_start_readv() would have logged that fact.

Which then immediately leads to an assert failure:

> 2025-06-03 00:19:09.282 EDT [25175:2] STATEMENT:  explain (analyze, format 'json')
>           select count(*) from join_foo
>             left join (select b1.id, b1.t from join_bar b1 join join_bar b2 using (id)) ss
>             on join_foo.id < ss.id + 1 and join_foo.id > ss.id - 1;
>
> TRAP: failed Assert("ioh->op == PGAIO_OP_INVALID"), File: "aio_io.c", Line: 164, PID: 25175
>
> 0   postgres                            0x0000000103563550 ExceptionalCondition + 236
> 1   postgres                            0x000000010326a260 pgaio_io_before_start + 516
> 2   postgres                            0x0000000103269fb4 pgaio_io_start_readv + 36
> 3   postgres                            0x0000000103289144 FileStartReadV + 252
> 4   postgres                            0x00000001032dc424 mdstartreadv + 668
> 5   postgres                            0x00000001032dfa0c smgrstartreadv + 116
> 6   postgres                            0x00000001032723b8 AsyncReadBuffers + 2028
> 7   postgres                            0x0000000103271354 StartReadBuffersImpl + 1196
> 8   postgres                            0x0000000103270e98 StartReadBuffers + 64
> 9   postgres                            0x000000010326e4a0 read_stream_start_pending_read + 1204
> 10  postgres                            0x000000010326db84 read_stream_look_ahead + 812
> 11  postgres                            0x000000010326d71c read_stream_next_buffer + 2356

I don't see how this can happen.

There is no mac specific code in AIO as of now, why aren't we seeing this on
any other platform?

IME the theory that there is something outside of a subsystem randomly
overwriting memory is almost always wrong, but I'm kinda starting to suspect
it here nonetheless.

I wonder if it's worth trying to put a fair bit of "padding" memory
before/after AIO's shared memory region(s). If that were to stop the crashes,
it'd likely be something random writing there. Unfortunately continuing to
crash wouldn't necessarily guarantee it's aio specific.

Hm. We had issues with wrong size calculations of shmem regions before, what
about writing a patch that puts mprotect(PROT_NONE) pages before/after each
ShmemAlloc'd region?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2025-06-04 20:15:49 Re: pg_upgrade: warn about roles with md5 passwords
Previous Message Nathan Bossart 2025-06-04 19:48:14 a couple of small cleanup patches for DSM-related code