Re: Non-reproducible AIO failure

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Non-reproducible AIO failure
Date: 2025-05-25 02:45:52
Message-ID: CA+hUKG+kCOZbsiL7Qc=_1Ahd=JdAkrq0VnStrUvLEnky-H7yUA@mail.gmail.com
Lists: pgsql-hackers

On Sun, May 25, 2025 at 9:00 AM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
> Hello Thomas,
> 24.05.2025 14:42, Thomas Munro wrote:
> > On Sat, May 24, 2025 at 3:17 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> So it seems that "very low-probability issue in our Mac AIO code" is
> >> the most probable description.
> > There isn't any macOS-specific AIO code so my first guess would be
> > that it might be due to aarch64 weak memory reordering (though Andres
> > speculated that it should all be one backend, huh), if it's not just
> > a timing luck thing. Alexander, were the other OSes you tried all on
> > x86?
>
> As I wrote off-list before, I had tried x86_64 only, but since then I
> tried to reproduce the issue on an aarch64 server with Ubuntu 24.04,
> running 10, then 40 instances of t/027_stream_regress.pl in parallel. I've
> also multiplied "test: brin ..." line x10. But the issue is still not
> reproduced (in 8+ hours).

Hmm. And I see now that this really is all in one backend. Could it
be some variation of the interrupt processing stuff from acad9093?

> However, I've managed to get an AIO-related assertion failure on macOS 14.5
...
> TRAP: failed Assert("ioh->op == PGAIO_OP_INVALID"), File: "aio_io.c", Line: 161, PID: 32355

Can you get a core and print *ioh in the debugger?
