Re: IO worker crash in test_aio/002_io_workers

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: IO worker crash in test_aio/002_io_workers
Date: 2025-07-08 21:18:20
Message-ID: CA+hUKGKN87Rn89UgprGcrnwW+Ok6dbbM2b275O9cdtCbAZtvRw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 9, 2025 at 8:45 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>     /* Got one. Clear idle flag. */
>     io_worker_control->idle_worker_mask &= ~(UINT64_C(1) << MyIoWorkerId);
>
>     /* See if we can wake up some peers. */
>     nwakeups = Min(pgaio_worker_submission_queue_depth(),
>                    IO_WORKER_WAKEUP_FANOUT);
>     for (int i = 0; i < nwakeups; ++i)
>     {
>         if ((worker = pgaio_choose_idle_worker()) < 0)
>             break;
>         latches[nlatches++] = io_worker_control->workers[worker].latch;
>     }
>
> can return a worker that's actually not currently running and thus does not
> have a latch set.

Ugh, right, thanks. Annoyingly, I think I had already seen and
understood this while working on the dynamic worker pool sizing
patch[1], which starts and stops workers more often and so of course
had to address that problem. I somehow failed to spot, or maybe just
to remember, that master needs that change too. Will fix.
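
To make the failure mode concrete, here is a minimal standalone sketch of
one possible guard: skip any slot whose idle bit is set but whose latch
pointer is NULL. It is not the code in master or in the patch[1], and not
necessarily the shape of the eventual fix. All Sketch*/sketch_* names are
hypothetical stand-ins, and it assumes that choosing an idle worker also
clears that worker's idle bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_WORKERS 32

/* Hypothetical stand-in for a process latch. */
typedef struct SketchLatch
{
    int         is_set;
} SketchLatch;

typedef struct SketchWorkerSlot
{
    SketchLatch *latch;         /* NULL while no worker owns this slot */
} SketchWorkerSlot;

typedef struct SketchWorkerControl
{
    uint64_t    idle_worker_mask;   /* bit N set => worker N looks idle */
    SketchWorkerSlot workers[SKETCH_MAX_WORKERS];
} SketchWorkerControl;

/*
 * Return the lowest-numbered worker whose idle bit is set and clear that
 * bit, or return -1 if no bit is set.  (Assumed behavior, see above.)
 */
static int
sketch_choose_idle_worker(SketchWorkerControl *ctl)
{
    for (int id = 0; id < SKETCH_MAX_WORKERS; ++id)
    {
        uint64_t    bit = UINT64_C(1) << id;

        if (ctl->idle_worker_mask & bit)
        {
            ctl->idle_worker_mask &= ~bit;
            return id;
        }
    }
    return -1;
}

/*
 * Collect up to max_wakeups latches to set.  A worker that has exited may
 * still have a stale idle bit; its slot has no latch, so skip it instead
 * of storing a NULL pointer in the wakeup list.
 */
static int
sketch_collect_wakeups(SketchWorkerControl *ctl, int max_wakeups,
                       SketchLatch **latches)
{
    int         nlatches = 0;

    assert(max_wakeups >= 0);

    /* Terminates: every iteration clears one idle bit or breaks. */
    while (nlatches < max_wakeups)
    {
        int         worker = sketch_choose_idle_worker(ctl);

        if (worker < 0)
            break;              /* nobody left to wake */
        if (ctl->workers[worker].latch == NULL)
            continue;           /* stale idle bit, worker not running */
        latches[nlatches++] = ctl->workers[worker].latch;
    }
    return nlatches;
}
```

An alternative with the same effect would be for an exiting worker to
clear its own idle bit before it gives up its slot, so that the chooser
never sees a stale bit in the first place.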

> I suspect the reason that this was hit with Tomas' patch is that it adds use
> of streaming reads to index scans, and thus makes it plausible at all to hit
> AIO in the path.

Cool, been meaning to try that out...

[1] https://www.postgresql.org/message-id/flat/CA%2BhUKG%2Bm4xV0LMoH2c%3DoRAdEXuCnh%2BtGBTWa7uFeFMGgTLAw%2BQ%40mail.gmail.com
