Re: Potential "AIO / io workers" inter-worker locking issue in PG18?

From: Marco Boeringa <marco(at)boeringa(dot)demon(dot)nl>
To: Markus KARG <markus(at)headcrashing(dot)eu>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
Date: 2025-10-06 10:40:01
Message-ID: 934fc185-9583-4f03-902e-fa9f221fbea4@boeringa.demon.nl
Lists: pgsql-bugs

Hi Markus,

On my Ubuntu virtual machine, PostgreSQL cannot be started with
io_uring. After setting "io_method = io_uring", restarting the cluster
fails: it will not start, and I have attempted this multiple times.
Only 'sync' and 'worker' allow a restart after modifying the PostgreSQL
configuration file.

As I understand it, the PostgreSQL binary needs to be compiled with
io_uring support, and maybe my build on Ubuntu 24.04, which runs as a
Windows Hyper-V virtual machine, doesn't have it. However, when
installing PG18 from Synaptic, I did notice that it pulled in an
additional 'liburing' package, or something named like that, if I
remember correctly...
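
A quick way to check the build might be to look at the configure
options reported by pg_config. The sketch below assumes the packages
were built with the autoconf configure script, in which case io_uring
support shows up as '--with-liburing'; a Meson-based build may report
its options differently. The helper names are made up for illustration.

```python
import shlex
import subprocess


def built_with_liburing(configure_output: str) -> bool:
    """Return True if the configure arguments include --with-liburing."""
    # pg_config prints each configure argument in single quotes, so
    # shlex.split() is used to recover the individual arguments.
    return "--with-liburing" in shlex.split(configure_output)


def read_configure_args(pg_config: str = "pg_config") -> str:
    """Run `pg_config --configure` and return its output as text."""
    result = subprocess.run(
        [pg_config, "--configure"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

On the VM this would be called as
built_with_liburing(read_configure_args()), passing the full path to
the PG18 pg_config if it is not on the PATH.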

As to your question about a scheduling conflict with Python: this is
not the case. Python runs on the Windows host, not under Ubuntu inside
the VM. On Ubuntu I only have PostgreSQL installed, which I use with
osm2pgsql there. I access the PostgreSQL instance via pyodbc or psycopg2
from the Windows host, so it behaves like a remote database server that
just happens to run on local hardware.
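
With that setup, the io_method actually in effect after a restart can
be confirmed from the Windows side. This is only a minimal sketch using
psycopg2; the function names are hypothetical, and the connection
string is whatever is normally used to reach the VM.

```python
def current_io_method(cursor) -> str:
    """Return the io_method setting reported by the connected server."""
    cursor.execute("SHOW io_method")
    return cursor.fetchone()[0]


def check_remote_io_method(dsn: str) -> str:
    """Connect with psycopg2 and report the server's io_method."""
    # Imported here so current_io_method() has no hard dependency.
    import psycopg2

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            return current_io_method(cur)
    finally:
        conn.close()
```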

Marco

> I am not a PostgreSQL contributor and have no clue what the actual
> technical details are in the new AIO code, but reading your report the
> following questions came to my mind:
>
> * Does the failure also happen with io_mode=io_uring? If no, it is a
> proof that it is really bound to io_mode=worker, not to AIO in general.
>
> * Does the failure also happen with io_mode=worker when your Python code
> uses only 22 cores, and PostgreSQL uses only 22 workers (so Python and
> PostgreSQL do not share CPU cores)? If no, it might indicate that the
> problem could be solved by increasing the execution policy in favor of
> PostgreSQL to give a hint to the scheduler that a CPU core should be
> given to PostgreSQL FIRST as Python most likely is waiting on it to
> continue, but PostgreSQL could not continue because the schedule gave
> all the cores to Python... (classical deadlock; eventually resolves once
> enough CPU cores are free to eventually finish the starving thread).
>
> HTH
>
> -Markus
