From: | Marco Boeringa <marco(at)boeringa(dot)demon(dot)nl> |
---|---|
To: | Thom Brown <thom(at)linux(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? |
Date: | 2025-10-06 09:55:54 |
Message-ID: | 1342ae2a-6261-49ac-a106-cf0a4a2d173a@boeringa.demon.nl |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Hi Thom,
As an extension to what I already wrote: as the processing gets stuck
during UPDATEs, I realized the pg_aios view is likely not involved, as
the current AIO implementation of PG18 only affects the read operations
like sequential and bitmap heap scans.
So not seeing anything listed in the pg_aios view might be normal? That
said, I have attempted to view and refresh the view during other stages
of the processing, with pgAdmin showing apparently read operations, but
still no records displayed in pg_aios. Maybe I am hitting the "refresh"
button on the wrong time though...
But maybe the whole new AIO thing isn't involved in these, and it is
another issue in PG18. Just to summarize my observations once again:
- Multi-threaded processing implemented in Python using pyodbc and
concurrent.futures apparently getting stuck waiting for PostgreSQL to
return. The processing step involved should return in ***less than 10
seconds*** for the small Italy extract, but can take >1h (up to >6) when
it gets randomly stuck (some runs successful without delay, others not).
- pgAdmin showing all sessions associated with the threads as 'Active'
with no wait events nor blocking PIDs during the whole time the
processing appears stuck in PostgreSQL.
- No other sessions like VACUUM visible in pgAdmin during the time the
processing appears stuck except the main 'postgres' user session.
- All locks as shown in pg_locks are granted, and most if not all are
fastpath, with only AccessShareLock and RowExclusiveLock on the table
and its indexes involved. A couple of ExclusiveLock on virtualxid and
transactionid.
- 'Top' in Ubuntu showing multiple backend 'postgres' processes
continuously at high core usage, one for each thread (each Python thread
of course uses its own connection).
- pg_aios view empty, but the processing is UPDATEs, so probably no
surprise.
- The processing *DOES* eventually continue after this particular
anomaly, with no further consequences and expected results at the end of
the total processing flow, so it is not a true dead-lock.
- I have noticed it gets stuck when processing OpenStreetMap scrub or
grassland of the Italy extract of Geofabrik. However, as written above,
some processing runs are fine on the same data, while others get stuck
and delayed. The issue may or may not involve PostGIS though considering
this and the fact that the processing step getting stuck involves
PostGIS functions.
- In pgAdmin, the SQL statements as generated by my geoprocessing
workflow and as being processed by PostgreSQL when the processing is
stuck, look like this:
UPDATE osm.landcover_scrubs_small_scale_2_ply AS t1 SET area_geo =
t2.area_geo, perim_geo = t2.perim_geo, compact_geo = CASE WHEN
t2.area_geo > 0 THEN ((power(t2.perim_geo,2) / t2.area_geo) / (4 *
pi())) ELSE 0 END, npoints_geo = t2.npoints_geo, comp_npoints_geo = CASE
WHEN t2.npoints_geo > 0 THEN (CASE WHEN t2.area_geo > 0 THEN
((power(t2.perim_geo,2) / t2.area_geo) / (4 * pi())) ELSE 0 END /
t2.npoints_geo) ELSE 0 END, convex_ratio_geo = CASE WHEN
ST_Area(ST_ConvexHull(way)::geography,true) > 0 THEN (t2.area_geo /
ST_Area(ST_ConvexHull(way)::geography,true)) ELSE 1 END FROM (SELECT
objectid,ST_Area(way::geography,true) AS
area_geo,ST_Perimeter(way::geography,true) AS perim_geo,ST_NPoints(way)
AS npoints_geo FROM osm.landcover_scrubs_small_scale_2_ply) AS t2 WHERE
(t2.objectid = t1.objectid) AND t1.objectid IN (SELECT t3.objectid FROM
mini_test.osm.osm_tmp_28128_ch5 AS t3)
- All of this worked fine in PG <= 17.
Marco
>
> So, to confirm, you get the issue with as little as 3 io_workers?
>
> Also, what is pg_aios telling you during this time?
>
> Thom
>
From | Date | Subject | |
---|---|---|---|
Next Message | Marko Tiikkaja | 2025-10-06 10:38:55 | Re: [BUGS] BUG #11500: PRIMARY KEY index not being used |
Previous Message | Dilip Kumar | 2025-10-06 06:49:41 | Re: BUG #19074: pg_dump from v18 loses the NOT NULL flag in the inherited table field when dumping v17-databases |