Quick Links

Re: Potential "AIO / io workers" inter-worker locking issue in PG18?

From:	Marco Boeringa <marco(at)boeringa(dot)demon(dot)nl>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	pgsql-bugs(at)lists(dot)postgresql(dot)org, Thom Brown <thom(at)linux(dot)com>
Subject:	Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
Date:	2025-10-12 08:24:51
Message-ID:	6cf37645-9f1e-4ed6-9397-23ba5de04931@boeringa.demon.nl
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs

Hi Andres,

I have been doing a bit more investigation. As I explained before, the
problematic multi-threaded geoprocessing step is not some stand-alone
query that can be easily reduced to small easily portable reproducible
case with attached data. In fact, this geoprocessing step is part of a
large custom build Python geoprocessing workflow, with total code
probably in the 25k code lines range.

However, based on the apparent poor query plan in PG18 / PostGIS 3.6.0,
I now reviewed the exact code once more. I noticed that just before
entering the multi-threaded code that emits the queries as seen below, I
am actually adding the primary key field 'objectid' as "GENERATED BY
DEFAULT AS IDENTITY" to the 'osm.landcover_scrubs_small_scale_2_ply' table.

Now I also noticed I did not run ANALYZE after that against the same
table. I have now added this to the code. Although it is still
preliminary, first tests seem to indicate that this resolves the issue,
and prevents the stalls or better said apparent hugely inefficient query
plan (remember: a < 10 sec process was turned into multi-hour). I still
need to do more thorough testing to be sure though.

However, this raises a couple of question:

- While ANALYZE is of course hugely important for proper statistics and
query planning, I have wondered if PostgreSQL shouldn't automatically
have updated the statistics for the addition of the primary key with
IDENTITY? It seems to me that based on the definition of the primary key
column and IDENTITY and table size, the actual distribution of values is
essentially already known even before any sampling of ANALYZE to update
statistics?

- Am I right to assume that only the statistics on the objectid field
play any role in this issue? As you can see, the WHERE clause does not
involve any other fields than the two objectid fields of the main table
and the chunk table specifying the job. All other values computed are
just derived straight from the geometry column.

- Were there any hints in the all the other data I supplied as to where
PG18's query planning without the updated statistics of the new ANALYZE
step added, is going wrong? And why this was never an issue in <= PG17?

I also did some preliminary test with the old PG17.6 / PostgGIS 3.6.0
cluster with the same Italy extract data. I still need to do more
thorough testing, both with and without the extra ANALYZE step, to fully
exclude that there isn't something related to the upgrade to PostGIS
3.6.0, but first indications are as I already saw with the PG17.6 /
PostgGIS 3.5.3, that there are no issue with <= PG17 / PostGIS
combination as regards this apparent planner issue.

Marco

Op 8-10-2025 om 21:08 schreef Andres Freund:
> In [1] you showed a query. Reformated that looks like this:
>
> UPDATE osm.landcover_scrubs_small_scale_2_ply AS t1
> SET area_geo = t2.area_geo,
> perim_geo = t2.perim_geo,
> compact_geo = CASE WHEN t2.area_geo > 0 THEN ((power(t2.perim_geo,2) / t2.area_geo) / (4 * pi())) ELSE 0 END,
> npoints_geo = t2.npoints_geo,
> comp_npoints_geo = CASE WHEN t2.npoints_geo > 0 THEN (CASE WHEN t2.area_geo > 0 THEN ((power(t2.perim_geo,2) / t2.area_geo) / (4 * pi())) ELSE 0 END / t2.npoints_geo) ELSE 0 END,
> convex_ratio_geo = CASE WHEN ST_Area(ST_ConvexHull(way)::geography,true) > 0 THEN (t2.area_geo / ST_Area(ST_ConvexHull(way)::geography,true)) ELSE 1 END
> FROM (
> SELECT
> objectid,
> ST_Area(way::geography,true) AS area_geo,
> ST_Perimeter(way::geography,true) AS perim_geo,
> ST_NPoints(way) AS npoints_geo
> FROM osm.landcover_scrubs_small_scale_2_ply) AS t2
> WHERE (t2.objectid = t1.objectid)
> AND t1.objectid IN (SELECT t3.objectid FROM mini_test.osm.osm_tmp_28128_ch5 AS t3)
>
>
> Which certainly fits with two nested loops, although I don't think I can infer
> which order it the joins are in.

In response to

Re: Potential "AIO / io workers" inter-worker locking issue in PG18? at 2025-10-08 19:08:28 from Andres Freund

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	Alexander Lakhin	2025-10-12 18:00:00	Re: Potential deadlock in pgaio_io_wait()
Previous Message	David Rowley	2025-10-12 08:24:27	Re: [BUGS] BUG #11500: PRIMARY KEY index not being used