From: Marco Boeringa <marco(at)boeringa(dot)demon(dot)nl>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, Thom Brown <thom(at)linux(dot)com>
Subject: Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
Date: 2025-10-09 07:40:03
Message-ID: bc0cb8d6-c9ed-465b-9480-71948ac01625@boeringa.demon.nl
Lists: pgsql-bugs
Last attempt to get the mail out in a proper format. I hate it when
software gets "smart" with formatting... ;-(
I added one minor detail to Andres's last question at the very bottom,
see below.
> Even just knowing whether the "normal query plan" is the same one as we see in
> profiles of "stuck" backends is valuable. Even if the query plan is perfectly
> normal, it *still* is very important to know in which order the joins are
> evaluated etc. But there also might be changes in the query plan between 17
> and 18 that trigger the issue...
>
> Without more details about what is expected to be run and what is actually
> happening, it's just about impossible for us to debug this without a
> reproducer that we can run and debug ourselves.
I have now come across the auto_explain option in the PostgreSQL
documentation. It may sound stupid, but I hadn't been aware of this
option. If I understand it correctly, it could capture an EXPLAIN plan
during the actual execution of my tool. That would be far more valuable
- being the "real" thing - than some contrived reproduction case. I
will need to investigate this a bit more:
https://www.postgresql.org/docs/current/auto-explain.html
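For reference, a minimal session-level setup for auto_explain might look
like the following (the thresholds are illustrative and would need tuning;
auto_explain can also be preloaded for all sessions via
session_preload_libraries in postgresql.conf):

```sql
-- Load the module for this session only (LOAD normally requires superuser).
LOAD 'auto_explain';

-- Log the plan of any statement that runs longer than 5 seconds.
SET auto_explain.log_min_duration = '5s';

-- Include actual run times and row counts (EXPLAIN ANALYZE-style output).
SET auto_explain.log_analyze = true;

-- Also log statements executed inside functions.
SET auto_explain.log_nested_statements = true;
```

The plans then show up in the server log rather than being returned to
the client.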
>>> Making vacuum more aggressive won't really help much if you have
>>> long-running queries/sessions, since vacuum can't clean up row versions that
>>> are still visible to some of the transactions.
>>
>> My code batches the updates in sets of 2000 records at a time and then
>> COMMITs, so the transactions themselves are limited in time and size, which
>> should allow vacuum to do its job.
> Are the "stuck" backends stuck within one 2000 record batch, or are they
> "just" slower processing each batch?
I can't tell. But to explain: each thread has its own set of jobs
assigned, and each job is batched in sets of 2000 records until COMMIT.
So if one job has 100k records to process, that job's Python thread
should issue 50 commits. I take care to avoid processing records in a
totally random order, which could cause conflicts and locking issues
between threads attempting to access the same locked database page,
significantly slowing down the processing. Records are assigned by
database page (and depending on some other parameters), which has worked
really well so far.
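The per-job batching described above can be sketched in Python roughly as
follows (illustrative only; the real tool runs one such loop per thread
against its own job set, and `update_batch`/`commit` here are placeholder
callables standing in for the actual UPDATE statement and database COMMIT):

```python
def batches(records, size=2000):
    """Yield successive fixed-size batches of a job's records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]

def process_job(records, update_batch, commit, size=2000):
    """Update one job's records batch by batch, committing after each
    batch so no transaction stays open longer than one batch takes."""
    commits = 0
    for batch in batches(records, size):
        update_batch(batch)  # e.g. one multi-row UPDATE per batch
        commit()             # keep transactions short so vacuum can work
        commits += 1
    return commits
```

A job of 100k records therefore produces 50 commits of 2000 records each.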
Note that this is just a simplified version of the different processing
modes I developed for different challenges and geoprocessing steps.
>> 26.48% postgres postgres [.] LWLockAttemptLock
>> |
>> ---LWLockAttemptLock
>> |
>> |--23.15%--heapam_index_fetch_tuple.lto_priv.0
>> | index_fetch_heap
>> | IndexNext
>> | ExecScan
>> | ExecNestLoop
>> | ExecNestLoop
>> | ExecModifyTable
>> | standard_ExecutorRun
>> | ProcessQuery
> So the query plan we have is a nested loop between at least three tables
> (there are two joins, c.f. the two ExecNestLoop calls), where there presumably
> are a lot of row [versions] on the inner side of the innermost join.
>
> In [1] you showed a query. Reformatted, it looks like this:
>
> UPDATE osm.landcover_scrubs_small_scale_2_ply AS t1
> SET area_geo = t2.area_geo,
> perim_geo = t2.perim_geo,
> compact_geo = CASE WHEN t2.area_geo > 0 THEN ((power(t2.perim_geo,2) / t2.area_geo) / (4 * pi())) ELSE 0 END,
> npoints_geo = t2.npoints_geo,
> comp_npoints_geo = CASE WHEN t2.npoints_geo > 0 THEN (CASE WHEN t2.area_geo > 0 THEN ((power(t2.perim_geo,2) / t2.area_geo) / (4 * pi())) ELSE 0 END / t2.npoints_geo) ELSE 0 END,
> convex_ratio_geo = CASE WHEN ST_Area(ST_ConvexHull(way)::geography,true) > 0 THEN (t2.area_geo / ST_Area(ST_ConvexHull(way)::geography,true)) ELSE 1 END
> FROM (
> SELECT
> objectid,
> ST_Area(way::geography,true) AS area_geo,
> ST_Perimeter(way::geography,true) AS perim_geo,
> ST_NPoints(way) AS npoints_geo
> FROM osm.landcover_scrubs_small_scale_2_ply) AS t2
> WHERE (t2.objectid = t1.objectid)
> AND t1.objectid IN (SELECT t3.objectid FROM mini_test.osm.osm_tmp_28128_ch5 AS t3)
>
>
> Which certainly fits with two nested loops, although I don't think I can infer
> which order the joins are in.
>
>
> Is osm.landcover_scrubs_small_scale_2_ply.object_id unique?
Yes.
> Can there be multiple rows for one object_id in
> mini_test.osm.osm_tmp_28128_ch5?
No. This table contains the records to process, which are unique; it
constitutes the job.
It is a one-to-one join.
> Are there indexes on mini_test.osm.osm_tmp_28128_ch5.unique_id and
> osm.landcover_scrubs_small_scale_2_ply?
Yes, the unique id / objectid fields are indexed to allow an efficient
join.
Actually, the "*_ch<number>" database object that represents the records
to process for one job references a database view. Each thread gets its
own view. All views reference the same secondary table, which has an
index on the objectid.
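To illustrate that arrangement, a sketch of the DDL might look like this
(the secondary table name "osm_tmp_28128" and the "chunk" column are
hypothetical; only the view name and the objectid column appear in the
thread):

```sql
-- Shared secondary table holding all records to process, indexed on objectid.
CREATE INDEX osm_tmp_28128_objectid_idx
    ON mini_test.osm.osm_tmp_28128 (objectid);

-- Per-thread view selecting just that thread's chunk of the job records.
CREATE VIEW mini_test.osm.osm_tmp_28128_ch5 AS
    SELECT objectid
    FROM mini_test.osm.osm_tmp_28128
    WHERE chunk = 5;
```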