| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: BUG #19366: heap-use-after-free in pgaio_io_reclaim() detected with RELCACHE_FORCE_RELEASE |
| Date: | 2026-01-14 16:45:30 |
| Message-ID: | an3xpqvvga47xpazihhdijpsuor4offvt2shctqdfwkwh7liye@k2cqhszxqwva |
| Lists: | pgsql-bugs |
Hi,
On 2025-12-29 06:00:01 +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference: 19366
> Logged by: Alexander Lakhin
> Email address: exclusion(at)gmail(dot)com
> PostgreSQL version: 18.1
> Operating system: Ubuntu 24.04
> Description:
Alexander pinged me about this - thanks, I had missed this thread!
> =================================================================
> ==1414701==ERROR: AddressSanitizer: heap-use-after-free on address
> 0x52d000160a10 at pc 0x6315765530f4 bp 0x7fff3a67b6d0 sp 0x7fff3a67b6c0
> WRITE of size 8 at 0x52d000160a10 thread T0
> #0 0x6315765530f3 in pgaio_io_reclaim
> .../src/backend/storage/aio/aio.c:698
> #1 0x6315765523dd in pgaio_io_process_completion
> [...]
> #5 0x6315765568ad in pgaio_closing_fd
> .../src/backend/storage/aio/aio.c:1279
> #6 0x6315765bf4dc in FileClose .../src/backend/storage/file/fd.c:1975
> #7 0x6315766d8285 in mdclose .../src/backend/storage/smgr/md.c:726
> #8 0x6315766e3264 in smgrrelease .../src/backend/storage/smgr/smgr.c:356
> #9 0x6315766e34af in smgrclose .../src/backend/storage/smgr/smgr.c:376
> #10 0x631576ee2edb in RelationCloseSmgr
> ../../../../src/include/utils/rel.h:597
> #11 0x631576efae6e in RelationInvalidateRelation
> .../src/backend/utils/cache/relcache.c:2527
> #12 0x631576efb3f8 in RelationClearRelation
> .../src/backend/utils/cache/relcache.c:2560
> #13 0x631576ef7582 in RelationCloseCleanup
> .../src/backend/utils/cache/relcache.c:2251
> #14 0x631576f247bf in ResOwnerReleaseRelation
> [...]
> #18 0x63157709ace5 in ResourceOwnerRelease
> .../src/backend/utils/resowner/resowner.c:661
> #19 0x631574fd4ac1 in AbortTransaction
> (.../tmp_install/usr/local/pgsql/bin/postgres+0x3437cf4) (BuildId:
> fb9da6221fd034ea4004b34de480b536444e54b6)
The problem is that for reasons I can't quite fathom, relcache cleanup happens
way earlier in resowner cleanup than I had realized. The resowner cleanup then
can trigger waiting for the IO as part of closing file descriptors, which in
turn will reference memory that was freed below AtAbort_Portals().
Importantly, at that point we haven't yet done this bit from
ResourceOwnerReleaseInternal():
	while (!dlist_is_empty(&owner->aio_handles))
	{
		dlist_node *node = dlist_head_node(&owner->aio_handles);

		pgaio_io_release_resowner(node, !isCommit);
	}
which would have removed the reference to the local memory.
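To make the ordering problem concrete, here is a toy model in C. It is only a sketch: every `toy_*` name and both structs are hypothetical and are not PostgreSQL APIs, and "freeing" just sets a flag so the model itself never touches freed memory. The second ordering is there purely to show why the order matters; it is not a proposed fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for backend-local memory an in-flight IO reports into. */
typedef struct ToyLocalState
{
	bool		freed;		/* set instead of really freeing, so the model stays safe */
	int			result;
} ToyLocalState;

/* Hypothetical stand-in for an AIO handle that points at local memory. */
typedef struct ToyIoHandle
{
	ToyLocalState *report_to;	/* NULL once released from its owner */
	bool		in_flight;
} ToyIoHandle;

/* Models the early memory release (the role AtAbort_Portals() plays above). */
static void
toy_free_local(ToyLocalState *state)
{
	state->freed = true;
}

/* Models releasing the handle from its owner: the pointer is dropped. */
static void
toy_release_from_owner(ToyIoHandle *io)
{
	io->report_to = NULL;
}

/*
 * Models reclaiming the IO while a file descriptor is being closed.
 * Returns true if this would have written to already-freed memory.
 */
static bool
toy_reclaim(ToyIoHandle *io)
{
	bool		stale_write = false;

	assert(io->in_flight);
	io->in_flight = false;

	if (io->report_to != NULL)
	{
		if (io->report_to->freed)
			stale_write = true;	/* real code would write through the pointer */
		else
			io->report_to->result = 1;
	}
	return stale_write;
}

/* Order from the report: memory freed first, IO reclaimed afterwards. */
static bool
toy_abort_reported_order(void)
{
	ToyLocalState state = {false, 0};
	ToyIoHandle io = {&state, true};

	toy_free_local(&state);
	return toy_reclaim(&io);
}

/* Hypothetical order: handle released from its owner before anything is freed. */
static bool
toy_abort_release_first(void)
{
	ToyLocalState state = {false, 0};
	ToyIoHandle io = {&state, true};

	toy_release_from_owner(&io);
	toy_free_local(&state);
	return toy_reclaim(&io);
}
```

In this model `toy_abort_reported_order()` reports a stale write and `toy_abort_release_first()` does not, which is the same shape as the ASan report: the handle still points at local memory when the IO is reclaimed.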
Besides relcache cleanup happening so early, I'm also somewhat surprised that
AtAbort_Portals() happens so early and that it frees memory.
Note that
/*
* Abort processing for portals.
*
* At this point we run the cleanup hook if present, but we can't release the
* portal's memory until the cleanup call.
*/
void
AtAbort_Portals(void)
says that memory won't be released. Unfortunately, while that's kinda true, we
*do* already clean up some of the memory:
/*
* Although we can't delete the portal data structure proper, we can
* release any memory in subsidiary contexts, such as executor state.
* The cleanup hook was the last thing that might have needed data
* there. But leave active portals alone.
*/
if (portal->status != PORTAL_ACTIVE)
	MemoryContextDeleteChildren(portal->portalContext);
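As a rough illustration of what "delete the children, keep the parent" means, here is a toy context tree in C. This is a sketch under my own assumptions, not PostgreSQL's memory context implementation: `ToyContext` and all `toy_context_*` functions are hypothetical names. The point is that the parent survives, but anything that lived in a child context is gone, so a pointer into a child (for example one held by an in-flight IO) is dangling afterwards.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, minimal context node: just tree links, no allocations. */
typedef struct ToyContext
{
	struct ToyContext *first_child;
	struct ToyContext *next_sibling;
} ToyContext;

static void toy_context_delete(ToyContext *ctx);

/* Creates a context, linking it under parent if one is given. */
static ToyContext *
toy_context_create(ToyContext *parent)
{
	ToyContext *ctx = calloc(1, sizeof(ToyContext));

	assert(ctx != NULL);
	if (parent != NULL)
	{
		ctx->next_sibling = parent->first_child;
		parent->first_child = ctx;
	}
	return ctx;
}

/* Frees every descendant of ctx, but leaves ctx itself valid. */
static void
toy_context_delete_children(ToyContext *ctx)
{
	while (ctx->first_child != NULL)
	{
		ToyContext *child = ctx->first_child;

		ctx->first_child = child->next_sibling;
		toy_context_delete(child);
	}
}

/* Frees ctx and everything below it. */
static void
toy_context_delete(ToyContext *ctx)
{
	toy_context_delete_children(ctx);
	free(ctx);
}

/* Counts the direct children of ctx. */
static int
toy_context_count_children(const ToyContext *ctx)
{
	int			n = 0;

	for (const ToyContext *c = ctx->first_child; c != NULL; c = c->next_sibling)
		n++;
	return n;
}
```

Calling `toy_context_delete_children()` on a "portal" context with an "executor state" child leaves the portal context usable with zero children, which mirrors the behaviour the quoted comment describes.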
Not yet quite sure how to best fix this.
Greetings,
Andres Freund