From: | vignesh C <vignesh21(at)gmail(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Invalid pointer access in logical decoding after error |
Date: | 2025-07-03 14:54:51 |
Message-ID: | CALDaNm13ggU-HzAsD8RCEd5ZE=nDCC8WGdabHDZrqNzum6rb8g@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, 2 Jul 2025 at 13:21, Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wed, Jul 2, 2025 at 2:42 PM vignesh C wrote:
>
> >
> > Hi,
> >
> > I encountered an invalid pointer access issue. Below are the steps to
> > reproduce the issue:
> ...
> > The error occurs because entry->columns is allocated in the entry
> > private context (entry->entry_cxt) by pub_collist_to_bitmapset(). This
> > context is a child of the PortalContext, which is cleared after an
> > error via: AbortTransaction
> > -> AtAbort_Portals ->
> > MemoryContextDeleteChildren -> MemoryContextDelete ->
> > MemoryContextDeleteOnly
> > As a result, the memory backing entry->columns is freed, but the
> > RelationSyncCache which resides in CacheMemoryContext and thus
> > survives the error still holds a dangling pointer to this freed
> > memory, causing it to pfree an invalid pointer.
> > In the normal (positive) execution flow, pgoutput_shutdown() is called
> > to clean up the RelationSyncCache. This happens via:
> > FreeDecodingContext -> shutdown_cb_wrapper -> pgoutput_shutdown But
> > this is not called in case of an error case. To handle this case
> > safely, I suggest calling FreeDecodingContext in the PG_CATCH block to
> > ensure pgoutput_shutdown is invoked and the stale cache is cleared appropriately.
> > Attached patch has the changes for the same.
> > Thoughts?
>
> Thank you for reporting the issue and providing a fix.
>
> I recall that we identified this general issue with the hash table in pgoutput
> in other threads as well [1]. The basic consensus [2] is that calling
> FreeDecodingContext() within PG_CATCH is not ideal, as this function includes
> user code, increasing the risk of encountering another error within PG_CATCH.
> This scenario could prevent execution of subsequent code to invalidate syscache
> entries, which is problematic.
Yes, let's avoid this.
> I think a better fix could be to introduce a memory context reset callback(on
> data->cachectx) and perform the actions of pgoutput_shutdown() within it.
The attached v2 version patch has the changes for the same.
Regards,
Vignesh
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Fix-referencing-invalid-pointer-in-logical-decodi.patch | text/x-patch | 2.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Álvaro Herrera | 2025-07-03 14:54:59 | Re: add tab-complete for ALTER DOMAIN ADD... |
Previous Message | Dmitry Dolgov | 2025-07-03 14:49:34 | Re: Adding basic NUMA awareness |