Re: Invalid pointer access in logical decoding after error

From: vignesh C <vignesh21(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Invalid pointer access in logical decoding after error
Date: 2025-07-03 14:54:51
Message-ID: CALDaNm13ggU-HzAsD8RCEd5ZE=nDCC8WGdabHDZrqNzum6rb8g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 2 Jul 2025 at 13:21, Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wed, Jul 2, 2025 at 2:42 PM vignesh C wrote:
>
> >
> > Hi,
> >
> > I encountered an invalid pointer access issue. Below are the steps to
> > reproduce the issue:
> ...
> > The error occurs because entry->columns is allocated in the entry
> > private context (entry->entry_cxt) by pub_collist_to_bitmapset(). This
> > context is a child of the PortalContext, which is cleared after an
> > error via: AbortTransaction
> > -> AtAbort_Portals ->
> > MemoryContextDeleteChildren -> MemoryContextDelete ->
> > MemoryContextDeleteOnly
> > As a result, the memory backing entry->columns is freed, but the
> > RelationSyncCache which resides in CacheMemoryContext and thus
> > survives the error still holds a dangling pointer to this freed
> > memory, causing it to pfree an invalid pointer.
> > In the normal (positive) execution flow, pgoutput_shutdown() is called
> > to clean up the RelationSyncCache. This happens via:
> > FreeDecodingContext -> shutdown_cb_wrapper -> pgoutput_shutdown But
> > this is not called in case of an error case. To handle this case
> > safely, I suggest calling FreeDecodingContext in the PG_CATCH block to
> > ensure pgoutput_shutdown is invoked and the stale cache is cleared appropriately.
> > Attached patch has the changes for the same.
> > Thoughts?
>
> Thank you for reporting the issue and providing a fix.
>
> I recall that we identified this general issue with the hash table in pgoutput
> in other threads as well [1]. The basic consensus [2] is that calling
> FreeDecodingContext() within PG_CATCH is not ideal, as this function includes
> user code, increasing the risk of encountering another error within PG_CATCH.
> This scenario could prevent execution of subsequent code to invalidate syscache
> entries, which is problematic.

Yes, let's avoid this.

> I think a better fix could be to introduce a memory context reset callback(on
> data->cachectx) and perform the actions of pgoutput_shutdown() within it.

The attached v2 version patch has the changes for the same.

Regards,
Vignesh

Attachment Content-Type Size
v2-0001-Fix-referencing-invalid-pointer-in-logical-decodi.patch text/x-patch 2.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-07-03 14:54:59 Re: add tab-complete for ALTER DOMAIN ADD...
Previous Message Dmitry Dolgov 2025-07-03 14:49:34 Re: Adding basic NUMA awareness