RE: memory leak in logical WAL sender with pgoutput's cachectx

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, 'Xuneng Zhou' <xunengzhou(at)gmail(dot)com>
Cc: 赵宇鹏(宇彭) <zhaoyupeng(dot)zyp(at)alibaba-inc(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: memory leak in logical WAL sender with pgoutput's cachectx
Date: 2025-08-15 02:10:21
Message-ID: OS0PR01MB5716FB74667B3AA9B4A68FD19434A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Thursday, August 14, 2025 8:49 PM Hayato Kuroda (Fujitsu) <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Xuneng,
>
> > Is it safe to free the substructure from within rel_sync_cache_relation_cb()?
>
> You're referring to the comment in rel_sync_cache_relation_cb(), right? My
> understanding of that comment is that we must not access any *system caches*.
> Here we do not re-build caches, so we do not touch the syscaches - it is
> permitted.

We should also avoid freeing the cache within the callback function, as
entries might still be accessed after the callback is invoked. This callback can
be triggered during the execution of pgoutput_change, which poses a risk of
concurrent access issues. For reference, see commit 7f481b8, which addresses a
similar issue. Ideally, cache release should occur within the normal code
execution path, such as within pgoutput_change() or get_rel_sync_entry().
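To illustrate the deferral described above, here is a minimal, self-contained sketch. The structure and function names mirror pgoutput's for readability, but this is a simplified stand-in, not the actual PostgreSQL code: the callback only marks the entry stale, and the release happens later on the normal execution path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a RelationSyncCache entry; the real
 * PostgreSQL structures and signatures differ. */
typedef struct RelSyncEntry
{
    unsigned int relid;
    bool         replicate_valid;  /* false => substructure is stale */
    char        *column_list;      /* stands in for per-entry allocations */
} RelSyncEntry;

/* Invalidation callback: only mark the entry stale.  Freeing here would be
 * unsafe because pgoutput_change() may still hold a pointer to the entry
 * when the callback fires. */
static void
rel_sync_cache_relation_cb(RelSyncEntry *entry)
{
    entry->replicate_valid = false;
}

/* Normal code path (analogous to get_rel_sync_entry()): no caller can be
 * mid-use of the stale substructure here, so release and rebuild it. */
static void
get_rel_sync_entry(RelSyncEntry *entry)
{
    if (!entry->replicate_valid)
    {
        free(entry->column_list);           /* safe on this path */
        entry->column_list = calloc(16, 1); /* rebuild placeholder */
        entry->replicate_valid = true;
    }
}
```

The key property is that the callback never touches memory another code path may be reading; it only flips a flag that the owning path honours at the next safe point.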

>
> > I'm also interested in the reasoning behind setting
> > NINVALIDATION_THRESHOLD to 100.
>
> This is a debatable point of the implementation. I set it to 100 because that
> was sufficient for the provided workload, but it may cause replication lag.
> We may need to design a benchmark workload, measure the data, and determine
> an appropriate value. 100 is just a starting point.

I think it's worthwhile to test the performance impact of this approach. If a
table is altered but not dropped, removing its entry introduces extra
overhead; that overhead may well be negligible, but it warrants measurement.
Additionally, if the number of invalidations is large, walking the cache to
clean up stale entries could take considerable time and degrade performance.
In such cases it might be better to destroy the hash table and create a new
one. Analyzing these scenarios is essential.
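The trade-off between per-entry cleanup and a wholesale rebuild could be sketched as follows. This is purely illustrative: CACHE_SIZE, the threshold value, and all names are assumptions for the example, not the actual pgoutput implementation (the proposed patch's threshold is 100).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define CACHE_SIZE 8
#define NINVALIDATION_THRESHOLD 4   /* kept small for the demo */

typedef struct
{
    bool         in_use;
    bool         valid;          /* false => invalidated, awaiting cleanup */
    unsigned int relid;
} CacheEntry;

typedef struct
{
    CacheEntry entries[CACHE_SIZE];
    int        ninvalidations;   /* stale entries since the last sweep */
} RelSyncCache;

/* Called from the normal execution path, never from the callback. */
static void
maybe_cleanup_cache(RelSyncCache *cache)
{
    if (cache->ninvalidations < NINVALIDATION_THRESHOLD)
        return;                  /* too few stale entries to bother */

    if (cache->ninvalidations >= CACHE_SIZE)
    {
        /* Almost everything is stale: cheaper to drop the whole table,
         * the analogue of hash_destroy() plus re-creating it. */
        memset(cache->entries, 0, sizeof(cache->entries));
    }
    else
    {
        /* Otherwise sweep only the invalidated entries. */
        for (int i = 0; i < CACHE_SIZE; i++)
            if (cache->entries[i].in_use && !cache->entries[i].valid)
                cache->entries[i].in_use = false;
    }
    cache->ninvalidations = 0;
}
```

The benchmark question then becomes where the crossover lies: below it, lazy per-entry cleanup wins; above it, the fixed cost of rebuilding the table beats a long sweep.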

Best Regards,
Hou zj
