From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Xuneng Zhou <xunengzhou(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, 赵宇鹏(宇彭) <zhaoyupeng(dot)zyp(at)alibaba-inc(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: memory leak in logical WAL sender with pgoutput's cachectx
Date: 2025-08-15 08:09:50
Message-ID: CAD21AoCWkk7p+mceg+_94q2mP1RtsfjeHtVYbaoc-QPRfvqe1w@mail.gmail.com
Lists: pgsql-hackers
On Thu, Aug 14, 2025 at 10:26 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Friday, August 15, 2025 10:59 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> > Thanks for your clarification!
> >
> > On Fri, Aug 15, 2025 at 10:10 AM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
> > wrote:
> > >
> > > On Thursday, August 14, 2025 8:49 PM Hayato Kuroda (Fujitsu)
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > > >
> > > > Dear Xuneng,
> > > >
> > > > > Is it safe to free the substructure from within
> > rel_sync_cache_relation_cb()?
> > > >
> > > > You are referring to the comment in rel_sync_cache_relation_cb(),
> > > > right? My understanding of that comment is that we must not access
> > > > any *system caches* from within the callback. Here we do not rebuild
> > > > caches, so we do not touch the syscaches - that is permitted.
> > >
> > > We should also avoid freeing the cache within the callback function,
> > > as entries might still be accessed after the callback is invoked. This
> > > callback can be triggered during the execution of pgoutput_change,
> > > which poses a risk of concurrent access issues. For reference, see
> > > commit 7f481b8, which addresses a similar issue. Ideally, cache
> > > release should occur within the normal code execution path, such as within
> > pgoutput_change() or get_rel_sync_entry().
> >
> > I am still getting familiar with this module, so I cannot confirm this
> > with confidence yet. I'll take a look at the commit mentioned by Zhijie
> > and trace the callback paths.
> >
> > > >
> > > > > I'm also interested in the reasoning behind setting
> > > > > NINVALIDATION_THRESHOLD to 100.
> > > >
> > > > This is a debatable point of this implementation. I set it to 100
> > > > because that was sufficient for the provided workload, but it may
> > > > cause replication lag. We may have to devise a benchmark workload,
> > > > measure the data, and determine an appropriate value. 100 is just a
> > > > starting point.
> > >
> > > I think it's worthwhile to test the performance impact of this
> > > approach. If the table is altered but not dropped, removing the
> > > entries could introduce additional overhead. Although this overhead
> > > might be negligible if within an acceptable threshold, it still
> > > warrants consideration. Additionally, if the number of invalidations
> > > is significant, the cleanup process may take a considerable amount of time,
> > potentially resulting in performance degradation.
> > > In such cases, it might be beneficial to consider destroying the hash
> > > table and creating a new one. Therefore, analyzing these scenarios is
> > essential.
> >
> > Intuitively, a heuristic threshold is straightforward, but it can also be
> > hard to get "right" since workloads vary.
> > Designing and writing tests that reflect real-world use cases well may not
> > be easy either. I'm wondering whether it would be beneficial to brainstorm
> > other potential alternatives before settling on the current approach and
> > starting benchmarking and performance tuning.
> >
> > Sequential cleanup could be costly in certain cases. I don't know much about
> > RelationSyncCache's life cycle, but it seems we should be cautious about
> > shortening it so as not to cause subtle issues.
>
> I don't have a great idea that can avoid introducing overhead in all cases.
>
> An alternative approach I considered before is deleting the hash entry upon
> invalidation if the entry is not in use. This would require adding an in_use
> flag in RelationSyncEntry, setting it to true when get_rel_sync_entry is called,
> and marking it as false once the entry is no longer in use. This approach
> is inspired by relcache invalidation. I am not entirely sure it's worth
> the effort, but I'm sharing the idea with a POC patch for reference,
> modified from Kuroda-San's version.
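
Just to make sure I'm reading that idea correctly, here is a rough sketch
of the usage-tracking side as I understand it (the struct layout and the
helper below are partly placeholders of mine, not necessarily what the POC
patch actually does):

typedef struct RelationSyncEntry
{
    Oid         relid;              /* hash key: relation OID */
    bool        replicate_valid;    /* needs rebuilding when false */
    bool        in_use;             /* is a change for this relation
                                     * currently being processed? */
    /* ... other cached state ... */
} RelationSyncEntry;

/*
 * get_rel_sync_entry() would set entry->in_use = true before returning
 * the entry, and the output plugin callbacks would call this once they
 * are done with it.
 */
static void
close_rel_sync_entry(RelationSyncEntry *entry)
{
    entry->in_use = false;
}
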
IIUC the patch adds logic to close entries, setting in_use to false
after processing each change. For example, in pgoutput_change(), we
close the entry right after sending its change:
@@ -1621,6 +1635,8 @@ cleanup:
             ExecDropSingleTupleTableSlot(new_slot);
     }
+    close_rel_sync_entry(relentry);
+
     MemoryContextSwitchTo(old);
     MemoryContextReset(data->context);
Given that cache invalidation is executed upon replaying
REORDER_BUFFER_CHANGE_INVALIDATION and at the end of a transaction
replay, in which case do we keep the cache entry (i.e. just set
replicate_valid=false) because of in_use=true?
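
In other words, I'm assuming the invalidation callback ends up doing
something like the following (again only a sketch with the details
filled in by me; the actual POC may differ):

static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
    RelationSyncEntry *entry;

    if (RelationSyncCache == NULL)
        return;

    entry = (RelationSyncEntry *) hash_search(RelationSyncCache, &relid,
                                              HASH_FIND, NULL);
    if (entry == NULL)
        return;

    if (entry->in_use)
    {
        /*
         * Still referenced by the change currently being processed: keep
         * the entry but force a rebuild on the next lookup.
         */
        entry->replicate_valid = false;
    }
    else
    {
        /* Not referenced: the entry (and its substructure) can be dropped. */
        hash_search(RelationSyncCache, &relid, HASH_REMOVE, NULL);
    }
}

My question above is essentially whether the in_use branch here can ever
be taken in practice, given where close_rel_sync_entry() is called.
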
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com