Re: memory leak in logical WAL sender with pgoutput's cachectx

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Xuneng Zhou <xunengzhou(at)gmail(dot)com>, 赵宇鹏(宇彭) <zhaoyupeng(dot)zyp(at)alibaba-inc(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: memory leak in logical WAL sender with pgoutput's cachectx
Date: 2025-08-21 10:11:17
Message-ID: CAA4eK1+o8E4hO0D4w4Lo=H8b_vdvkYq_D-XbfiajYxGw6V=rjg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 21, 2025 at 10:53 AM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> > I have concerns about the performance implications of iterating
> > through all entries in the caches within
> > maybe_cleanup_rel_sync_cache(). If the cache contains numerous
> > entries, this iteration could potentially cause the walsender to
> > stall. If we use a larger NINVALIDATION_THRESHOLD, we can
> > reduce the number of sequential scans of the hash table, but each
> > scan would in turn need to free more entries (perhaps we could cap
> > the number of entries freed in one cycle?).
>
> Exactly.
>

So, at least we can try some tests before completely giving up on this idea.
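
For such tests, the trade-off quoted above (a threshold that triggers a
sequential scan, plus a cap on how many entries one cycle may free) can be
modeled in a few lines. This is only a standalone sketch, not the patch under
discussion: the fixed-size array stands in for the real hash table, and
ScanCache, scan_cache_*, MAX_FREE_PER_CYCLE and the threshold value are all
hypothetical. Only the name NINVALIDATION_THRESHOLD comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SLOTS             1024 /* toy fixed-size table, not dynahash */
#define NINVALIDATION_THRESHOLD 100  /* name from the thread, value assumed */
#define MAX_FREE_PER_CYCLE      32   /* hypothetical cap per cleanup cycle */

typedef struct ScanEntry {
    bool     in_use;
    bool     valid;
    unsigned relid;
} ScanEntry;

typedef struct ScanCache {
    ScanEntry slots[CACHE_SLOTS];
    size_t    ninvalid;  /* in-use entries that are currently invalidated */
    size_t    scan_pos;  /* slot where the next cleanup scan resumes */
} ScanCache;

/* Store relid in the first free slot; returns false if the table is full. */
bool scan_cache_add(ScanCache *c, unsigned relid)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (!c->slots[i].in_use) {
            c->slots[i] = (ScanEntry){ .in_use = true, .valid = true,
                                       .relid = relid };
            return true;
        }
    }
    return false;
}

/* Mark the entry for relid as invalidated (no-op if absent or already so). */
void scan_cache_invalidate(ScanCache *c, unsigned relid)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        ScanEntry *e = &c->slots[i];

        if (e->in_use && e->valid && e->relid == relid) {
            e->valid = false;
            c->ninvalid++;
            return;
        }
    }
}

/*
 * Free invalidated entries only once the threshold is reached, and free at
 * most MAX_FREE_PER_CYCLE of them per call. The scan resumes where the
 * previous one stopped, so one call never has to walk the whole table
 * unless invalidated entries are sparse.
 */
size_t scan_cache_maybe_cleanup(ScanCache *c)
{
    size_t freed = 0;

    if (c->ninvalid < NINVALIDATION_THRESHOLD)
        return 0;

    for (size_t scanned = 0;
         scanned < CACHE_SLOTS && freed < MAX_FREE_PER_CYCLE && c->ninvalid > 0;
         scanned++) {
        ScanEntry *e = &c->slots[c->scan_pos];

        c->scan_pos = (c->scan_pos + 1) % CACHE_SLOTS;
        if (e->in_use && !e->valid) {
            e->in_use = false;
            assert(c->ninvalid > 0);
            c->ninvalid--;
            freed++;
        }
    }
    return freed;
}
```

One consequence visible in this model: after a capped cycle the counter drops
below the threshold again, so the remaining invalidated entries stay in the
table until enough new invalidations arrive. Whether that is acceptable is
exactly the kind of thing the tests would need to show.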

> > An alternative approach would be to implement a dedicated list (such
> > as dclist) specifically for tracking invalidated entries. Entries
> > would be removed from this list when they are reused. We could then
> > implement a threshold-based cleanup mechanism where invalidated
> > entries are freed once the list exceeds a predetermined size. While
> > this approach would minimize the overhead of freeing invalidated
> > entries, it would incur some additional cost for maintaining the list.
>
> At first I also considered this, but did not choose it because of the code
> complexity. After considering it more, it is not so difficult; PSA the new file.
>
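
For readers following along without the attachment, the list-based idea
quoted above can be sketched roughly as follows. This is not the attached
patch and does not use the ilist.h dclist API; it is a standalone model with
a hand-rolled counted list, and every name in it (InvalList, ListEntry,
entry_invalidate, entry_reuse, maybe_free_invalidated) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct InvalNode {
    struct InvalNode *prev;
    struct InvalNode *next;
} InvalNode;

typedef struct ListEntry {
    InvalNode node;     /* must stay first: cleanup casts node to entry */
    unsigned  relid;
    bool      valid;
    bool      in_cache; /* false once the entry has been "freed" */
} ListEntry;

typedef struct InvalList {
    InvalNode head;     /* sentinel of a circular doubly linked list */
    size_t    count;    /* number of invalidated entries on the list */
} InvalList;

void inval_list_init(InvalList *l)
{
    l->head.prev = l->head.next = &l->head;
    l->count = 0;
}

/* Detach one node from the list and keep the counter in sync. */
static void inval_unlink(InvalList *l, InvalNode *n)
{
    assert(l->count > 0);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;
    l->count--;
}

/* Invalidation: mark the entry and append it to the tail of the list. */
void entry_invalidate(InvalList *l, ListEntry *e)
{
    if (!e->in_cache || !e->valid)
        return;
    e->valid = false;
    e->node.prev = l->head.prev;
    e->node.next = &l->head;
    l->head.prev->next = &e->node;
    l->head.prev = &e->node;
    l->count++;
}

/* Reuse: an invalidated entry that is rebuilt leaves the list again. */
void entry_reuse(InvalList *l, ListEntry *e)
{
    if (!e->in_cache || e->valid)
        return;
    inval_unlink(l, &e->node);
    e->valid = true;
}

/*
 * Threshold-based cleanup: once the list holds more than 'threshold'
 * entries, drop everything on it. Only invalidated entries are visited,
 * so no scan over the whole cache is needed.
 */
size_t maybe_free_invalidated(InvalList *l, size_t threshold)
{
    size_t freed = 0;

    if (l->count <= threshold)
        return 0;

    while (l->count > 0) {
        InvalNode *n = l->head.next;
        ListEntry *e = (ListEntry *) n; /* valid because node is first */

        inval_unlink(l, n);
        e->in_cache = false; /* stands in for removing and freeing the entry */
        freed++;
    }
    return freed;
}
```

The extra cost mentioned in the quoted text shows up as the two pointers per
entry and the list updates on every invalidate and reuse.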

Another idea I was thinking of: if we can somehow decode the DROP TABLE
WAL record (say, the delete of the relid's row from pg_class), then we
can use that to remove the corresponding entry from RelationSyncCache.
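
Very roughly, and only as a toy model, the shape of that idea would be
something like the following. How (or whether) the decoder can surface such a
catalog delete to pgoutput is the open question, so DropChange, DropCache and
both functions here are hypothetical stand-ins rather than existing decoding
APIs. The only real value is 1259, the OID of pg_class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PG_CLASS_OID    1259u /* OID of pg_class (RelationRelationId) */
#define DROP_CACHE_SIZE 8     /* toy capacity */

typedef enum { DROP_CH_INSERT, DROP_CH_UPDATE, DROP_CH_DELETE } DropChangeKind;

/* Hypothetical, heavily simplified view of one decoded change. */
typedef struct DropChange {
    DropChangeKind kind;
    unsigned       table_oid; /* relation the change applies to */
    unsigned       row_oid;   /* for pg_class rows: oid of the old tuple */
} DropChange;

/* Toy stand-in for RelationSyncCache: a small set of cached relids. */
typedef struct DropCache {
    unsigned relids[DROP_CACHE_SIZE];
    size_t   n;
} DropCache;

/* Remove relid from the cache; returns true if an entry was removed. */
bool drop_cache_remove(DropCache *c, unsigned relid)
{
    for (size_t i = 0; i < c->n; i++) {
        if (c->relids[i] == relid) {
            c->relids[i] = c->relids[c->n - 1]; /* fill the hole */
            c->n--;
            return true;
        }
    }
    return false;
}

/*
 * Only a DELETE on pg_class is treated as "this relation was dropped";
 * every other change leaves the cache untouched.
 */
bool drop_cache_handle_change(DropCache *c, const DropChange *ch)
{
    if (ch->kind != DROP_CH_DELETE || ch->table_oid != PG_CLASS_OID)
        return false;
    return drop_cache_remove(c, ch->row_oid);
}
```

Compared with the threshold-based variants, this removes an entry exactly
when its relation goes away, at the price of having to look at catalog
changes during decoding.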

--
With Regards,
Amit Kapila.
