Quick Links

Re: Report reorder buffer size

From:	Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Report reorder buffer size
Date:	2025-08-28 04:35:48
Message-ID:	CAExHW5s+Co-BFwoTj2v2_FjpvYSmjLfvG=X3SCMAqaWbsQ0oLA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, Aug 28, 2025 at 5:56 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Aug 26, 2025 at 2:45 AM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > Hi Masahiko,
> > Thanks for your inputs.
> >
> > On Tue, Aug 26, 2025 at 2:17 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > > On Wed, Aug 13, 2025 at 5:40 AM Ashutosh Bapat
> > > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > >
> > > Thank you for starting the thread about this topic. This is one of the
> > > topics that I've been eager to address. Here are some random comments:
> > >
> >
> > Glad that more people think there is a need for improving the reorder
> > buffer statistics.
> >
> > > >
> > > > With the attached WIP patch, we can report, at a given point in time,
> > > > total size of reorder buffer i.e. sum of the size of all the changes
> > > > belonging to the transactions active in the reorder buffer whether
> > > > those changes are in memory, on disk or sent downstream as part of an
> > > > unfinished streamed transaction. By sampling this value at regular
> > > > intervals, one can observe the reorder buffer trendline as shown in
> > > > the plots attached.
> > >
> > > Alternative idea (or an additional metric) would be the high-watermark
> > > of memory usage. That way, users won't miss memory usage spikes that
> > > might be missed by sampling the total memory usage.
> >
> > I think the trendline is important to make right trade-offs in case
> > setting logical_decoding_work_mem to maximum is not possible. With a
> > high watermark the question is how long does a high remain high? As
> > the load changes, a high that was once may become lower or higher than
> > the new high and soon irrelevant. Probably we could reset the
> > high-watermark every time the view is sampled, so that we provide a
> > trendline for the high-water mark as well. But I want to be cautious
> > about adding that complexity of tracking maxima accurately and then
> > maintain it forever. If sampling is frequent enough usually it will
> > capture a maxima and minima good enough for practical purposes. The
> > users will need to consider the trendline as approximate anyway since
> > the load will show slight variations over the time.
>
> Agreed with the importance of trendline.
>
> > Please share your idea of reporting high-watermark.
>
> I was thinking of this high-watermark idea since it doesn't require
> any external system/tool to get the information for tuning
> logical_decoding_work_mem. It would be easy to use when users want to
> figure out maximum data usage for logical decoding of the particular
> workload.

Ok. I am interested in knowing how you think we should keep track of
the high watermark.

Do we keep it updated as new maxima are found? And keep reporting the
last high watermark seen till a new maxima is reached? With that,
users would end up setting logical_decoding_work_mem (and hence
provisioning for it) at a higher value, even if the workload changes
so that the reorder buffer never reaches that high watermark. So, I
think we should reset the high watermark. Do you think so? If yes,
When do we do that? How often do we reset it?

Tracking maxima and minima of other parameters in pg_stat_replication
like replication lag might also help users e.g. to tune their
networks. But we don't track their watermarks. External tools are used
for that. I was envisioning something like that for reorder buffer
size as well.

>
> >
> > > >
> > > > 2. Is the size of the reorder buffer enough or we want to also track
> > > > the size of changes on disk and the size of changes sent downstream as
> > > > part of unfinished streamed transactions separately? Also the number
> > > > of transactions being tracked by the reorder buffer?
> > >
> > > For the purpose of tuning logical_decoding_work_mem, the additional
> > > metrics and statistics we need might not be many. But in order to make
> > > logical decoding more visible for users for debugging or diagnosing
> > > purposes, more statistics like the number of transactions being
> > > tracked by the reorder buffer might be required.
> > >
> >
> > Ok. We could add the total number of transactions in the reorder
> > buffer at a given point in time to the report easily. How about
> > subtransactions? How about prepared but not committed/aborted
> > transactions?
>
> For debugging or diagnosing purposes, this information might be
> useful, but I'm not sure we need this information of logical decodings
> that are running.

I was not clear. Sorry. I meant reporting the number of
subtransactions and prepared transactions in the reorder buffer (not
the actual subtransactions or prepared transactions themselves). Do
you think that reporting just the number of transactions in the
reorder buffer is enough? Do you think that reporting the number of
subtransactions and the number of prepared transactions is not
required?

--
Best Wishes,
Ashutosh Bapat

In response to

Re: Report reorder buffer size at 2025-08-28 00:26:17 from Masahiko Sawada

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	shveta malik	2025-08-28 05:06:07	Re: Avoid retaining conflict-related data when no tables are subscribed
Previous Message	Dilip Kumar	2025-08-28 04:14:34	Re: Potential problem in commit f777d773878 and 4f7f7b03758