Re: Report bytes and transactions actually sent downtream

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, michael(at)paquier(dot)xyz, bertranddrouvot(dot)pg(at)gmail(dot)com, andres(at)anarazel(dot)de, shveta(dot)malik(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Report bytes and transactions actually sent downtream
Date: 2026-06-29 09:08:38
Message-ID: CAE9k0PkFYru=y9MLGWGgXQH1Ba7=aouBOMxdaMaDBnOK=N-9_g@mail.gmail.com
Lists: pgsql-hackers

Hi Amit,

On Sun, Jun 28, 2026 at 12:26 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Jun 25, 2026 at 4:37 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Tue, Jun 16, 2026 at 2:06 AM Ashutosh Bapat
> > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > >
> > >
> > > Those are logical message types which are part of the logical change
> > > data - without those messages it's not possible to process the logical
> > > change data. So they are included. But the keepalive messages, for
> > > example, aren't part of the logical change data.
> >
> > I think logical replication messages like STREAM START/STREAM STOP and
> > BEGIN/END, versus messages like keepalive and standby/primary status
> > updates, operate at different layers. The former are the contents of
> > the logical output and seem to belong naturally to the replication
> > slot's statistics. I'm not sure the latter should be included as well
> > -- I'm concerned that counting those bytes would become noise when
> > analyzing the statistics over time, since they have no relation to the
> > volume of logical changes.
> >
>
> Both you and Ashutosh seem to be making similar points, and they sound
> reasonable, so let's do it that way.
>
> > If we do want a statistic showing the literal total bytes sent
> > downstream including protocol messages, ISTM that should be available
> > for both logical and physical replication: physical replication also
> > uses keepalive messages and adds a header to each message. In other
> > words, that kind of "bytes on the wire" metric isn't really specific
> > to a logical replication slot, so the slot's statistics don't seem
> > like the right place for it.
> >
> > The proposed column name 'sent_bytes' is also confusing to me, because
> > I don't think we can call it "total bytes actually sent" in the
> > logical decoding SQL API case. A name like 'plugin_total_bytes' seems
> > more straightforward and conveys the intent that protocol messages are
> > not included.
> >
>
> The SQL API point is valid: if we display sent_bytes via the SQL API,
> then pg_logical_slot_get_changes() will show nonzero sent_bytes even
> though nothing was ever sent anywhere. OTOH, adding a plugin_* prefix
> also starts to make it sound like the stats are plugin-specific. How
> about calling it 'output_bytes'? It pairs cleanly with the existing
> column: total_bytes = decoded into the reorder buffer, output_bytes =
> decoded and handed to the consumer.
>
> If we use output_bytes, then we can describe the new stat along the
> lines of the following text in the docs:
> <para>
> Amount of decoded data produced for this slot's consumer by the output
> plugin, after applying any output plugin filters and converting the
> changes into the output plugin's format. This counts the transaction
> changes together with the messages that delimit them (such as the
> begin and commit messages). Connection-management messages such as
> keepalives are generated by the server rather than the output plugin
> and are therefore not included.
> </para>
> <para>
> This value can differ from <structfield>total_bytes</structfield>: it
> may be smaller because filtered changes are not output, or larger
> because the output plugin's format can be more verbose than the
> decoded changes. For these reasons
> <structfield>output_bytes</structfield> is not directly comparable to
> <structfield>total_bytes</structfield>.
> </para>
>

Please find the attached patch that renames sent_bytes to
output_bytes. Kindly have a look and feel free to share any comments
or suggestions.
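
For illustration only, here is a minimal sketch of the accounting rule agreed on above. It is not code from the patch: the message kind names, the SlotStats class, and both helper functions are hypothetical and exist only to show which bytes would count toward output_bytes and which would not.

```python
from dataclasses import dataclass

# Hypothetical message kinds, used only for this sketch; they are not
# PostgreSQL identifiers.
PLUGIN_KINDS = {"begin", "commit", "change", "stream_start", "stream_stop"}
SERVER_KINDS = {"keepalive", "status_update"}


@dataclass
class SlotStats:
    total_bytes: int = 0   # decoded into the reorder buffer
    output_bytes: int = 0  # produced by the output plugin for the consumer


def account_decoded(stats: SlotStats, nbytes: int) -> None:
    """Count bytes decoded into the reorder buffer, filtered or not."""
    stats.total_bytes += nbytes


def account_message(stats: SlotStats, kind: str, nbytes: int) -> None:
    """Count a message toward output_bytes only if the plugin produced it."""
    if kind in PLUGIN_KINDS:
        stats.output_bytes += nbytes
    elif kind not in SERVER_KINDS:
        raise ValueError(f"unknown message kind: {kind}")
    # Server-generated messages (keepalives etc.) are deliberately ignored.


stats = SlotStats()
account_decoded(stats, 100)  # a change that the plugin will output
account_decoded(stats, 80)   # a change that a plugin filter will drop

for kind, nbytes in [("begin", 10), ("change", 150),
                     ("commit", 12), ("keepalive", 9)]:
    account_message(stats, kind, nbytes)

print(stats)
```

In this made-up run the single output change is larger than its decoded form while the filtered change contributes nothing, so the two counters move independently. That is why the proposed doc text says output_bytes is not directly comparable to total_bytes.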

--
With Regards,
Ashutosh Sharma.

Attachment: v20260629-0001-Report-output-bytes-in-pg_stat_replication_slots.patch (application/octet-stream, 26.9 KB)
