Re: Report bytes and transactions actually sent downtream

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, ashu(dot)coek88(at)gmail(dot)com, michael(at)paquier(dot)xyz, bertranddrouvot(dot)pg(at)gmail(dot)com, andres(at)anarazel(dot)de, shveta(dot)malik(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Report bytes and transactions actually sent downtream
Date: 2026-06-24 23:07:06
Message-ID: CAD21AoC+JoYKBAgTXR4fDWeL71cRDOvx-JmCOVQhWhCHG6r34w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 16, 2026 at 2:06 AM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Mon, Jun 15, 2026 at 7:23 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Jun 15, 2026 at 2:34 PM Kyotaro Horiguchi
> > <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > >
> > > At Mon, 15 Jun 2026 13:52:36 +0530, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote in
> > > > Sorry for chiming in - I may well be misunderstanding this, but here's
> > > > how I'm currently thinking about it:
> > > >
> > > > Total transaction bytes refers to the size of decoded transactional
> > > > data accumulated in the reorder buffer for a given transaction.
> > > >
> > > > Sent bytes (as I understand from the patch) refers to the size of the
> > > > downstream output that the output plugin produces from that decoded
> > > > data, after any filtering and format conversion.
> > > >
> > > > To illustrate: if a transaction's decoded changes occupy 550 bytes in
> > > > the reorder buffer, but the output plugin filters some out and emits
> > > > only 300 bytes downstream, then total transaction bytes = 550 and sent
> > > > bytes = 300. Conversely, if all 550 bytes are converted into a more
> > > > verbose format and emitted as 700 bytes, total transaction bytes
> > > > remains 550 while sent bytes becomes 700.
> > > >
> > > > If I'm reading this right, since total bytes for a transaction is the
> > > > baseline from which transaction-derived downstream output is produced,
> > > > I wonder whether sent bytes should include only that
> > > > transaction-derived downstream output, or also downstream protocol
> > > > traffic such as keepalive messages, which are sent downstream but are
> > > > not derived from transaction bytes in the reorder buffer.
> > > >
> > > > My instinct is that if sent bytes are meant to measure
> > > > transaction-output throughput, keepalive messages probably shouldn't
> > > > be included, since they have no basis in transaction data and might
> > > > distort any comparison with total bytes. But I could be wrong - happy
> > > > to be corrected!
> > >
> > > Thank you for the explanation.
> > >
> > > I think I understand the distinction you are making. However, my
> > > question is one step earlier than the keepalive-message question. I am
> > > wondering whether the new metric needs to be defined in terms of
> > > logical-change output in the first place.
> > >
> > > If I understand the use case correctly, I think the discussion here is
> > > primarily about relatively high-volume logical replication
> > > workloads. My point is that, in that situation, I would expect the
> > > amount of logical-change output and the amount of data actually sent
> > > over the replication connection to show broadly similar trends.
> > >
> > > The latter seems easier to interpret, while still providing a useful
> > > signal for monitoring and capacity-planning purposes. It also seems
> > > more intuitive, since it corresponds directly to the amount of data
> > > sent over the replication connection.
> > >
> >
> > BTW, the patch internally counts other protocol messages like START
> > STREAM/STOP STREAM, BEGIN/END, and quite a few others that help apply
> > workers to understand transaction boundaries and messages. So, I feel
> > in that sense we are already counting protocol bytes as part of the
> > patch, so why leave out the additional messages being discussed here?
>
> Those are logical message types which are part of the logical change
> data - without those messages it's not possible to process the logical
> change data. So they are included. But the keepalive messages, for
> example, aren't part of the logical change data.

I think logical replication messages like STREAM START/STREAM STOP and
BEGIN/END operate at a different layer from messages like keepalive
and standby/primary status updates. The former are the contents of
the logical output and seem to belong naturally to the replication
slot's statistics. I'm not sure the latter should be included as well
-- I'm concerned that counting those bytes would become noise when
analyzing the statistics over time, since they have no relation to the
volume of logical changes.
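
To make the layering concrete, here is a minimal sketch in Python. It is
purely illustrative: the class, the function, the "CHANGE" message kind,
the 'wire_bytes' counter, and the byte sizes are all assumptions made up
for this example, and none of them come from the patch or from
PostgreSQL itself. It only shows how a counter limited to logical output
stays independent of keepalive traffic, while a "bytes on the wire"
counter does not.

```python
from dataclasses import dataclass

# Hypothetical message kinds, for illustration only.
LOGICAL_KINDS = {"BEGIN", "END", "STREAM START", "STREAM STOP", "CHANGE"}
PROTOCOL_KINDS = {"KEEPALIVE"}


@dataclass
class SlotCounters:
    plugin_total_bytes: int = 0  # logical output messages only
    wire_bytes: int = 0          # everything sent, protocol traffic included

    def record(self, kind: str, size: int) -> None:
        """Add one message of the given kind and size to the counters."""
        if kind not in LOGICAL_KINDS | PROTOCOL_KINDS:
            raise ValueError(f"unknown message kind: {kind}")
        self.wire_bytes += size
        if kind in LOGICAL_KINDS:
            self.plugin_total_bytes += size


def tally(messages):
    """Accumulate counters over an iterable of (kind, size) pairs."""
    counters = SlotCounters()
    for kind, size in messages:
        counters.record(kind, size)
    return counters


if __name__ == "__main__":
    # Made-up sizes: one small transaction plus one keepalive.
    c = tally([("BEGIN", 20), ("CHANGE", 250), ("KEEPALIVE", 18), ("END", 30)])
    print(c.plugin_total_bytes, c.wire_bytes)
```

With these made-up sizes the logical-output counter is 300 and the wire
counter is 318; adding more keepalives changes only the latter.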

If we do want a statistic showing the literal total bytes sent
downstream including protocol messages, ISTM that should be available
for both logical and physical replication: physical replication also
uses keepalive messages and adds a header to each message. In other
words, that kind of "bytes on the wire" metric isn't really specific
to a logical replication slot, so the slot's statistics don't seem
like the right place for it.

The proposed column name 'sent_bytes' is also confusing to me, because
I don't think we can call it "total bytes actually sent" in the
logical decoding SQL API case, where the output is returned to the
caller rather than sent over a replication connection. A name like
'plugin_total_bytes' seems more straightforward and conveys the intent
that protocol messages are not included.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com