Re: Report bytes and transactions actually sent downtream

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: ashu(dot)coek88(at)gmail(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, michael(at)paquier(dot)xyz, bertranddrouvot(dot)pg(at)gmail(dot)com, andres(at)anarazel(dot)de, shveta(dot)malik(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Report bytes and transactions actually sent downtream
Date: 2026-06-15 13:53:00
Message-ID: CAA4eK1L48XxBgE12Byp6tAq631PikciBFYSkGD8G4O++=C1=OA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 15, 2026 at 2:34 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Mon, 15 Jun 2026 13:52:36 +0530, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote in
> > Sorry for chiming in - I may well be misunderstanding this, but here's
> > how I'm currently thinking about it:
> >
> > Total transaction bytes refers to the size of decoded transactional
> > data accumulated in the reorder buffer for a given transaction.
> >
> > Sent bytes (as I understand from the patch) refers to the size of the
> > downstream output that the output plugin produces from that decoded
> > data, after any filtering and format conversion.
> >
> > To illustrate: if a transaction's decoded changes occupy 550 bytes in
> > the reorder buffer, but the output plugin filters some out and emits
> > only 300 bytes downstream, then total transaction bytes = 550 and sent
> > bytes = 300. Conversely, if all 550 bytes are converted into a more
> > verbose format and emitted as 700 bytes, total transaction bytes
> > remains 550 while sent bytes becomes 700.
> >
> > If I'm reading this right, since total bytes for a transaction is the
> > baseline from which transaction-derived downstream output is produced,
> > I wonder whether sent bytes should include only that
> > transaction-derived downstream output, or also downstream protocol
> > traffic such as keepalive messages, which are sent downstream but are
> > not derived from transaction bytes in the reorder buffer.
> >
> > My instinct is that if sent bytes are meant to measure
> > transaction-output throughput, keepalive messages probably shouldn't
> > be included, since they have no basis in transaction data and might
> > distort any comparison with total bytes. But I could be wrong - happy
> > to be corrected!
>
> Thank you for the explanation.
>
> I think I understand the distinction you are making. However, my
> question is one step earlier than the keepalive-message question. I am
> wondering whether the new metric needs to be defined in terms of
> logical-change output in the first place.
>
> If I understand the use case correctly, I think the discussion here is
> primarily about relatively high-volume logical replication
> workloads. My point is that, in that situation, I would expect the
> amount of logical-change output and the amount of data actually sent
> over the replication connection to show broadly similar trends.
>
> The latter seems easier to interpret, while still providing a useful
> signal for monitoring and capacity-planning purposes. It also seems
> more intuitive, since it corresponds directly to the amount of data
> sent over the replication connection.
>

BTW, the patch internally counts other protocol messages such as START
STREAM/STOP STREAM, BEGIN/END, and quite a few others that help apply
workers understand transaction boundaries and messages. So, in that
sense, we are already counting protocol bytes as part of the patch. Why,
then, leave out the additional messages being discussed here?
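
To make the candidate definitions concrete, here is a rough sketch of
how the accounting could differ. It is only an illustration, not code
from the patch: the class, the category names, and the framing and
keepalive sizes are all hypothetical. Only the 550 and 300 byte figures
come from the example quoted above.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical message categories (assumptions, not names from the patch).
CHANGE = "change"        # output derived from decoded changes
FRAMING = "framing"      # BEGIN/END, START STREAM/STOP STREAM, etc.
KEEPALIVE = "keepalive"  # not derived from any transaction


@dataclass
class SentStats:
    """Toy per-slot counters: decoded bytes and bytes sent, by category."""
    total_txn_bytes: int = 0
    sent_by_kind: Counter = field(default_factory=Counter)

    def add_decoded(self, nbytes: int) -> None:
        # Size of decoded changes accumulated in the reorder buffer.
        self.total_txn_bytes += nbytes

    def add_sent(self, kind: str, nbytes: int) -> None:
        # Size of one message written to the downstream connection.
        self.sent_by_kind[kind] += nbytes

    def sent_bytes(self, *kinds: str) -> int:
        # Sum the sent bytes over whichever categories a definition includes.
        return sum(self.sent_by_kind[k] for k in kinds)


stats = SentStats()
stats.add_decoded(550)         # decoded size from the quoted example
stats.add_sent(FRAMING, 20)    # BEGIN (illustrative size)
stats.add_sent(CHANGE, 300)    # plugin output after filtering
stats.add_sent(FRAMING, 25)    # END (illustrative size)
stats.add_sent(KEEPALIVE, 18)  # keepalive (illustrative size)

changes_only = stats.sent_bytes(CHANGE)
with_framing = stats.sent_bytes(CHANGE, FRAMING)
everything = stats.sent_bytes(CHANGE, FRAMING, KEEPALIVE)
print(stats.total_txn_bytes, changes_only, with_framing, everything)
```

Under these assumed sizes the three definitions give different totals
for the same traffic. Once framing messages are counted, the remaining
gap to "everything sent on the connection" is only the keepalive
category.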

--
With Regards,
Amit Kapila.
