| From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
|---|---|
| To: | ashu(dot)coek88(at)gmail(dot)com |
| Cc: | ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, michael(at)paquier(dot)xyz, amit(dot)kapila16(at)gmail(dot)com, bertranddrouvot(dot)pg(at)gmail(dot)com, andres(at)anarazel(dot)de, shveta(dot)malik(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Report bytes and transactions actually sent downtream |
| Date: | 2026-06-15 09:04:01 |
| Message-ID: | 20260615.180401.1834123951330189307.horikyota.ntt@gmail.com |
| Lists: | pgsql-hackers |
Hello.

At Mon, 15 Jun 2026 13:52:36 +0530, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote in
> Sorry for chiming in - I may well be misunderstanding this, but here's
> how I'm currently thinking about it:
>
> Total transaction bytes refers to the size of decoded transactional
> data accumulated in the reorder buffer for a given transaction.
>
> Sent bytes (as I understand from the patch) refers to the size of the
> downstream output that the output plugin produces from that decoded
> data, after any filtering and format conversion.
>
> To illustrate: if a transaction's decoded changes occupy 550 bytes in
> the reorder buffer, but the output plugin filters some out and emits
> only 300 bytes downstream, then total transaction bytes = 550 and sent
> bytes = 300. Conversely, if all 550 bytes are converted into a more
> verbose format and emitted as 700 bytes, total transaction bytes
> remains 550 while sent bytes becomes 700.
>
> If I'm reading this right, since total bytes for a transaction is the
> baseline from which transaction-derived downstream output is produced,
> I wonder whether sent bytes should include only that
> transaction-derived downstream output, or also downstream protocol
> traffic such as keepalive messages, which are sent downstream but are
> not derived from transaction bytes in the reorder buffer.
>
> My instinct is that if sent bytes are meant to measure
> transaction-output throughput, keepalive messages probably shouldn't
> be included, since they have no basis in transaction data and might
> distort any comparison with total bytes. But I could be wrong - happy
> to be corrected!

Thank you for the explanation.

I think I understand the distinction you are making. However, my
question is one step earlier than the keepalive-message question. I am
wondering whether the new metric needs to be defined in terms of
logical-change output in the first place.

If I understand the use case correctly, I think the discussion here is
primarily about relatively high-volume logical replication
workloads. My point is that, in that situation, I would expect the
amount of logical-change output and the amount of data actually sent
over the replication connection to show broadly similar trends.

The latter seems easier to interpret, while still providing a useful
signal for monitoring and capacity-planning purposes. It also seems
more intuitive, since it corresponds directly to the amount of data
sent over the replication connection.

Regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center