| From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
|---|---|
| To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
| Cc: | ashu(dot)coek88(at)gmail(dot)com, michael(at)paquier(dot)xyz, amit(dot)kapila16(at)gmail(dot)com, bertranddrouvot(dot)pg(at)gmail(dot)com, andres(at)anarazel(dot)de, shveta(dot)malik(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Report bytes and transactions actually sent downtream |
| Date: | 2026-06-15 09:16:24 |
| Message-ID: | CAExHW5vw53yaTncGr-mv7hNU8qieL71B=tWTcymdV38FK+T69g@mail.gmail.com |
| Lists: | pgsql-hackers |
On Mon, Jun 15, 2026 at 2:34 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> Hello.
>
> At Mon, 15 Jun 2026 13:52:36 +0530, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote in
> > Sorry for chiming in - I may well be misunderstanding this, but here's
> > how I'm currently thinking about it:
> >
> > Total transaction bytes refers to the size of decoded transactional
> > data accumulated in the reorder buffer for a given transaction.
> >
> > Sent bytes (as I understand from the patch) refers to the size of the
> > downstream output that the output plugin produces from that decoded
> > data, after any filtering and format conversion.
> >
> > To illustrate: if a transaction's decoded changes occupy 550 bytes in
> > the reorder buffer, but the output plugin filters some out and emits
> > only 300 bytes downstream, then total transaction bytes = 550 and sent
> > bytes = 300. Conversely, if all 550 bytes are converted into a more
> > verbose format and emitted as 700 bytes, total transaction bytes
> > remains 550 while sent bytes becomes 700.
> >
> > If I'm reading this right, since total bytes for a transaction is the
> > baseline from which transaction-derived downstream output is produced,
> > I wonder whether sent bytes should include only that
> > transaction-derived downstream output, or also downstream protocol
> > traffic such as keepalive messages, which are sent downstream but are
> > not derived from transaction bytes in the reorder buffer.
> >
> > My instinct is that if sent bytes are meant to measure
> > transaction-output throughput, keepalive messages probably shouldn't
> > be included, since they have no basis in transaction data and might
> > distort any comparison with total bytes. But I could be wrong - happy
> > to be corrected!
>
> Thank you for the explanation.
>
> I think I understand the distinction you are making. However, my
> question is one step earlier than the keepalive-message question. I am
> wondering whether the new metric needs to be defined in terms of
> logical-change output in the first place.
>
> If I understand the use case correctly, I think the discussion here is
> primarily about relatively high-volume logical replication
> workloads. My point is that, in that situation, I would expect the
> amount of logical-change output and the amount of data actually sent
> over the replication connection to show broadly similar trends.
>
> The latter seems easier to interpret, while still providing a useful
> signal for monitoring and capacity-planning purposes. It also seems
> more intuitive, since it corresponds directly to the amount of data
> sent over the replication connection.
The total number of bytes sent over the network is available from
network monitoring systems. That figure includes both protocol messages
and logical change data. Of the two, only the logical change data is
processed through the pipeline that actually loads the downstream. The
protocol messages are usually processed and responded to quickly; they
are not passed further down the pipeline that consumes the logical
changes. Hence the amount of logical change data is what matters, not
so much the amount of protocol messages. Protocol messages usually
account for a small share when a high volume of data is processed, but
that need not be true in general. total_bytes is not an indicator of
high or low volume. The system may be generating huge amounts of WAL
while the publication filters most of it out and sends only a small
amount downstream. The amount of logical changes alone is an accurate
measure of the load being placed on the downstream pipeline.
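
To make the distinction concrete, here is a rough sketch of the
accounting being discussed. It is only an illustration, not code from
the patch: the class and function names are hypothetical,
protocol_bytes is an invented counter, and the 550/300 figures are the
ones from the example quoted above.

```python
from dataclasses import dataclass


@dataclass
class SlotCounters:
    """Hypothetical per-slot counters; not the actual patch structure."""
    total_bytes: int = 0      # decoded bytes accumulated in the reorder buffer
    sent_bytes: int = 0       # logical-change output emitted by the output plugin
    protocol_bytes: int = 0   # keepalives and similar traffic (assumed counter)


def account_change(c: SlotCounters, decoded_size: int, emitted_size: int) -> None:
    # decoded_size: what the change occupied in the reorder buffer.
    # emitted_size: what the plugin sent after filtering/format conversion
    # (0 if the publication filtered the change out entirely).
    c.total_bytes += decoded_size
    c.sent_bytes += emitted_size


def account_keepalive(c: SlotCounters, size: int) -> None:
    # Protocol traffic is tracked separately, so it does not distort the
    # comparison between total_bytes and sent_bytes.
    c.protocol_bytes += size


counters = SlotCounters()
account_change(counters, decoded_size=550, emitted_size=300)
account_keepalive(counters, size=20)  # arbitrary illustrative size
print(counters)
```

With this split, sent_bytes reflects only the load handed to the
downstream pipeline, while the sum of sent_bytes and protocol_bytes
would approximate what a network monitor already reports.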
--
Best Wishes,
Ashutosh Bapat