| From: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> |
|---|---|
| To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
| Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, michael(at)paquier(dot)xyz, bertranddrouvot(dot)pg(at)gmail(dot)com, andres(at)anarazel(dot)de, shveta(dot)malik(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Report bytes and transactions actually sent downtream |
| Date: | 2026-06-29 09:08:38 |
| Message-ID: | CAE9k0PkFYru=y9MLGWGgXQH1Ba7=aouBOMxdaMaDBnOK=N-9_g@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Amit,
On Sun, Jun 28, 2026 at 12:26 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Jun 25, 2026 at 4:37 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Tue, Jun 16, 2026 at 2:06 AM Ashutosh Bapat
> > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > >
> > >
> > > Those are logical message types which are part of the logical change
> > > data - without those messages it's not possible to process the logical
> > > change data. So they are included. But the keepalive messages, for
> > > example, aren't part of the logical change data.
> >
> > I think logical replication messages like STREAM START/STREAM STOP and
> > BEGIN/END, versus messages like keepalive and standby/primary status
> > updates, operate at different layers. The former are the contents of
> > the logical output and seem to belong naturally to the replication
> > slot's statistics. I'm not sure the latter should be included as well
> > -- I'm concerned that counting those bytes would become noise when
> > analyzing the statistics over time, since they have no relation to the
> > volume of logical changes.
> >
>
> Both you and Ashutosh make similar points, and they sound
> reasonable, so let's do it that way.
>
> > If we do want a statistic showing the literal total bytes sent
> > downstream including protocol messages, ISTM that should be available
> > for both logical and physical replication: physical replication also
> > uses keepalive messages and adds a header to each message. In other
> > words, that kind of "bytes on the wire" metric isn't really specific
> > to a logical replication slot, so the slot's statistics don't seem
> > like the right place for it.
> >
> > The proposed column name 'sent_bytes' is also confusing to me, because
> > I don't think we can call it "total bytes actually sent" in the
> > logical decoding SQL API case. A name like 'plugin_total_bytes' seems
> > more straightforward and conveys the intent that protocol messages are
> > not included.
> >
>
> The SQL API point is valid: if we expose sent_bytes through the SQL API,
> pg_logical_slot_get_changes() will show nonzero sent_bytes even
> though nothing was ever sent anywhere. OTOH, a plugin_* prefix
> makes it sound like the stats are plugin-specific. How about
> calling it 'output_bytes'? It pairs cleanly with the existing
> column: total_bytes = decoded into the reorder buffer, output_bytes =
> decoded and handed to the consumer.
>
> If we use output_bytes, then we can describe the new stat along the
> lines of the following text in the docs:
> <para>
> Amount of decoded data produced for this slot's consumer by the output
> plugin, after applying any output plugin filters and converting the
> changes into the output plugin's format. This counts the transaction
> changes together with the messages that delimit them (such as the
> begin and commit messages), but not connection-management messages
> such as keepalives, which are generated by the server rather than the
> output plugin and are therefore not included.
> </para>
> <para>
> This value can differ from <structfield>total_bytes</structfield>: it
> may be smaller because filtered changes are not output, or larger
> because the output plugin's format can be more verbose than the
> decoded changes. For these reasons
> <structfield>output_bytes</structfield> is not directly comparable to
> <structfield>total_bytes</structfield>.
> </para>
>
Please find the attached patch that renames sent_bytes to
output_bytes. Kindly have a look and feel free to share any comments
or suggestions.
--
With Regards,
Ashutosh Sharma.
| Attachment | Content-Type | Size |
|---|---|---|
| v20260629-0001-Report-output-bytes-in-pg_stat_replication_slots.patch | application/octet-stream | 26.9 KB |