Re: Report bytes and transactions actually sent downtream

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Report bytes and transactions actually sent downtream
Date: 2025-12-18 12:52:52
Message-ID: CAExHW5s9=H5SNsTiyCkFAdDJmtqTZzvOcnr0uxWgRdCFnY+quA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 18, 2025 at 7:56 AM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
>
>
> > On Dec 17, 2025, at 13:55, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > Thanks for pointing this out. I have fixed it my code. However, at
> > this point I am looking for a design review, especially to verify that
> > the new implementation addresses Andres's concern raised in [1] while
> > not introducing any design issues raised earlier e.g. those raised in
> > threads [2], [3] and [4]
> >
> > [1] https://www.postgresql.org/message-id/zzidfgaowvlv4opptrcdlw57vmulnh7gnes4aerl6u35mirelm@tj2vzseptkjk
> >>> [2] https://www.postgresql.org/message-id/CAA4eK1KzYaq9dcaa20Pv44ewomUPj_PbbeLfEnvzuXYMZtNw0A%40mail.gmail.com
> >>> [3] https://www.postgresql.org/message-id/aNZ1T5vYC1BtKs4M@ip-10-97-1-34.eu-west-3.compute.internal
> >>> [4] https://www.postgresql.org/message-id/CAExHW5tfVHABuv1moL_shp7oPrWmg8ha7T8CqwZxiMrKror7iw%40mail.gmail.com
> >
> > --
> > Best Wishes,
> > Ashutosh Bapat
>
>
> Hi Ashutosh,
>
> Yeah, I owe you a review. I committed to review this patch but I forgot, sorry about that.
>
> From design perspective, I agree increasing counters should belong to the core, plugin should return properly values following the contract. And I got some more comments:
>
> 1. I just feel a bool return value might not be clear enough. For example:
>
> ```
> - ctx->callbacks.change_cb(ctx, txn, relation, change);
> + if (!ctx->callbacks.change_cb(ctx, txn, relation, change))
> + cache->filteredBytes += ReorderBufferChangeSize(change);
> ```
>
> You increase filteredBytes when change_cb returns false. But if we look at pgoutput_change(), there are many reasons to return false. Counting all the cases to filteredBytes seems wrong.

I am not able to understand this. Every "return false" from
pgoutput_change() indicates that the change was filtered out and hence
the size of corresponding change is being added to filteredBytes by
the caller. Which "return false" does not indicate a filtered out
change?

>
> 2.
> ```
> - ctx->callbacks.truncate_cb(ctx, txn, nrelations, relations, change);
> + if (!ctx->callbacks.truncate_cb(ctx, txn, nrelations, relations, change))
> + cache->filteredBytes += ReorderBufferChangeSize(change);
> ```
>
> Row filter doesn’t impact TRUNCATE, why increase filteredBytes after truncate_cb()?

A TRUNCATE of a relation which is not part of the publication will be
filtered out.

>
> 3.
> ```
> - ctx->callbacks.prepare_cb(ctx, txn, prepare_lsn);
> + if (ctx->callbacks.prepare_cb(ctx, txn, prepare_lsn))
> + cache->sentTxns++;
> ```
>
> For 2-phase commit, it increase sentTxns after prepare_cb, and
> ```
> + if (ctx->callbacks.stream_abort_cb(ctx, txn, abort_lsn))
> + cache->sentTxns++;
> ```
>
> If the transaction is aborted, sentTxns is increased again, which is confusing. Though for aborting there is some data (a notification) is streamed, but I don’t think that should be counted as a transaction.
>
> After commit, sentTxns is also increased, so that, a 2-phase commit is counted as two transactions, which feels also confusing. IMO, a 2-phase commit should still be counted as one transaction.

stream_commit/abort_cb is called after stream_prepare_cb not after prepare_cb.

>
> 4. You add sentBytes and filteredBytes. I am thinking if it makes sense to also add sentRows and filteredRows. Because tables could be big or small, bytes + rows could show a more clear picture to users.

We don't have corresponding total_rows and streamed_rows counts. I
think that's because we haven't come across a use case for them. Do
you have a use case in mind?

--
Best Wishes,
Ashutosh Bapat

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2025-12-18 12:57:09 Re: Do not emit FPW for unlogged relations in BRIN empty-page
Previous Message Ashutosh Bapat 2025-12-18 12:52:40 Re: Report bytes and transactions actually sent downtream