RE: Failed transaction statistics to measure the logical replication progress

From: Osumi, Takamichi/大墨 昂道 <osumi(dot)takamichi(at)fujitsu(dot)com>
To: 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Failed transaction statistics to measure the logical replication progress
Date: 2021-09-30 02:23:54
Message-ID: OSBPR01MB4888274FAEED04F28A8A038AEDAA9@OSBPR01MB4888.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Wednesday, September 29, 2021 7:51 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Wed, Sep 29, 2021 at 11:35 AM osumi(dot)takamichi(at)fujitsu(dot)com
> <osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
> > Thank you, Amit-san and Sawada-san for the discussion.
> > On Tuesday, September 28, 2021 7:05 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > Another idea could be to have a separate view, say
> > > > pg_stat_subscription_xact but I'm not sure it's a better idea.
> > > >
> > >
> > > Yeah, that is another idea but I am afraid that having three
> > > different views for subscription stats will be too much. I think it
> > > would be better if we can display these additional stats via the
> > > existing view pg_stat_subscription or the new view
> > > pg_stat_subscription_errors (or whatever name we want to give it).
> > pg_stat_subscription_errors specializes in showing an error record.
> > So, it would be awkward to combine it with other normal xact stats.
> >
> >
> > > > > > Then, if, we proceed in this direction, the place to implement
> > > > > > those stats would be on the LogicalRepWorker struct, instead ?
> > > > > >
> > > > >
> > > > > Or, we can make existing stats persistent and then add these
> > > > > > stats on top of it. Sawada-San, do you have any thoughts on this
> > > > > > matter?
> > > >
> > > > I think that making existing stats including received_lsn and
> > > > last_msg_receipt_time persistent by using stats collector could
> > > > cause massive reporting messages. We can report these messages
> > > > with a certain interval to reduce the amount of messages but we
> > > > will end up seeing old stats on the view.
> > > >
> > >
> > > Can't we keep the current and new stats both in-memory and persist on
> > > disk? So, the persistent stats data will be used to fill the in-memory
> > > counters after restarting of workers, otherwise, we will always refer to
> > > in-memory values.
> > I felt this isn't impossible.
> > The xact stats need to be updated at the end of message apply for
> > COMMIT, COMMIT PREPARED, STREAM_ABORT, etc., or when an error happens
> > during apply. Then, if we want, we can update the xact stats values at
> > such moments accordingly.
> > I'm thinking that we will have a hash table whose key is a pair of
> > subid + relid and whose entry is the proposed stats structure, and
> > update the entry at the above timings.
> >
>
> Are you thinking of a hash table separate from the one we are going to create for
> Sawada-San's patch related to error stats? Isn't it possible to have stats in the
> same hash table and same file?
IIUC, this would be possible.

At first, I thought we wouldn't use the stats collector at all for the xact stats
or for existing stats like received_lsn and last_msg_receipt_time,
because keeping them always up to date would require too many reporting
messages. But if we send messages to the stats collector only on strictly
limited occasions, such as successful exit and at some regular interval
as you explained in your other email, that concern disappears. So I think
the new xact stats and the moved stats can coexist in the hash you
proposed in [1], and we can write those new stats to the same file.
Does everyone agree?
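
To make the idea concrete, here is a rough model of one shared table keyed by
(subid, relid) that holds the error record, the xact counters and a moved
stat, and that reports only at an interval or on exit. It is a sketch in
Python, not PostgreSQL code: every class, field and method name below is
hypothetical, and the "report" is just a counter standing in for a message
to the stats collector.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class SubStatsEntry:
    """Hypothetical per-(subid, relid) entry; field names are assumptions."""
    commit_count: int = 0              # new xact stats
    abort_count: int = 0
    error_count: int = 0
    last_error: Optional[str] = None   # error record from the error-stats patch
    received_lsn: int = 0              # stat moved from pg_stat_subscription


class SubStatsTable:
    """One table for all subscription stats, with throttled reporting."""

    def __init__(self, report_interval: float) -> None:
        self.entries: Dict[Tuple[int, int], SubStatsEntry] = {}
        self.report_interval = report_interval
        self.last_report = 0.0
        self.reports_sent = 0  # stands in for messages to the stats collector

    def _entry(self, subid: int, relid: int) -> SubStatsEntry:
        # Create the entry on first use, keyed by the (subid, relid) pair.
        return self.entries.setdefault((subid, relid), SubStatsEntry())

    def on_commit(self, subid: int, relid: int, lsn: int, now: float) -> None:
        e = self._entry(subid, relid)
        e.commit_count += 1
        e.received_lsn = lsn
        self._maybe_report(now)

    def on_abort(self, subid: int, relid: int, now: float) -> None:
        self._entry(subid, relid).abort_count += 1
        self._maybe_report(now)

    def on_error(self, subid: int, relid: int, message: str, now: float) -> None:
        e = self._entry(subid, relid)
        e.error_count += 1
        e.last_error = message
        self._maybe_report(now)

    def on_exit(self, now: float) -> None:
        # A successful exit always reports, regardless of the interval.
        self._maybe_report(now, force=True)

    def _maybe_report(self, now: float, force: bool = False) -> None:
        # Report only when forced or when the interval has elapsed.
        if force or now - self.last_report >= self.report_interval:
            self.reports_sent += 1
            self.last_report = now
```

The in-memory entries are updated on every event, while the number of
reports stays bounded by the interval plus one report at exit.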

> > Here, one thing a bit unclear to me is whether we should move existing
> > stats of pg_stat_subscription (such as last_lsn and reply_lsn) to the
> > hash entry or not.
> >
>
> I think we should move them to the hash entry. That is an improvement over what
> we have now, because currently those stats get lost after a restart.
Okay!
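
The restart behaviour this implies could be modelled roughly as below: the
worker seeds its in-memory counters from the file at startup, always reads
and updates the in-memory values while running, and writes them back on
exit. This is a sketch in Python under assumptions, not the proposed
implementation: the JSON file format, the counter names and the Worker
class are all hypothetical.

```python
import json
import os
from typing import Dict


def save_stats(path: str, counters: Dict[str, int]) -> None:
    """Write counters to a temp file, then rename it over the target."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(counters, f)
    os.replace(tmp, path)


def load_stats(path: str) -> Dict[str, int]:
    """Return the saved counters, or an empty dict if no file exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


class Worker:
    """Hypothetical apply worker holding its stats in memory."""

    def __init__(self, path: str) -> None:
        self.path = path
        # After a restart, the persisted data fills the in-memory counters.
        self.counters = load_stats(path)

    def apply_commit(self, lsn: int) -> None:
        # While running, only the in-memory values are touched.
        self.counters["commit_count"] = self.counters.get("commit_count", 0) + 1
        self.counters["last_lsn"] = lsn

    def shutdown(self) -> None:
        save_stats(self.path, self.counters)
```

With this shape, a worker started after a shutdown continues counting from
the saved values instead of from zero.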

[1] - https://www.postgresql.org/message-id/CAA4eK1JRCQ-bYnbkwUrvcVcbLURjtiW%2BirFVvXzeG%2Bj%3Dy6jVgA%40mail.gmail.com

Best Regards,
Takamichi Osumi
