Re: Design of pg_stat_subscription_workers vs pgstats

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Design of pg_stat_subscription_workers vs pgstats
Date: 2022-02-24 15:49:37
Message-ID: CAD21AoB_yY6veu5h4mgdCzH64KY4Y9Ln2L2r7wmurV1+y+ig4w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thank you for the comments!

On Thu, Feb 24, 2022 at 4:20 PM tanghy(dot)fnst(at)fujitsu(dot)com
<tanghy(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Thu, Feb 24, 2022 9:33 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Thank you for the comments! I've attached the latest version patch
> > that incorporated all comments I got so far. The primary change from
> > the previous version is that the subscription statistics live globally
> > rather than per-database.
> >
>
> Thanks for updating the patch.
>
> Few comments:
>
> 1.
> I think we should add some doc for column stats_reset in pg_stat_subscription_activity view.

Added.

>
> 2.
> +CREATE VIEW pg_stat_subscription_activity AS
> SELECT
> - w.subid,
> + a.subid,
> s.subname,
> ...
> + a.apply_error_count,
> + a.sync_error_count,
> + a.stats_reset
> + FROM pg_subscription as s,
> + pg_stat_get_subscription_activity(oid) as a;
>
> The line "a.stats_reset" uses a Tab, and we'd better use spaces here.

Fixed.

On Thu, Feb 24, 2022 at 5:54 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi. Below are my review comments for the v2 patch.
>
> ======
>
> 1. Commit message
>
> This patch changes the pg_stat_subscription_workers view (introduced
> by commit 8d74fc9) so that it stores only statistics counters:
> apply_error_count and sync_error_count, and has one entry for
> subscription.
>
> SUGGESTION
> "and has one entry for subscription." --> "and has one entry for each
> subscription."

Fixed.

>
> ~~~
>
> 2. Commit message
>
> After removing these error details, there are no longer relation
> information, so the subscription statistics are now a cluster-wide
> statistics.
>
> SUGGESTION
> "there are no longer relation information," --> "there is no longer
> any relation information,"

Fixed.

>
> ~~~
>
> 3. doc/src/sgml/monitoring.sgml
>
> - <para>
> - The error message
> + Number of times the error occurred during the application of changes
> </para></entry>
> </row>
>
> <row>
> <entry role="catalog_table_entry"><para role="column_definition">
> - <structfield>last_error_time</structfield> <type>timestamp
> with time zone</type>
> + <structfield>sync_error_count</structfield> <type>bigint</type>
> </para>
> <para>
> - Last time at which this error occurred
> + Number of times the error occurred during the initial table
> + synchronization
> </para></entry>
>
> SUGGESTION (both places)
> "Number of times the error occurred ..." --> "Number of times an error
> occurred ..."

Fixed.

>
> ~~~
>
> 4. doc/src/sgml/monitoring.sgml - missing column
>
> (duplicate - also reported by [Tang-v2])
>
> The PG docs for the new "stats_reset" column are missing.
>
> ~~~
>
> 5. src/backend/catalog/system_views.sql - whitespace
>
> (duplicate - also reported by [Tang-v2])
>
> - JOIN pg_subscription s ON (w.subid = s.oid);
> + a.apply_error_count,
> + a.sync_error_count,
> + a.stats_reset
> + FROM pg_subscription as s,
> + pg_stat_get_subscription_activity(oid) as a;
>
> inconsistent tab/space indenting for 'a.stats_reset'.
>
> ~~~
>
> 6. src/backend/postmaster/pgstat.c - function name
>
> +/* ----------
> + * pgstat_reset_subscription_counter() -
> + *
> + * Tell the statistics collector to reset a single subscription
> + * counter, or all subscription counters (when subid is InvalidOid).
> + *
> + * Permission checking for this function is managed through the normal
> + * GRANT system.
> + * ----------
> + */
> +void
> +pgstat_reset_subscription_counter(Oid subid)
>
> SUGGESTION (function name)
> "pgstat_reset_subscription_counter" -->
> "pgstat_reset_subscription_counters" (plural)

Fixed.

>
> ~~
>
> 7. src/backend/postmaster/pgstat.c - pgstat_recv_resetsubcounter
>
> @@ -5645,6 +5598,51 @@
> pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
> }
> }
>
> +/* ----------
> + * pgstat_recv_resetsubcounter() -
> + *
> + * Reset some subscription statistics of the cluster.
> + * ----------
> + */
> +static void
> +pgstat_recv_resetsubcounter(PgStat_MsgResetsubcounter *msg, int len)
>
>
> "Reset some" seems a bit vague. Why not describe that it is all or
> none according to the msg->m_subid?

I think it reset none, one, or all statistics, actually. Given other
pgstat_recv_reset* functions also have similar comments, I think we
can use it rather than elaborating.

>
> ~~~
>
> 8. src/backend/postmaster/pgstat.c - pgstat_recv_resetsubcounter
>
> + if (!OidIsValid(msg->m_subid))
> + {
> + HASH_SEQ_STATUS sstat;
> +
> + /* Clear all subscription counters */
> + hash_seq_init(&sstat, subscriptionStatHash);
> + while ((subentry = (PgStat_StatSubEntry *) hash_seq_search(&sstat)) != NULL)
> + pgstat_reset_subscription(subentry, ts);
> + }
> + else
> + {
> + /* Get the subscription statistics to reset */
> + subentry = pgstat_get_subscription_entry(msg->m_subid, false);
> +
> + /*
> + * Nothing to do if the given subscription entry is not found. This
> + * could happen when the subscription with the subid is removed and
> + * the corresponding statistics entry is also removed before receiving
> + * the reset message.
> + */
> + if (!subentry)
> + return;
> +
> + /* Reset the stats for the requested replication slot */
> + pgstat_reset_subscription(subentry, ts);
> + }
> +}
>
> Why not reverse the if/else?
>
> Checking OidIsValid(...) seems more natural than checking !OidIsValid(...)

Yes, but it's because we use the same pattern in the near function
(see pgstat_recv_resetreplslotcounter()).

>
> ~~~
>
> 9. src/backend/postmaster/pgstat.c - pgstat_recv_subscription_purge
>
> static void
> pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
> {
> /* Return if we don't have replication subscription statistics */
> if (subscriptionStatHash == NULL)
> return;
>
> /* Remove from hashtable if present; we don't care if it's not */
> (void) hash_search(subscriptionStatHash, (void *) &(msg->m_subid),
> HASH_REMOVE, NULL);
> }
>
> SUGGESTION
> Wouldn't the above code be simpler written like:
>
> if (subscriptionStatHash)
> {
> /* Remove from hashtable if present; we don't care if it's not */
> (void) hash_search(subscriptionStatHash, (void *) &(msg->m_subid),
> HASH_REMOVE, NULL);
> }

Similarly, as Amit also mentioned, there is a similar pattern in the
near function. So keep it as it is

> ~~~
>
> 10. src/backend/replication/logical/worker.c
>
> (from my previous [Peter-v1] #9)
>
> >> + /* Report the worker failed during table synchronization */
> >> + pgstat_report_subscription_error(MyLogicalRepWorker->subid, false);
> >>
> >> and
> >>
> >> + /* Report the worker failed during the application of the change */
> >> + pgstat_report_subscription_error(MyLogicalRepWorker->subid, true);
> >>
> >>
> >> Why don't these use MySubscription->oid instead of MyLogicalRepWorker->subid?
>
> > It's just because we used to use MyLogicalRepWorker->subid, is there
> > any particular reason why we should use MySubscription->oid here?
>
> I felt MySubscription->oid is a more natural and more direct way of
> expressing the same thing.
>
> Consider: "the oid of the current subscription" versus "the oid of
> the subscription of the current worker". IMO the first one is simpler.
> YMMV.
>
> Also, it is shorter :)

Changed.

>
> ~~~
>
> 11. src/include/pgstat.h - enum order
>
> (follow-on from my previous v1 review comment #10)
>
> >I assume you're concerned about binary compatibility or something. I
> >think it should not be a problem since both
> >PGSTAT_MTYPE_SUBWORKERERROR and PGSTAT_MTYPE_SUBSCRIPTIONPURGE are
> >introduced to PG15.
>
> Yes, maybe it is OK for those ones. But now in v2 there is a new
> PGSTAT_MTYPE_RESETSUBCOUNTER.
>
> Shouldn't at least that one be put at the end for the same reason?

I think it's better to put it near similar RESET enums.

>
> ~~~
>
> 12. src/include/pgstat.h - PgStat_MsgResetsubcounter
>
> Maybe a better name for this is "PgStat_MsgResetsubcounters" (plural)?

I think it's better to be consistent with other similar message
structs (e.g., msg_resetsharedcounter and msg_resetslrucounter).

>
> ~~~
>
> 14. src/tools/pgindent/typedefs.list
>
> PgStat_MsgResetsubcounter (from pgstat.h) is missing?
>

Added.

> ------
>
> 15. pg_stat_subscription_activity view name?
>
> Has the view name already been decided or still under discussion - I
> was not sure?
>
> If is it already decided then fine, but if not then my vote would be
> for something different like:
> e.g.1 - pg_stat_subscription_errors
> e.g.2 - pg_stat_subscription_counters
> e.g.3 - pg_stat_subscription_metrics
>
> Maybe "activity" was chosen to be deliberately vague in case some
> future unknown stats columns get added? But it means now there is a
> corresponding function "pg_stat_reset_subscription_activity", when in
> practice you don't really reset activity - what you want to do is
> reset some statistics *about* the activity... so it all seems a bit
> odd to me.

Yes, it still needs discussion.

I've attached the updated patch.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachment Content-Type Size
v3-0001-Reconsider-pg_stat_subscription_workers-view.patch application/octet-stream 71.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Brar Piening 2022-02-24 16:07:07 Re: Add id's to various elements in protocol.sgml
Previous Message Andres Freund 2022-02-24 15:46:48 Re: create_index test fails when synchronous_commit = off @ master