Re: Skipping logical replication transactions on subscriber side

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2021-09-13 15:43:09
Message-ID: CAD21AoB0dHrh8K+Qk39-nzd2T=Jak28-gGNgqM76EYs=GTyfkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 10, 2021 at 8:46 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> On Fri, Sep 10, 2021 at 12:33 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Sorry for the late response. I've attached the updated patches that
> > incorporate all comments unless I missed something. Please review
> > them.
> >
>
> Here's some review comments for the v13-0001 patch:

Thank you for the comments!

>
> doc/src/sgml/monitoring.sgml
>
> (1)
> There's an extra space in the following line, before "processing":
>
> + OID of the relation that the worker was processing when the

Fixed.

>
> (2) Suggested wording update:
> BEFORE:
> + field is always NULL if the error is reported by
> AFTER:
> + field is always NULL if the error is reported by the

Fixed.

>
> (3) Suggested wording update:
> BEFORE:
> + by <literal>tablesync</literal> worker.
> AFTER:
> + by the <literal>tablesync</literal> worker.

Fixed.

>
> (4)
> Missing "." at end of following description (inconsistent with other doc):
>
> + Time at which these statistics were last reset

Fixed.

>
> (5) Suggested wording update:
> BEFORE:
> + can be granted EXECUTE to run the function.
> AFTER:
> + can be granted EXECUTE privilege to run the function.

Since descriptions of other stats reset functions don't use "EXECUTE
privilege" so I think it'd be better to leave it for consistency.

>
>
> src/backend/postmaster/pgstat.c
>
> (6) Suggested wording update:
> BEFORE:
> + * for this relation already completes or the table is no
> AFTER:
> + * for this relation already completed or the table is no

Fixed.

>
>
> (7)
> In the code below, since errmsg.m_nentries only ever gets incremented
> by the 1st IF condition, it's probably best to include the 2nd IF
> block within the 1st IF condition. Then can avoid checking
> "errmsg.m_nentries" each loop iteration.
>
> + if (hash_search(not_ready_rels_htab, (void *) &(errent->relid),
> + HASH_FIND, NULL) == NULL)
> + errmsg.m_relids[errmsg.m_nentries++] = errent->relid;
> +
> + /*
> + * If the message is full, send it out and reinitialize to
> + * empty
> + */
> + if (errmsg.m_nentries >= PGSTAT_NUM_SUBSCRIPTIONERRPURGE)
> + {
> + len = offsetof(PgStat_MsgSubscriptionErrPurge, m_relids[0])
> + + errmsg.m_nentries * sizeof(Oid);
> +
> + pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
> + pgstat_send(&errmsg, len);
> + errmsg.m_nentries = 0;
> + }

Agreed. Instead of including the 2nd if block within the 1st if block,
I changed the 1st if condition to check the opposite condition and
continued the loop if it's true (i.g., the table is still under table
synchronization).

>
>
> (8)
> + * Tell the collector about reset the subscription error.
>
> Is this meant to say "Tell the collector to reset the subscription error." ?

Yes, fixed.

>
>
> (9)
> I think the following:
>
> + len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg);
>
> should be:
>
> + len = offsetof(PgStat_MsgSubscriptionErr, m_errmsg[0]) + strlen(errmsg) + 1;
>
> to account for the \0 terminator.

Fixed.

>
> (10)
> I don't think that using the following Assert is really correct here,
> because PgStat_MsgSubscriptionErr is not setup to have the maximum
> number of m_errmsg[] entries to fill up to PGSTAT_MAX_MSG_SIZE (as are
> some of the other pgstat structs):
>
> + Assert(len < PGSTAT_MAX_MSG_SIZE);
>
> (the max size of all of the pgstat structs is statically asserted anyway)
>
> It would be correct to do the following instead:
>
> + Assert(strlen(errmsg) < PGSTAT_SUBSCRIPTIONERR_MSGLEN);
>
> The overflow is guarded by the strlcpy() in any case.

Agreed. Fixed.

>
> (11)
> Would be better to write:
>
> + rc = fwrite(&nerrors, sizeof(nerrors), 1, fpout);
>
> instead of:
>
> + rc = fwrite(&nerrors, sizeof(int32), 1, fpout);
>
>
> (12)
> Would be better to write:
>
> + if (fread(&nerrors, 1, sizeof(nerrors), fpin) != sizeof(nerrors))
>
> instead of:
>
> + if (fread(&nerrors, 1, sizeof(int32), fpin) != sizeof(int32))
>
>

Agreed.

> src/include/pgstat.h
>
> (13)
> BEFORE:
> + * update/reset the error happening during logical
> AFTER:
> + * update/reset the error occurring during logical
>

Fixed.

> (14)
> Typo: replicatoin -> replication
>
> + * an error that occurred during application of logical replicatoin or
>

Fixed.

>
> (15) Suggested wording update:
> BEFORE:
> + * there is no table sync error, where is the common case in practice.
> AFTER:
> + * there is no table sync error, which is the common case in practice.
>

Fixed.

I'll submit the updated patches.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Cleveland 2021-09-13 15:49:48 64 bit TID?
Previous Message Euler Taveira 2021-09-13 15:09:20 Re: TAP test for recovery_end_command