Re: Skipping logical replication transactions on subscriber side

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2021-11-07 14:21:20
Message-ID: CAD21AoAvqBA3FiZS-2Jbh0bj_Hp0AmWu5o-RsSLNgFr1BMiYow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 5, 2021 at 12:57 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Fri, Oct 29, 2021 at 10:55 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > >
> > > > > >
> > > > > > I've attached updated patches.
> > > >
> > > > Thank you for the comments!
> > > >
> > > > >
> > > > > Few comments:
> > > > > ==============
> > > > > 1. Is the patch cleaning tablesync error entries except via vacuum? If
> > > > > not, can't we send a message to remove tablesync errors once tablesync
> > > > > is successful (say when we reset skip_xid or when tablesync is
> > > > > finished) or when we drop subscription? I think the same applies to
> > > > > apply worker. I think we may want to track it in some way whether an
> > > > > error has occurred before sending the message but relying completely
> > > > > on a vacuum might be the recipe of bloat. I think in the case of a
> > > > > drop subscription we can simply send the message as that is not a
> > > > > frequent operation. I might be missing something here because in the
> > > > > tests after drop subscription you are expecting the entries from the
> > > > > view to get cleared
> > > >
> > > > Yes, I think we can have tablesync worker send a message to drop stats
> > > > once tablesync is successful. But if we do that also when dropping a
> > > > subscription, I think we need to do that only the transaction is
> > > > committed since we can drop a subscription that doesn't have a
> > > > replication slot and rollback the transaction. Probably we can send
> > > > the message only when the subscritpion does have a replication slot.
> > > >
> > >
> > > Right. And probably for apply worker after updating skip xid.
> >
> > I'm not sure it's better to drop apply worker stats after resetting
> > skip xid (i.g., after skipping the transaction). Since the view is a
> > cumulative view and has last_error_time, I thought we can have the
> > apply worker stats until the subscription gets dropped. Since the
> > error reporting message could get lost, no entry in the view doesn’t
> > mean the worker doesn’t face an issue.
> >
> > >
> > > > In other cases, we can remember the subscriptions being dropped and
> > > > send the message to drop the statistics of them after committing the
> > > > transaction but I’m not sure it’s worth having it.
> > > >
> > >
> > > Yeah, let's not go to that extent. I think in most cases subscriptions
> > > will have corresponding slots.
> >
> > Agreed.
> >
> > >
> > > FWIW, we completely
> > > > rely on pg_stat_vacuum_stats() for cleaning up the dead tables and
> > > > functions. And we don't expect there are many subscriptions on the
> > > > database.
> > > >
> > >
> > > True, but we do send it for the database, so let's do it for the cases
> > > you explained in the first paragraph.
> >
> > Agreed.
> >
> > I've attached a new version patch. Since the syntax of skipping
> > transaction id is under the discussion I've attached only the error
> > reporting patch for now.
>
> Thanks for the updated patch, few comments:
> 1) This check and return can be moved above CreateTemplateTupleDesc so
> that the tuple descriptor need not be created if there is no worker
> statistics
> + BlessTupleDesc(tupdesc);
> +
> + /* Get subscription worker stats */
> + wentry = pgstat_fetch_subworker(subid, subrelid);
> +
> + /* Return NULL if there is no worker statistics */
> + if (wentry == NULL)
> + PG_RETURN_NULL();
> +
> + /* Initialise values and NULL flags arrays */
> + MemSet(values, 0, sizeof(values));
> + MemSet(nulls, 0, sizeof(nulls));
>
> 2) "NULL for the main apply worker" is mentioned as "null for the main
> apply worker" in case of pg_stat_subscription view, we can mention it
> similarly.
> + <para>
> + OID of the relation that the worker is synchronizing; NULL for the
> + main apply worker
> + </para></entry>
>
> 3) Variable assignment can be done during declaration and this the
> assignment can be removed
> + i = 0;
> + /* subid */
> + values[i++] = ObjectIdGetDatum(subid);
>
> 4) I noticed that the worker error is still present when queried from
> pg_stat_subscription_workers even after conflict is resolved in the
> subscriber and the worker proceeds with applying the other
> transactions, should this be documented somewhere?
>
> 5) This needs to be aligned, the columns in select have used TAB, we
> should align it using spaces.
> +CREATE VIEW pg_stat_subscription_workers AS
> + SELECT
> + w.subid,
> + s.subname,
> + w.subrelid,
> + w.relid,
> + w.command,
> + w.xid,
> + w.error_count,
> + w.error_message,
> + w.last_error_time,
> + w.stats_reset
>

Thank you for the comments! These comments are incorporated into the
latest (v20) patch I just submitted[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoAT42mhcqeB1jPfRL1%2BEUHbZk8MMY_fBgsyZvJeKNpG%2Bw%40mail.gmail.com

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhihong Yu 2021-11-07 14:48:24 Re: [PROPOSAL] new diagnostic items for the dynamic sql
Previous Message Masahiko Sawada 2021-11-07 14:19:48 Re: Skipping logical replication transactions on subscriber side