Re: Single transaction in the tablesync worker?

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Single transaction in the tablesync worker?
Date: 2021-01-23 00:25:11
Message-ID: CAHut+PskLBw3vW-Qa07KK1JsG8Wuj8p4xs-Ppjs6oa6mbRMJSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 21, 2021 at 9:17 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jan 19, 2021 at 2:32 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Amit.
> >
> > PSA the v17 patch for the Tablesync Solution1.
> >
>
> Thanks for the updated patch. Below are few comments:
> 1. Why are we changing the scope of PG_TRY in DropSubscription()?
> Also, it might be better to keep the replication slot drop part as it
> is.
>

The latest patch [v18] was re-designed to make tablesync slots as
TEMPORARY [ak0122], so this code in DropSubscription is modified a
lot. This review comment is not applicable anymore.

> 2.
> - * - Tablesync worker finishes the copy and sets table state to SYNCWAIT;
> - * waits for state change.
> + * - Tablesync worker does initial table copy; there is a
> FINISHEDCOPY state to
> + * indicate when the copy phase has completed, so if the worker crashes
> + * before reaching SYNCDONE the copy will not be re-attempted.
>
> In the last line, shouldn't the state be FINISHEDCOPY instead of SYNCDONE?
>

OK. The code comment was correct, but maybe confusing. I have reworded
it in the latest patch [v18].

> 3.
> +void
> +tablesync_cleanup_at_interrupt(void)
> +{
> + bool drop_slot_needed;
> + char originname[NAMEDATALEN] = {0};
> + RepOriginId originid;
> + TimeLineID tli;
> + Oid subid = MySubscription->oid;
> + Oid relid = MyLogicalRepWorker->relid;
> +
> + elog(DEBUG1,
> + "tablesync_cleanup_at_interrupt for relid = %d",
> + MyLogicalRepWorker->relid);
>
> The function name and message makes it sound like that we drop slot
> and origin at any interrupt. Isn't it better to name it as
> tablesync_cleanup_at_shutdown()?
>

The latest patch [v18] was re-designed to make tablesync slots as
TEMPORARY [ak0122], so this cleanup function is removed. This review
comment is not applicable anymore.

> 4.
> + drop_slot_needed =
> + wrconn != NULL &&
> + MyLogicalRepWorker->relstate != SUBREL_STATE_SYNCDONE &&
> + MyLogicalRepWorker->relstate != SUBREL_STATE_READY;
> +
> + if (drop_slot_needed)
> + {
> + char syncslotname[NAMEDATALEN] = {0};
> + bool missing_ok = true; /* no ERROR if slot is missing. */
>
> I think we can avoid using missing_ok and drop_slot_needed variables.
>

The latest patch [v18] was re-designed to make tablesync slots as
TEMPORARY [ak0122], so this code no longer exists. This review comment
is not applicable anymore.

> 5. Can we drop the origin along with the slot in
> process_syncing_tables_for_sync() instead of
> process_syncing_tables_for_apply()? I think this is possible because
> of the other changes you made in origin.c. Also, if possible, we can
> try to use the same code to drop the slot and origin in
> tablesync_cleanup_at_interrupt and process_syncing_tables_for_sync.
>

No, the origin tracking cannot be dropped by the tablesync worker for
the normal use-case even with my modified origin.c; it would fail
during the commit TX because while trying to do
replorigin_session_advance it would find the asserted origin id was
not there anymore.

Also, the latest patch [v18] was re-designed to make tablesync slots
as TEMPORARY [ak0122], so the tablesync_cleanup_at_interrupt function
no longer exists (so the origin.c change of v17 has also been
removed).

> 6.
> + if (MyLogicalRepWorker->relstate == SUBREL_STATE_FINISHEDCOPY)
> + {
> + /*
> + * The COPY phase was previously done, but tablesync then crashed/etc
> + * before it was able to finish normally.
> + */
>
> There seems to be a typo (crashed/etc) in the above comment.
>

OK. Fixed in latest patch [v18].

----
[ak0122] = https://www.postgresql.org/message-id/CAA4eK1LS0_mdVx2zG3cS%2BH88FJiwyS3kZi7zxijJ_gEuw2uQ2g%40mail.gmail.com
[v18] = https://www.postgresql.org/message-id/CAHut%2BPvm0R%3DMn_uVN_JhK0scE54V6%2BEDGHJg1WYJx0Q8HX_mkQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-01-23 02:37:23 Re: Some more hackery around cryptohashes (some fixes + SHA1)
Previous Message Peter Smith 2021-01-23 00:16:43 Re: Single transaction in the tablesync worker?