Re: Single transaction in the tablesync worker?

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Single transaction in the tablesync worker?
Date: 2021-02-01 02:54:37
Message-ID: CAA4eK1K_g43=H4NgUU1jJPBDQhPddJPVOK5eSh8XKZtAHvSisQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 1, 2021 at 6:48 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Sun, Jan 31, 2021 at 12:19 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > I have made the below changes in the patch. Let me know what you think
> > about these?
> > 1. It was a bit difficult to understand the code in DropSubscription
> > so I have rearranged the code to match the way we are doing in HEAD
> > where we drop the slots at the end after finishing all the other
> > cleanup.
>
> There was a reason why the v22 logic was different from HEAD.
>
> The broken connection leaves dangling slots which is unavoidable.
>

I think this is true only when the user specifically requested it by
the use of "ALTER SUBSCRIPTION ... SET (slot_name = NONE)", right?
Otherwise, we give an error on a broken connection. Also, if that is
true then is there a reason to pass missing_ok as true while dropping
tablesync slots?

> But,
> whereas the user knows the name of the Subscription slot (they named
> it), there is no easy way for them to know the names of the remaining
> tablesync slots unless we log them.
>
> That is why the v22 code was written to process the tablesync slots
> even for wrconn == NULL so their name could be logged:
> elog(WARNING, "no connection; cannot drop tablesync slot \"%s\".",
> syncslotname);
>
> The v23 patch removed this dangling slot name information, so it makes
> it difficult for the user to know what tablesync slots to cleanup.
>

Okay, then can we think of combining with the existing error of the
replication slot? I think that might produce a very long message, so
another idea could be to LOG a separate WARNING for each such slot
just before giving the error.

> > 2. In AlterSubscription_refresh(), we can't allow workers to be
> > stopped at commit time as we have already dropped the slots because
> > the worker can access the dropped slot. We need to stop the workers
> > before dropping slots. This makes all the code related to
> > logicalrep_worker_stop_at_commit redundant.
>
> OK.
>
> > 3. In AlterSubscription_refresh(), we need to acquire the lock on
> > pg_subscription_rel only when we try to remove any subscription rel.
>
> + if (!sub_rel_locked)
> + {
> + rel = table_open(SubscriptionRelRelationId, AccessExclusiveLock);
> + sub_rel_locked = true;
> + }
>
> OK. But the sub_rel_locked bool is not really necessary. Why not just
> check for rel == NULL? e.g.
> if (!rel)
> rel = table_open(SubscriptionRelRelationId, AccessExclusiveLock);
>

Okay, that seems to be better, will change accordingly.

> > 4. Added/Changed quite a few comments.
> >
>
> @@ -1042,6 +1115,31 @@ DropSubscription(DropSubscriptionStmt *stmt,
> bool isTopLevel)
> }
> list_free(subworkers);
>
> + /*
> + * Tablesync resource cleanup (slots and origins).
>
> The comment is misleading; this code is only dropping origins.
>

Okay, will change to something like: "Cleanup of tablesync replication origins."

> @@ -73,20 +73,6 @@ typedef struct LogicalRepWorkerId
> Oid relid;
> } LogicalRepWorkerId;
>
> -typedef struct StopWorkersData
> -{
> - int nestDepth; /* Sub-transaction nest level */
> - List *workers; /* List of LogicalRepWorkerId */
> - struct StopWorkersData *parent; /* This need not be an immediate
> - * subtransaction parent */
> -} StopWorkersData;
>
> Since v23 removes that typedef from the code, don't you also have to
> remove it from src/tools/pgindent/typedefs.list?
>

Yeah.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2021-02-01 03:06:09 Re: Asynchronous Append on postgres_fdw nodes.
Previous Message Amit Langote 2021-02-01 02:46:15 Re: [sqlsmith] Failed assertion during partition pruning