Re: Single transaction in the tablesync worker?

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Single transaction in the tablesync worker?
Date: 2021-02-09 06:32:00
Message-ID: CAHut+PuoDeMTLrkH26A0oi8mW=7VkHaxa5YnPuc-D9wmFLqtKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are my feedback comments for the V29 patch.

====

FILE: logical-replication.sgml

+ slots have generated names:
<quote><literal>pg_%u_sync_%u_%llu</literal></quote>
+ (parameters: Subscription <parameter>oid</parameter>,
+ Table <parameter>relid</parameter>, system
identifier<parameter>sysid</parameter>)
+ </para>

1.
There is a missing space before the sysid parameter.

=====

FILE: subscriptioncmds.c

+ * SUBREL_STATE_FINISHEDCOPY. The apply worker can also
+ * concurrently try to drop the origin and by this time the
+ * origin might be already removed. For these reasons,
+ * passing missing_ok = true from here.
+ */
+ snprintf(originname, sizeof(originname), "pg_%u_%u", sub->oid, relid);
+ replorigin_drop_by_name(originname, true, false);
+ }

2.
Don't really need to say "from here".
(same comment applies multiple places, in this file and in tablesync.c)

3.
Previously the tablesync origin name format was encapsulated in a
common function. IMO it was cleaner/safer how it was before, instead
of the same "pg_%u_%u" cut/paste and scattered in many places.
(same comment applies multiple places, in this file and in tablesync.c)

4.
Calls like replorigin_drop_by_name(originname, true, false); make it
unnecessarily hard to read code when the boolean params are neither
named as variables nor commented. I noticed on another thread [et0205]
there was an idea that having no name/comments is fine because anyway
it is not difficult to figure out when using a "modern IDE", but since
my review tools are only "vi" and "meld" I beg to differ with that
justification.
(same comment applies multiple places, in this file and in tablesync.c)

[et0205] https://www.postgresql.org/message-id/c1d9833f-eeeb-40d5-89ba-87674e1b7ba3%40www.fastmail.com

=====

FILE: tablesync.c

5.
Previously there was a function tablesync_replorigin_drop which was
encapsulating the tablesync origin name formatting. I thought that was
better than the V29 code which now has the same formatting scattered
over many places.
(same comment applies for worker_internal.h)

+ * Determine the tablesync slot name.
+ *
+ * The name must not exceed NAMEDATALEN - 1 because of remote node constraints
+ * on slot name length. We do append system_identifier to avoid slot_name
+ * collision with subscriptions in other clusters. With current scheme
+ * pg_%u_sync_%u_UINT64_FORMAT (3 + 10 + 6 + 10 + 20 + '\0'), the maximum
+ * length of slot_name will be 50.
+ *
+ * The returned slot name is either:
+ * - stored in the supplied buffer (syncslotname), or
+ * - palloc'ed in current memory context (if syncslotname = NULL).
+ *
+ * Note: We don't use the subscription slot name as part of tablesync slot name
+ * because we are responsible for cleaning up these slots and it could become
+ * impossible to recalculate what name to cleanup if the subscription slot name
+ * had changed.
+ */
+char *
+ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char
syncslotname[NAMEDATALEN])
+{
+ if (syncslotname)
+ sprintf(syncslotname, "pg_%u_sync_%u_" UINT64_FORMAT, suboid, relid,
+ GetSystemIdentifier());
+ else
+ syncslotname = psprintf("pg_%u_sync_%u_" UINT64_FORMAT, suboid, relid,
+ GetSystemIdentifier());
+
+ return syncslotname;
+}

6.
"We do append" --> "We append"
"With current scheme" -> "With the current scheme"

7.
Maybe consider to just assign GetSystemIdentifier() to a static
instead of calling that function for every slot?
static uint64 sysid = GetSystemIdentifier();
IIUC the sysid value is never going to change for a process, right?

------
Kind Regards,
Peter Smith.
Fujitsu Australia

On Mon, Feb 8, 2021 at 9:59 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Feb 5, 2021 at 8:40 PM Petr Jelinek
> <petr(dot)jelinek(at)enterprisedb(dot)com> wrote:
> >
> > Hi,
> >
> > We had a bit high-level discussion about this patches with Amit
> > off-list, so I decided to also take a look at the actual code.
> >
>
> Thanks for the discussion and a follow-up review.
>
> > My main concern originally was the potential for left-over slots on
> > publisher, but I think the state now is relatively okay, with couple of
> > corner cases that are documented and don't seem much worse than the main
> > slot.
> >
> > I wonder if we should mention the max_slot_wal_keep_size GUC in the
> > table sync docs though.
> >
>
> I have added the reference of this in Alter Subscription where we
> mentioned the risk of leftover slots. Let me know if you have
> something else in mind?
>
> > Another thing that might need documentation is that the the visibility
> > of changes done by table sync is not anymore isolated in that table
> > contents will show intermediate progress to other backends, rather than
> > switching from nothing to state consistent with rest of replication.
> >
>
> Agreed and updated the docs accordingly.
>
> >
> > Some minor comments about code:
> >
> > > + else if (res->status == WALRCV_ERROR && missing_ok)
> > > + {
> > > + /* WARNING. Error, but missing_ok = true. */
> > > + ereport(WARNING,
> >
> > I wonder if we need to add error code to the WalRcvExecResult and check
> > for the appropriate ones here. Because this can for example return error
> > because of timeout, not because slot is missing.
> >
>
> I think there are both pros and cons of distinguishing the error
> ("slot doesnot exist" from others). The benefit is if there a network
> glitch then the user can probably retry the commands Alter/Drop and it
> will be successful next time. OTOH, say the network is broken for a
> long time and the user wants to proceed but there won't be any way to
> proceed for Alter Subscription ... Refresh or Drop Command. So by
> giving WARNING at least we can provide a way to proceed and then they
> can drop such slots later. We have mentioned this in docs as well. I
> think we can go either way here, let me know what do you think is a
> better way?
>
> > Not sure if it matters
> > for current callers though (but then maybe don't call the param
> > missign_ok?).
> >
>
> Sure, if we decide not to change the behavior as suggested by you then
> this makes sense.
>
> >
> > > +ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char syncslotname[NAMEDATALEN])
> > > +{
> > > + if (syncslotname)
> > > + sprintf(syncslotname, "pg_%u_sync_%u", suboid, relid);
> > > + else
> > > + syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);
> > > +
> > > + return syncslotname;
> > > +}
> >
> > Given that we are now explicitly dropping slots, what happens here if we
> > have 2 different downstreams that happen to get same suboid and reloid,
> > will one of the drop the slot of the other one? Previously with the
> > cleanup being left to temp slot we'd at maximum got error when creating
> > it but with the new logic in LogicalRepSyncTableStart it feels like we
> > could get into situation where 2 downstreams are fighting over slot no?
> >
>
> As discussed, added system_identifier to distinguish subscriptions
> between different clusters.
>
> Apart from fixing the above comment, I have integrated it with the new
> replorigin_drop_by_name() API being discussed in the thread [1] and
> posted that patch just for ease. I have also integrated Osumi-San's
> test case patch with minor modifications.
>
> [1] - https://www.postgresql.org/message-id/CAA4eK1L7mLhY%3DwyCB0qsEGUpfzWfncDSS9_0a4Co%2BN0GUyNGNQ%40mail.gmail.com
>
> --
> With Regards,
> Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-02-09 06:47:42 Re: Support for NSS as a libpq TLS backend
Previous Message Ajin Cherian 2021-02-09 06:27:29 Re: repeated decoding of prepared transactions