Re: Single transaction in the tablesync worker?

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Single transaction in the tablesync worker?
Date: 2021-01-12 12:13:04
Message-ID: CAHut+Ps2nVDzMuk-qyyOUJ1L5At_wBM2CMvO1O0Rse77gm_Uag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jan 9, 2021 at 5:44 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Jan 8, 2021 at 2:55 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > On Fri, Jan 8, 2021 at 1:02 PM Hou, Zhijie <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com> wrote:
> > >
> >
> > > 3.
> > > + /*
> > > + * To build a slot name for the sync work, we are limited to NAMEDATALEN -
> > > + * 1 characters.
> > > + *
> > > + * The name is calculated as pg_%u_sync_%u (3 + 10 + 6 + 10 + '\0'). (It's
> > > + * actually the NAMEDATALEN on the remote that matters, but this scheme
> > > + * will also work reasonably if that is different.)
> > > + */
> > > + StaticAssertStmt(NAMEDATALEN >= 32, "NAMEDATALEN too small"); /* for sanity */
> > > +
> > > + syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);
> > >
> > > The comments says syncslotname is limit to NAMEDATALEN - 1 characters.
> > > But the actual size of it is (3 + 10 + 6 + 10 + '\0') = 30,which seems not NAMEDATALEN - 1.
> > > Should we change the comment here?
> > >
> >
> > The comment wording is a remnant from older code which had a
> > differently format slot name.
> > I think the comment is still valid, albeit maybe unnecessary since in
> > the current code the tablesync slot
> > name length is fixed. But I left the older comment here as a safety reminder
> > in case some future change would want to modify the slot name. What do
> > you think?
> >
>
> I find it quite confusing. The comments should reflect the latest
> code. You can probably say in some form that the length of slotname
> shouldn't exceed NAMEDATALEN because of remote node constraints on
> slot name length. Also, probably the StaticAssert on NAMEDATALEN is
> not required.

Modified comment in latest patch [v14]

>
> 1.
> + <para>
> + Additional table synchronization slots are normally transient, created
> + internally and dropped automatically when they are no longer needed.
> + These table synchronization slots have generated names:
> + <quote><literal>pg_%u_sync_%u</literal></quote> (parameters:
> Subscription <parameter>oid</parameter>, Table
> <parameter>relid</parameter>)
> + </para>
>
> The last line seems too long. I think we are not strict for 80 char
> limit in docs but it is good to be close to that, however, this
> appears quite long.

Fixed in latest patch [v14]

>
> 2.
> + /*
> + * Cleanup any remaining tablesync resources.
> + */
> + {
> + char originname[NAMEDATALEN];
> + RepOriginId originid;
> + char state;
> + XLogRecPtr statelsn;
>
> I have already mentioned previously that let's not use this new style
> of code (start using { to localize the scope of variables). I don't
> know about others but I find it difficult to read such a code. You
> might want to consider moving this whole block to a separate function.
>

Removed extra code block in latest patch [v14]

> 3.
> /*
> + * XXX - Should optimize this to avoid multiple
> + * connect/disconnect.
> + */
> + wrconn = walrcv_connect(sub->conninfo, true, sub->name, &err);
>
> I think it is better to avoid multiple connect/disconnect here. In
> this same function, we have connected to the publisher, we should be
> able to use the same connection.
>

Fixed in latest patch [v14]

> 4.
> process_syncing_tables_for_sync()
> {
> ..
> + /*
> + * Cleanup the tablesync slot.
> + */
> + syncslotname = ReplicationSlotNameForTablesync(
> + MySubscription->oid,
> + MyLogicalRepWorker->relid);
> + PG_TRY();
> + {
> + elog(DEBUG1, "process_syncing_tables_for_sync: dropping the
> tablesync slot \"%s\".", syncslotname);
> + ReplicationSlotDropAtPubNode(wrconn, syncslotname);
> + }
> + PG_FINALLY();
> + {
> + pfree(syncslotname);
> + }
> + PG_END_TRY();
> ..
> }
>
> Both here and in DropSubscription(), it seems we are using
> PG_TRY..PG_FINALLY just to free the memory even though
> ReplicationSlotDropAtPubNode already has try..finally. Can we arrange
> code to move allocation of syncslotname inside
> ReplicationSlotDropAtPubNode to avoid additional try..finaly? BTW, if
> the usage of try..finally here is only to free the memory, I am not
> sure if it is required because I think we will anyway Reset the memory
> context where this memory is allocated as part of error handling.
>

Eliminated need for TRY/FINALLY to free syncslotname in latest patch [v14]

> 5.
> #define SUBREL_STATE_DATASYNC 'd' /* data is being synchronized (sublsn
> * NULL) */
> +#define SUBREL_STATE_TCOPYDONE 't' /* tablesync copy phase is completed
> + * (sublsn NULL) */
> #define SUBREL_STATE_SYNCDONE 's' /* synchronization finished in front of
> * apply (sublsn set) */
>
> I am not very happy with the new state name SUBREL_STATE_TCOPYDONE as
> it is quite different from other adjoining state names and somehow not
> going well with the code. How about SUBREL_STATE_ENDCOPY 'e' or
> SUBREL_STATE_FINISHEDCOPY 'f'?
>

Using SUBREL_STATE_FINISHEDCOPY in latest patch [v14]

---
[v14] = https://www.postgresql.org/message-id/CAHut%2BPsPO2vOp%2BP7U2szROMy15PKKGanhUrCYQ0ffpy9zG1V1A%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message torikoshia 2021-01-12 12:21:18 Re: Get memory contexts of an arbitrary backend process
Previous Message Peter Smith 2021-01-12 11:53:51 Re: Single transaction in the tablesync worker?