Re: Deadlock between logrep apply worker and tablesync worker

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Deadlock between logrep apply worker and tablesync worker
Date: 2023-02-07 04:12:00
Message-ID: CAHut+PuE3fjHXfHhfUyO5gcrBPpg50EVrjoVQwthOUOHnuuVUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 3, 2023 at 6:58 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
...
> > Right, I think that case could be addressed by Tom's patch to some extent but
> > I am thinking we should also try to analyze if we can completely avoid the need
> > to remove origins from both processes. One idea could be to introduce
> > another relstate something like PRE_SYNCDONE and set it in a separate
> > transaction before we set the state as SYNCDONE and remove the slot and
> > origin in tablesync worker.
> > Now, if the tablesync worker errors out due to some reason during the second
> > transaction, it can remove the slot and origin after restart by checking the state.
> > However, it would add another relstate which may not be the best way to
> > address this problem. Anyway, that can be accomplished as a separate patch.
>
> Here is an attempt to achieve the same.
> Basically, the patch removes the code that drop the origin in apply worker. And
> add a new state PRE_SYNCDONE after synchronization finished in front of apply
> (sublsn set), but before dropping the origin and other final cleanups. The
> tablesync will restart and redo the cleanup if it failed after reaching the new
> state. Besides, since the changes can already be applied on the table in
> PRE_SYNCDONE state, so I also modified the check in
> should_apply_changes_for_rel(). And some other conditions for the origin drop
> in subscription commands are were adjusted in this patch.
>

Here are some review comments for the 0001 patch

======
General Comment

0.
The idea of using the extra relstate for clean-up seems OK, but the
implementation of the new state in this patch appears misordered and
misnamed to me.

The state name should indicate what it is doing (PRE_SYNCDONE is
meaningless). The patch describes in several places that this state
means "synchronized, but not yet cleaned up" therefore IMO it means
the SYNCDONE state should be *before* this new state. And since this
new state is for "cleanup" then let's call it something like that.

To summarize, I don’t think the meaning of SYNCDONE should be touched.
SYNCDONE means the synchronization is done, same as before. And your
new "cleanup" state belongs directly *after* that. IMO it should be
like this:

1. STATE_INIT
2. STATE_DATASYNC
3. STATE_FINISHEDCOPY
4. STATE_SYNCDONE
5. STATE_CLEANUP <-- new relstate
6. STATE_READY

Of course, this is going to impact almost every aspect of the patch,
but I think everything will be basically the same as you have it now
-- only all the state names and comments need to be adjusted according
to the above.

======
Commit Message

1.
Previously, we allowed the apply worker to drop the origin to avoid the case
that the tablesync worker fails to the origin(due to crash). In this case we
don't restart the tablesync worker, and the apply worker can clean the origin.

~

There seem to be some words missing in this paragraph.

SUGGESTION
Previously, we allowed the apply worker to drop the origin as a way to
recover from the scenario where the tablesync worker failed to drop it
(due to crash).

~~~

2.
To improve this, we introduce a new relstate SUBREL_STATE_PRE_SYNCDONE which
will be set after synchronization finished in front of apply (sublsn set), but
before dropping the origin and other final cleanups. The apply worker will
restart tablesync worker if the relstate is SUBREL_STATE_PRE_SYNCDONE. This
way, even if the tablesync worker error out in the transaction that tries to
drop the origin, the apply worker will restart the tablesync worker to redo the
cleanup(for origin and other stuff) and then directly exit.

~

2a.
This is going to be impacted by my "General Comment". Notice how you
describe again "will be set after synchronization finished". Evidence
again this means
the new CLEANUP state should directly follow the SYNCDONE state.

2b.
"error out” --> "encounters an error"

2c.
"cleanup(for origin" --> space before the "("

======
doc/src/sgml/catalogs.sgml

3.
@@ -8071,7 +8071,8 @@ SCRAM-SHA-256$<replaceable>&lt;iteration
count&gt;</replaceable>:<replaceable>&l
<literal>i</literal> = initialize,
<literal>d</literal> = data is being copied,
<literal>f</literal> = finished table copy,
- <literal>s</literal> = synchronized,
+ <literal>p</literal> = synchronized but not yet cleaned up,
+ <literal>s</literal> = synchronization done,
<literal>r</literal> = ready (normal replication)
</para></entry>
</row>
@@ -8082,8 +8083,8 @@ SCRAM-SHA-256$<replaceable>&lt;iteration
count&gt;</replaceable>:<replaceable>&l
</para>
<para>
Remote LSN of the state change used for synchronization coordination
- when in <literal>s</literal> or <literal>r</literal> states,
- otherwise null
+ when in <literal>p</literal>, <literal>s</literal> or
+ <literal>r</literal> states, otherwise null
</para></entry>
</row>
</tbody>

This state order and choice of the letter are impacted by my "General Comment".

IMO it should be more like this:

State code: i = initialize, d = data is being copied, f = finished
table copy, s = synchronization done, c = clean-up done, r = ready
(normal replication)

======
src/backend/commands/subscriptioncmds.c

4. AlterSubscription_refresh

Some adjustments are needed according to my "General Comment".

~~~

5. DropSubscription

Some adjustments are needed according to my "General Comment".

======
src/backend/replication/logical/tablesync.c

6.

+ * Update the state of the table to SUBREL_STATE_SYNCDONE and cleanup the
+ * tablesync slot and drop the tablesync's origin tracking.
+ */
+static void
+finish_synchronization(bool restart_after_crash)

6a.
Suggest calling this function something like 'cleanup_after_synchronization'

~

6b.
Some adjustments to states and comments are needed according to my
"General Comment".

~~~

7. process_syncing_tables_for_sync

- MyLogicalRepWorker->relstate = SUBREL_STATE_SYNCDONE;
+ MyLogicalRepWorker->relstate = SUBREL_STATE_PRE_SYNCDONE;
MyLogicalRepWorker->relstate_lsn = current_lsn;

This should just be setting SUBREL_STATE_SYNCDONE how it previously did.

Other states/comments in this function to change according to my
"General Comments".

~

8.
if (rstate->state == SUBREL_STATE_SYNCDONE)
{
/*
* Apply has caught up to the position where the table sync has
* finished. Mark the table as ready so that the apply will just
* continue to replicate it normally.
*/

That should now be checking for SUBREL_STATE_CLEANUPDONE according to
me "General Comment"

~~~

9. process_syncing_tables_for_apply

Some adjustments to states and comments are needed according to my
"General Comment".

~~~

10. LogicalRepSyncTableStart

Some adjustments to states and comments are needed according to my
"General Comment".

======
src/backend/replication/logical/worker.c

11. should_apply_changes_for_rel

Some adjustments to states according to my "General Comment".

======
src/include/catalog/pg_subscription_rel.h

12.
@@ -62,8 +62,10 @@
DECLARE_UNIQUE_INDEX_PKEY(pg_subscription_rel_srrelid_srsubid_index,
6117, Subsc
* NULL) */
#define SUBREL_STATE_FINISHEDCOPY 'f' /* tablesync copy phase is completed
* (sublsn NULL) */
-#define SUBREL_STATE_SYNCDONE 's' /* synchronization finished in front of
- * apply (sublsn set) */
+#define SUBREL_STATE_PRE_SYNCDONE 'p' /* synchronization finished in front of
+ * apply (sublsn set), but the final
+ * cleanup has not yet been performed */
+#define SUBREL_STATE_SYNCDONE 's' /* synchronization complete */
#define SUBREL_STATE_READY 'r' /* ready (sublsn set) */

Some adjustments to states and comments are needed according to my
"General Comment".

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2023-02-07 04:37:06 Re: Time delayed LR (WAS Re: logical replication restrictions)
Previous Message Amit Kapila 2023-02-07 03:40:01 Re: Time delayed LR (WAS Re: logical replication restrictions)