Re: Deadlock between logrep apply worker and tablesync worker

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Deadlock between logrep apply worker and tablesync worker
Date: 2023-02-08 09:11:42
Message-ID: CAHut+PsJFB=03C8QipFUq+5Nbm-ZTP=mgg7fajr09tf3YdM9rA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 7, 2023 at 6:46 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tuesday, February 7, 2023 12:12 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > On Fri, Feb 3, 2023 at 6:58 PM houzj(dot)fnst(at)fujitsu(dot)com <houzj(dot)fnst(at)fujitsu(dot)com>
> > wrote:
> > >
> > ...
> > > > Right, I think that case could be addressed by Tom's patch to some
> > > > extent but I am thinking we should also try to analyze if we can
> > > > completely avoid the need to remove origins from both processes. One
> > > > idea could be to introduce another relstate something like
> > > > PRE_SYNCDONE and set it in a separate transaction before we set the
> > > > state as SYNCDONE and remove the slot and origin in tablesync worker.
> > > > Now, if the tablesync worker errors out due to some reason during
> > > > the second transaction, it can remove the slot and origin after restart by
> > checking the state.
> > > > However, it would add another relstate which may not be the best way
> > > > to address this problem. Anyway, that can be accomplished as a separate
> > patch.
> > >
> > > Here is an attempt to achieve the same.
> > > Basically, the patch removes the code that drop the origin in apply
> > > worker. And add a new state PRE_SYNCDONE after synchronization
> > > finished in front of apply (sublsn set), but before dropping the
> > > origin and other final cleanups. The tablesync will restart and redo
> > > the cleanup if it failed after reaching the new state. Besides, since
> > > the changes can already be applied on the table in PRE_SYNCDONE state,
> > > so I also modified the check in should_apply_changes_for_rel(). And
> > > some other conditions for the origin drop in subscription commands are
> > were adjusted in this patch.
> > >
> >
> > Here are some review comments for the 0001 patch
> >
> > ======
> > General Comment
> >
> > 0.
> > The idea of using the extra relstate for clean-up seems OK, but the
> > implementation of the new state in this patch appears misordered and
> > misnamed to me.
> >
> > The state name should indicate what it is doing (PRE_SYNCDONE is
> > meaningless). The patch describes in several places that this state means
> > "synchronized, but not yet cleaned up" therefore IMO it means the SYNCDONE
> > state should be *before* this new state. And since this new state is for
> > "cleanup" then let's call it something like that.
> >
> > To summarize, I don’t think the meaning of SYNCDONE should be touched.
> > SYNCDONE means the synchronization is done, same as before. And your new
> > "cleanup" state belongs directly *after* that. IMO it should be like this:
> >
> > 1. STATE_INIT
> > 2. STATE_DATASYNC
> > 3. STATE_FINISHEDCOPY
> > 4. STATE_SYNCDONE
> > 5. STATE_CLEANUP <-- new relstate
> > 6. STATE_READY
> >
> > Of course, this is going to impact almost every aspect of the patch, but I think
> > everything will be basically the same as you have it now
> > -- only all the state names and comments need to be adjusted according to the
> > above.
>
> Although I agree the CLEANUP is easier to understand, but I am a bit concerned
> that the changes would be a bit invasive.
>
> If we add a CLEANUP state at the end as suggested, it will change the meaning
> of the existing SYNCDONE state, before the change it means both data sync and
> cleanup have been done, but after the change it only mean the data sync is
> over. This also means all the current C codes that considered the SYNCDONE as
> the final state of table sync will need to be changed. Moreover, it's common
> for user to query the relation state from pg_subscription_rel to identify if
> the table sync of a table is finished(e.g. check relstate IN ('r', 's')), but
> if we add a new state(CLEANUP) as the final state, then all these SQLs would
> need to be changed as they need to check like relstate IN ('r', 'x'(new cleanup
> state)).
>

IIUC, you are saying that we still want to keep the SYNCDONE state as
the last state before READY mainly because otherwise there is too much
impact on user/test SQL that is currently checking those ('s','r')
states.

OTOH, in the current 001 patch you had the SUBREL_STATE_PRE_SYNCDONE
meaning "synchronized but not yet cleaned up" (that's verbatim from
your PGDOCS). And there is C code where you are checking
SUBREL_STATE_PRE_SYNCDONE and essentially giving the state before the
SYNCDONE an equal status to the SYNCDONE (e.g.
should_apply_changes_for_rel seemed to be doing this).

It seems to be trying to have an each-way bet...

~~~

But I think there may be an easy way out of this problem:

Current HEAD
1. STATE_INIT 'i'
2. STATE_DATASYNC 'd'
3. STATE_FINISHEDCOPY 'f'
4. STATE_SYNCDONE 's'
5. STATE_READY 'r'

The patch 0001
1. STATE_INIT 'i'
2. STATE_DATASYNC 'd'
3. STATE_FINISHEDCOPY 'f'
4. STATE_PRESYNCDONE 'p' <-- new relstate
4. STATE_SYNCDONE 's'
5. STATE_READY 'r'

My previous suggestion (which you acknowledge is easier to understand,
but might cause hassles for existing SQL)
1. STATE_INIT 'i'
2. STATE_DATASYNC 'd'
3. STATE_FINISHEDCOPY 'f'
4. STATE_SYNCDONE 's'
5. STATE_CLEANUP 'x' <-- new relstate
6. STATE_READY 'r'

SUGGESTED (hack to solve everything?)
1. STATE_INIT 'i'
2. STATE_DATASYNC 'd'
3. STATE_FINISHEDCOPY 'f'
4. STATE_SYNCDONE_PRE_CLEANUP 'x' <-- change the char code for this
existing relstate (was SYNCDONE 's')
5. STATE_SYNCDONE_WITH_CLEANUP 's' <-- new relstate using 's'
6. STATE_READY 'r'

By commandeering the 's' flag for the new CLEANUP state it means no
existing user code or test code needs to change - IIUC everything will
work the same as before.

~

Hmmm -- In hindsight, perhaps I have gone around in a big circle here
and the solution I am describing here is almost exactly the same as
your patch 0001 only with better names for the relstates.

Thoughts?

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message André Verwijs 2023-02-08 09:46:42 PostgreSQL 16 Dev apt-based Linux unable to install
Previous Message Kirill Reshke 2023-02-08 09:10:55 Re: Add sub-transaction overflow status in pg_stat_activity