Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, exclusion(at)gmail(dot)com
Subject: Re: Fix race condition in InvalidatePossiblyObsoleteSlot()
Date: 2024-02-15 11:24:59
Message-ID: Zc30i8fB7VYrTVw+@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Thu, Feb 15, 2024 at 02:09:45PM +0900, Michael Paquier wrote:
> On Thu, Jan 18, 2024 at 02:20:28PM +0000, Bertrand Drouvot wrote:
> > On Thu, Jan 18, 2024 at 04:59:39PM +0530, Bharath Rupireddy wrote:
> >> I'm not sure if it
> >> can happen at all, but I think we can rely on previous conflict reason
> >> instead of previous effective_xmin/effective_catalog_xmin/restart_lsn.
> >
> > I'm not sure what you mean here. I think we should still keep the "initial" LSN
> > so that the next check done with it still makes sense. The previous conflict
> > reason as you're proposing also makes sense to me but for another reason: PANIC
> > in case the issue still happen (for cases we did not think about, means not
> > covered by what the added previous LSNs are covering).
>
> Using a PANIC seems like an overreaction to me for this path. Sure,
> we should not record twice a conflict reason, but wouldn't an
> assertion be more adapted enough to check that a termination is not
> logged twice for this code path run in the checkpointer?

Thanks for looking at it!

Agree, using an assertion instead in v3 attached.

>
> + if (!terminated)
> + {
> + initial_restart_lsn = s->data.restart_lsn;
> + initial_effective_xmin = s->effective_xmin;
> + initial_catalog_effective_xmin = s->effective_catalog_xmin;
> + }
>
> It seems important to document why we are saving this data here; while
> we hold the LWLock for repslots, before we perform any termination,
> and we want to protect conflict reports with incorrect data if the
> slot data got changed while the lwlock is temporarily released while
> there's a termination.

Yeah, comments added in v3.

> >> 3. Is there a way to reproduce this racy issue, may be by adding some
> >> sleeps or something? If yes, can you share it here, just for the
> >> records and verify the whatever fix provided actually works?
> >
> > Alexander was able to reproduce it on a slow machine and the issue was not there
> > anymore with v1 in place. I think it's tricky to reproduce as it would need the
> > slot to advance between the 2 checks.
>
> I'd really want a test for that in the long term. And an injection
> point stuck between the point where ReplicationSlotControlLock is
> released then re-acquired when there is an active PID should be
> enough, isn't it?

Yeah, that should be enough.

> For example just after the kill()? It seems to me
> that we should stuck the checkpointer, perform a forced upgrade of the
> xmins, release the checkpointer and see the effects of the change in
> the second loop. Now, modules/injection_points/ does not allow that,
> yet, but I've posted a patch among these lines that can stop a process
> and release it using a condition variable (should be 0006 on [1]). I
> was planning to start a new thread with a patch posted for the next CF
> to add this kind of facility with a regression test for an old bug,
> the patch needs a few hours of love, first. I should be able to post
> that next week.

Great, that looks like a good idea!

Are we going to fix this on master and 16 stable first and then later on add a
test on master with the injection points?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v3-0001-Fix-race-condition-in-InvalidatePossiblyObsoleteS.patch text/x-diff 3.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2024-02-15 11:26:13 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message Hayato Kuroda (Fujitsu) 2024-02-15 11:23:16 RE: speed up a logical replica setup