Re: Assertion failure in SnapBuildInitialSnapshot()

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Pradeep Kumar <spradeepkumar29(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Assertion failure in SnapBuildInitialSnapshot()
Date: 2025-12-18 19:42:18
Message-ID: CAD21AoCFv+eRHRfk09GHjDGfpN6BG9ZYHYeGNVuc-rdc_1BFPQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 9, 2025 at 7:32 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, December 10, 2025 7:25 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Tue, Nov 25, 2025 at 10:25 PM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > On Wednesday, November 26, 2025 2:57 AM Masahiko Sawada
> > <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > On Tue, Nov 25, 2025 at 4:02 AM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com>
> > > > wrote:
> > > > >
> > > > > On Tuesday, November 25, 2025 3:30 AM Masahiko Sawada
> > > > <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > >
> > > > > >
> > > > > > Given that the computation of xmin and catalog_xmin among all slots
> > > > > > could be executed concurrently, could the following scenario happen
> > > > > > where
> > > > > > procArray->replication_slot_xmin and replication_slot_catalog_xmin
> > > > > > procArray->are retreat to a non-invalid
> > > > > > XID?
> > > > > >
> > > > > > 1. Suppose the initial value procArray->replication_slot_catalog_xmin
> > is
> > > > 50.
> > > > > > 2. Process-A updates its owned slot's catalog_xmin to 100, and
> > > > > > computes the new catalog_xmin as 100 while holding
> > > > > > ReplicationSlotControlLock in a shared mode in
> > > > > > ReplicationSlotsComputeRequiredLSN(). But it doesn't update the
> > > > procArray's catalog_xmin value yet.
> > > > > > 3. Process-B updates its owned slot's catalog_xmin to 150, and
> > > > > > computes the new catalog_xmin as 150.
> > > > > > 4. Process-B updates the procArray->replication_slot_catalog_xmin to
> > > > 150.
> > > > > > 5. Process-A updates the procArray->repilcation_slot_catalog_xmin to
> > > > > > 100, which was 150.
> > > > >
> > > > > After further investigation, I think that steps 3 and 4 cannot occur
> > > > > because Process-B must have already encountered the catalog_xmin
> > > > > maintained by Process-A, either 50 or 100. Consequently, Process-B
> > > > > will refrain from updating the catalog_xmin to a more recent value, such
> > as
> > > > 150.
> > > >
> > > > Right. But the following scenario seems to happen:
> > > >
> > > > 1. Both processes have a slot with effective_catalog_xmin = 100.
> > > > 2. Process-A updates effective_catalog_xmin to 150, and computes the
> > new
> > > > catalog_xmin as 100 because process-B slot still has
> > effective_catalog_xmin =
> > > > 100.
> > > > 3. Process-B updates effective_catalog_xmin to 150, and computes the
> > new
> > > > catalog_xmin as 150.
> > > > 4. Process-B updates procArray->replication_slot_catalog_xmin to 150.
> > > > 5. Process-A updates procArray->replication_slot_catalog_xmin to 100.
> > >
> > > I think this scenario can occur, but is not harmful. Because the catalog rows
> > > removed prior to xid:150 would no longer be used, as both slots have
> > advanced
> > > their catalog_xmin and flushed the value to disk. Therefore, even if
> > > replication_slot_catalog_xmin regresses, it should be OK.
> > >
> > > Considering all above, I think allowing concurrent xmin computation, as the
> > > patch does, is acceptable. What do you think ?
> >
> > I agree with your analysis. Another thing I'd like to confirm is that
> > in an extreme case, if the server crashes suddenly after removing
> > catalog tuples older than XID 100 and logical decoding restarts, it
> > ends up missing necessary catalog tuples? I think it's not a problem
> > as long as the subscriber knows the next commit LSN they want but
> > could it be problematic if the user switches to use the logical
> > decoding SQL API? I might be worrying too much, though.
>
> I think this case is not a problem because:
>
> In LogicalConfirmReceivedLocation, the updated restart_lsn and catalog_xmin are
> flushed to disk before the effective_catalog_xmin is updated. Thus, once
> replication_slot_catalog_xmin advances to a certain value, even in the event of
> a crash, users won't encounter any removed tuples when consuming from slots
> after a restart. This is because all slots have their updated restart_lsn
> flushed to disk, ensuring that upon restarting, changes are decoded from the
> updated position where older catalog tuples are no longer needed.

Agreed.

>
> BTW, I assume you meant catalog tuples older than XID 150 are removed, since in
> the previous example, tuples older than XID 100 are already not useful.

Right. Thank you for pointing this out.

I think we can proceed with the idea proposed by Hou-san. Regarding
the patch, while it mostly looks good, it might be worth considering
adding more comments:

- If the caller passes already_locked=true to
ReplicationSlotsComputeRequiredXmin(), the lock order should also be
considered (must acquire RepliationSlotControlLock and then
ProcArrayLock).
- ReplicationSlotsComputeRequiredXmin() can concurrently run by
multiple process, resulting in temporarily moving
procArray->replication_slot_catalog_xmin backward, but it's harmless
because a smaller catalog_xmin is conservative: it merely prevents
VACUUM from removing catalog tuples that could otherwise be pruned. It
does not lead to premature deletion of required data.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2025-12-18 19:49:49 Re: Qual push down to table AM
Previous Message Maxime Schoemans 2025-12-18 19:40:31 Re: Qual push down to table AM