Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: cca5507 <cca5507(at)qq(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state
Date: 2025-06-27 10:29:53
Message-ID: CAFPTHDYSQipcO_+GNt-ZQsk6cidt9Lc4PkcdvO7jnrugiUw0eg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 27, 2025 at 3:48 PM cca5507 <cca5507(at)qq(dot)com> wrote:
>
> Hi,
>
> I refactor the code and fix the git apply warning according to [1].
>
> Here are the new version patches.
>
> --
> Regards,
> ChangAo Chen
>
> [1] https://www.postgresql.org/message-id/Zrmh7X8jYCbFYXjH%40ip-10-97-1-34.eu-west-3.compute.internal

I see this problem is similar to the bug reported in [1], and your fix
also addresses the issue reported there. Although I like your approach
of tracking changes starting from the BUILDING_SNAPSHOT state, I’d
like to suggest an alternative.

While debugging that issue, my plan was not to track catalog changes
prior to SNAPBUILD_CONSISTENT, but instead to ensure we don’t use
snapshots built before SNAPBUILD_CONSISTENT, since we don’t track
catalog changes in those states. We should discard previously built
snapshots and rebuild them once we reach the SNAPBUILD_CONSISTENT
state. At that point, all necessary transactions would have been
committed, and builder->xmin would have advanced enough to decode all
transactions from then on.

The problem is that previously built snapshots hang around without the
latest xmin and xmax, and we tend to reuse them. We should ensure that
all txn->base_snapshot and builder->snapshot snapshots built in the
SNAPBUILD_FULL_SNAPSHOT state are rebuilt once we reach
SNAPBUILD_CONSISTENT. For this, we need to track when each snapshot was
built. ReorderBufferTXN already has a field we can use for that,
'base_snapshot_lsn': if base_snapshot_lsn < builder->start_decoding_at,
then the snapshot predates the consistent point and we should rebuild
it. Just a thought.
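
A rough standalone sketch of that check, to make the comparison concrete.
This is a toy model and not PostgreSQL code: the Toy* structs, the
has_base_snapshot flag, the helper names, and the current_lsn parameter
are all hypothetical. Only the names base_snapshot_lsn, start_decoding_at,
and the SNAPBUILD_* states are taken from the discussion above. It also
assumes the check is only applied once the builder is in
SNAPBUILD_CONSISTENT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for illustration; NOT the real definitions. */
typedef uint64_t XLogRecPtr;

typedef enum
{
    SNAPBUILD_START = -1,
    SNAPBUILD_BUILDING_SNAPSHOT = 0,
    SNAPBUILD_FULL_SNAPSHOT = 1,
    SNAPBUILD_CONSISTENT = 2
} SnapBuildState;

typedef struct ToySnapBuild
{
    SnapBuildState state;
    XLogRecPtr  start_decoding_at;  /* LSN from which decoding is safe */
} ToySnapBuild;

typedef struct ToyTXN
{
    bool        has_base_snapshot;  /* hypothetical flag */
    XLogRecPtr  base_snapshot_lsn;  /* LSN at which the snapshot was set */
} ToyTXN;

/*
 * Hypothetical helper: a base snapshot counts as stale when the builder
 * is consistent and the snapshot was set before start_decoding_at.
 */
static bool
toy_base_snapshot_is_stale(const ToySnapBuild *builder, const ToyTXN *txn)
{
    assert(builder != NULL && txn != NULL);

    if (builder->state != SNAPBUILD_CONSISTENT)
        return false;           /* assumption: only rebuild once consistent */
    if (!txn->has_base_snapshot)
        return false;           /* nothing to rebuild */

    return txn->base_snapshot_lsn < builder->start_decoding_at;
}

/*
 * Hypothetical helper: "rebuild" a stale snapshot. The toy version only
 * records the new LSN; real code would also have to release the old
 * snapshot and build a fresh one. Returns true if a rebuild happened.
 */
static bool
toy_refresh_base_snapshot(const ToySnapBuild *builder, ToyTXN *txn,
                          XLogRecPtr current_lsn)
{
    if (!toy_base_snapshot_is_stale(builder, txn))
        return false;

    txn->base_snapshot_lsn = current_lsn;
    return true;
}
```

With start_decoding_at = 1000 and a snapshot taken at LSN 900, the helper
reports the snapshot as stale once the builder is consistent; after a
refresh at LSN 1200 it no longer does.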

regards,
Ajin Cherian
Fujitsu Australia

[1] - https://www.postgresql.org/message-id/18509-983f064d174ea880%40postgresql.org
