| From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
|---|---|
| To: | ocean_li_996 <ocean_li_996(at)163(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Cc: | "cca5507(at)qq(dot)com" <cca5507(at)qq(dot)com> |
| Subject: | RE: Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT |
| Date: | 2026-01-27 02:39:11 |
| Message-ID: | TY4PR01MB16907DEEC21EA1D8CE5C280809490A@TY4PR01MB16907.jpnprd01.prod.outlook.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Saturday, November 22, 2025 5:28 PM ocean_li_996 <ocean_li_996(at)163(dot)com> wrote:
> # Problem description
>
> When in the BUILDING_SNAPSHOT state, the snapshot builder does not track the
> status of any transaction. It can lead to missing transaction states when: --
> The transaction commits before the builder reaches FULL_SNAPSHOT state, and --
> The transaction's xid is greater than or equal to builder->xmin when the builder
> reaches FULL_SNAPSHOT state.
>
> Once in FULL_SNAPSHOT state, the builder constructs a base snapshot using
> incomplete transaction state information. This results in an incorrect base
> snapshot, which can cause unpredictable behavior during subsequent decoding. The
> case provided in v6-0002 attachment reproduces the issue (provided by ChangAo
> Chen).
I agree that this is clearly a bug and should be fixed.
>
> # Code-level analysis
>
> SnapBuildCommitTxn does consider transaction processing during the
> BUILDING_SNAPSHOT state. However, it is only called from xact_decode ->
> DecodeCommit. xact_decode does not process any xact record when snapshot builder
> have not yet reached the FULL_SNAPSHOT state, meaning those commits are ignored.
> Similarly, other functions marking transaction having catalog changes (e.g.,
> heap2_decode) also do not handle records before reaching the FULL_SNAPSHOT
> state.
The analysis looks correct to me.
> # Possible fixes
>
> 1. Replace snapshot at the time we reach CONSISTENT state.
>
> Ajin Cherian in [4] and my initial thought was that although the snapshot at
> FULL_SNAPSHOT state might be wrong, the snapshot at CONSISTENT state is
> guaranteed to be correct. Since decoding always starts after reaching
> CONSISTENT state, we could update both the reorder buffer and the builder
> snapshot with the one captured at CONSISTENT state. However, IMUC, this would
> still cause changes generated before CONSISTENT to carry a wrong snapshot (see
> SnapBuildDistributeSnapshotAndInval).
>
> 2. Track transactions during BUILDING_SNAPSHOT state for snapshot builder If the
> builder does not track transactions in BUILDING_SNAPSHOT state, then we make it
> track them.
>
> 1) ChangAo Chen in v6-0001 attachment provided a fix, already reviewed by
> several people (including me). Bertrand Drouvot in [5] considered the logic a
> bit messy. And I prefer we should make the behavior of snapshot building similar
> in both BUILDING_SNAPSHOT and FULL_SNAPSHOT states, except in cases where a base
> snapshot is needed.
>
> 2) Based on v6-0001, I have provided a minimal fix in v6-0003 (not yet
> reviewed). AFAICS, it resolves the problem, though it records additional useless
> information in the reorder buffer during BUILDING_SNAPSHOT state (which is
> discarded later). This increases memory usage and slightly impacts performance.
> But since snapshot building is infrequent, I consider this acceptable.
>
> 3) I have also prepared a cleaner and more efficient fix in v6-0004 than
> v6-0003, albeit more complex (similar to v6-0001). Provided as an alternative
> reference.
I also thought about how to fix this issue, and my draft idea is similar to what
your v6-0004 implements, e.g., track transactions during BUILDING state but also
avoid unnecessary cost during decoding. So, I think it's on the right track.
Best Regard,
Hou zj
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Fujii Masao | 2026-01-27 02:58:01 | Re: Is abort() still needed in WalSndShutdown()? |
| Previous Message | Amit Langote | 2026-01-27 02:39:00 | Re: unnecessary executor overheads around seqscans |