Backward movement of confirmed_flush resulting in data duplication.

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Subject: Backward movement of confirmed_flush resulting in data duplication.
Date: 2025-05-13 10:17:55
Message-ID: CAJpy0uDZ29P=BYB1JDWMCh-6wXaNqMwG1u1mB4=10Ly0x7HhwQ@mail.gmail.com
Lists: pgsql-hackers

Hi All,

This is a spin-off thread from the earlier discussions at [1] and [2].

While analyzing the slot-sync buildfarm (BF) failure described in [1],
we observed that confirmed_flush_lsn can move backward, depending on
the feedback messages received from the downstream system. We suspected
that this backward movement could result in data duplication. Earlier,
we were able to reproduce the issue with two_phase-enabled
subscriptions (see [2]). On further analysis, it appears that data
duplication can happen without two-phase as well.

With the attached injection-point patch and test script (provided by
Hou-San), the data duplication issue reproduces without the two_phase
option. The problem arises when changes already applied by the table
sync worker are applied again by the apply worker because
confirmed_flush_lsn moves backward after the table sync has completed.

The error expected on the subscriber after executing the test script:
# ERROR: conflict detected on relation "public.tab2": conflict=insert_exists
# DETAIL: Key already exists in unique index "tab2_a_key", modified in transaction ....
# Key (a)=(2); existing local tuple (2); remote tuple (2).

The general steps followed by the attached script are:

1. After adding a new table, tab2, to the publication, refresh the
subscription to initiate a table sync for tab2. Before the state
reaches SYNCDONE, insert some data into tab2. This new insertion will
be replicated by the table sync worker.
2. Disable the subscription and stop the apply worker before changing
the state to READY.
3. Re-enable the subscription and wait for the table sync to finish.
Note that the origin position should not have progressed since step
1.
4. Disable and re-enable the subscription. Given that the origin
position is less than the slot's confirmed_flush_lsn, control the
walsender to stop when the confirmed_flush_lsn moves backward.
5. Disable and re-enable the subscription once more. The slot now
retains the earlier (backward-moved) confirmed_flush_lsn, so the
insertion from step 1 is replicated again.
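
To make the sequence above easier to follow, here is a toy model of it.
This is only an illustrative sketch, not PostgreSQL code: ToySlot,
apply_from, run_scenario, and the LSN values (50, 100, 150) are all
made up for the example. It assumes that streaming starts from the
larger of the requested position and confirmed_flush, and that
feedback overwrites confirmed_flush without any check.

```python
from dataclasses import dataclass


@dataclass
class ToySlot:
    """Hypothetical stand-in for a logical slot on the publisher."""
    confirmed_flush: int

    def start_position(self, requested_lsn: int) -> int:
        # Assumption: streaming never starts before confirmed_flush.
        return max(requested_lsn, self.confirmed_flush)

    def on_feedback(self, flush_lsn: int) -> None:
        # Unguarded update: whatever the subscriber reports is accepted.
        self.confirmed_flush = flush_lsn


def apply_from(slot, origin_lsn, wal, table):
    """Replay changes after the start position; return conflicting keys."""
    start = slot.start_position(origin_lsn)
    conflicts = []
    for lsn, key in wal:
        if lsn <= start:
            continue  # already confirmed, not sent again
        if key in table:
            conflicts.append(key)  # corresponds to insert_exists
        else:
            table.add(key)
    return conflicts


def run_scenario():
    wal = [(100, 2)]   # the insert into tab2, at an invented LSN 100
    table = {2}        # row already copied by the table sync worker
    origin_lsn = 50    # apply worker's origin has not progressed
    slot = ToySlot(confirmed_flush=150)  # slot is already past the insert

    # Step 4: restart; streaming starts at 150, so the insert is skipped.
    first = apply_from(slot, origin_lsn, wal, table)
    # Feedback carrying the old origin position moves the slot backward.
    slot.on_feedback(origin_lsn)
    # Step 5: restart again; streaming now starts at 50 and resends it.
    second = apply_from(slot, origin_lsn, wal, table)
    return first, slot.confirmed_flush, second
```

In this model the first restart produces no conflict, while the second
one replays key 2 against a table that already contains it.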

The test script uses 3 injection points to control the race condition
described in the steps above. In addition, LOG_SNAPSHOT_INTERVAL_MS is
increased to prevent the restart_lsn from advancing beyond the
insert. To fix this issue, we need to prevent confirmed_flush_lsn
from moving backward. A patch implementing this fix is attached.
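
The idea of the fix can be sketched as a monotonic update. This is not
the attached patch, only a minimal illustration of the principle; the
function name confirm_received_location and the LSN values are
hypothetical.

```python
def confirm_received_location(confirmed_flush: int, reported_lsn: int) -> int:
    """Return the new confirmed_flush, ignoring older reported positions."""
    if reported_lsn > confirmed_flush:
        return reported_lsn
    # A stale position from the subscriber must not move the slot backward.
    return confirmed_flush


def replay_feedback(confirmed_flush: int, reports: list[int]) -> int:
    """Apply a sequence of feedback positions with the guard in place."""
    for lsn in reports:
        confirmed_flush = confirm_received_location(confirmed_flush, lsn)
    return confirmed_flush
```

With such a guard, feedback carrying an old origin position leaves the
slot where it is, so a later restart does not resend changes that were
already confirmed.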

With the given script, the problem reproduces on HEAD and PG17. We are
trying to reproduce the issue on PG16 and below, where injection points
are not available.

[1]: https://www.postgresql.org/message-id/OS3PR01MB5718BC899AEE1D04755C6E7594832%40OS3PR01MB5718.jpnprd01.prod.outlook.com
[2]: https://www.postgresql.org/message-id/OS0PR01MB57164AB5716AF2E477D53F6F9489A%40OS0PR01MB5716.jpnprd01.prod.outlook.com

thanks
Shveta

Attachment Content-Type Size
reproduce_data_duplicate_without_twophase.sh text/x-sh 8.4 KB
v1-0001-Injection-points-to-reproduce-the-confirmed_flush.patch application/octet-stream 4.7 KB
v1-0001-Fix-confirmed_flush-backward-movement-issue.patch application/octet-stream 1.8 KB
