| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
| Cc: | Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: tablecmds: fix bug where index rebuild loses replica identity on partitions |
| Date: | 2026-03-21 10:29:15 |
| Message-ID: | CABPTF7U_eJ6yeiGQcBPr5SVQS8wvGoo6j7b+6=_Kq3VTk3wSeg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, Mar 19, 2026 at 1:07 PM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
>
>
> > On Feb 26, 2026, at 14:59, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
> >
> >
> >
> >> On Jan 28, 2026, at 10:49, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
> >>
> >>
> >>
> >>> On Jan 27, 2026, at 16:30, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
> >>>
> >>>
> >>>
> >>>> On Jan 27, 2026, at 15:59, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Jan 27, 2026, at 15:39, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >>>>>
> >>>>> On Tue, Jan 27, 2026 at 01:13:32PM +0800, Chao Li wrote:
> >>>>>> I found this bug while working on a related patch [1].
> >>>>>>
> >>>>>> When ALTER TABLE ... ALTER COLUMN TYPE causes an index rebuild, and
> >>>>>> that index is used as REPLICA IDENTITY on a partitioned table, the
> >>>>>> replica identity marking on partitions can be silently lost after the
> >>>>>> rebuild.
> >>>>>
> >>>>> I am slightly confused by the tests included in the proposed patch.
> >>>>> On HEAD, if I undo the proposed changes of tablecmds.c, the tests
> >>>>> pass. If I run the tests of the patch with the changes of
> >>>>> tablecmds.c, the tests also pass.
> >>>>
> >>>> Oops, that isn’t supposed to be so. I’ll check the test.
> >>>>
> >>>
> >>> Okay, I see the problem is here:
> >>> ```
> >>> +CREATE UNIQUE INDEX test_replica_identity_partitioned_pkey ON test_replica_identity_partitioned (id);
> >>> ```
> >>>
> >>> I missed to add column “val” into the index, so that alter type of val didn’t cause index rebuild.
> >>>
> >>> Ideally, it’s better to also verify that index OIDs should have changed before and after alter column type, but I haven’t figured out how to do so. Do you have an idea?
> >>
> >> I just updated the test to store index OIDs before and after rebuild into 2 temp tables, so that we can compare the OIDs to verify rebuild happens and replica identity preserved.
> >>
> >> I tried to port the test to master branch, and the test failed. From the test diff file, we can see replica identity lost on 3 leaf partitions:
> >> ```
> >> @@ -360,9 +360,9 @@
> >> ORDER BY b.index_name;
> >> index_name | rebuilt | ri_lost
> >> ---------------------------------------------------+---------+---------
> >> - test_replica_identity_partitioned_p1_id_val_idx | t | f
> >> - test_replica_identity_partitioned_p2_1_id_val_idx | t | f
> >> - test_replica_identity_partitioned_p2_2_id_val_idx | t | f
> >> + test_replica_identity_partitioned_p1_id_val_idx | t | t
> >> + test_replica_identity_partitioned_p2_1_id_val_idx | t | t
> >> + test_replica_identity_partitioned_p2_2_id_val_idx | t | t
> >> test_replica_identity_partitioned_p2_id_val_idx | t | f
> >> test_replica_identity_partitioned_pkey | t | f
> >> (5 rows)
> >> ```
> >>
> >> With this patch, the test passes and all replica identity are preserved.
> >>
> >> PFA v3:
> >> * Enhanced the test.
> >> * A small change in find_partition_replica_identity_indexes(): if we will not update a partition, then unlock it.
> >>
> >> Best regards,
> >> --
> >> Chao Li (Evan)
> >> HighGo Software Co., Ltd.
> >> https://www.highgo.com/
> >>
> >>
> >>
> >>
> >> <v3-0001-tablecmds-fix-bug-where-index-rebuild-loses-repli.patch>
> >
> > The CF asked for a rebase, thus rebased as v4.
> >
Hi, I reproduced this with the test case, and the patch appears
to resolve it.
Some comments on v5:
-- Whether it makes sense to use a single list of pair structs instead
of two parallel OID lists (replicaIdentityIndexOids +
replicaIdentityTableOids) to avoid accidental desync.
-- It would be better to make lock handling in
find_partition_replica_identity_indexes() consistent
(relation_open(..., NoLock) if child is already locked, and avoid
mixed relation_close(..., lockmode)/NoLock behavior).
-- Some typos in comments/tests (partion/parition).
--
Best,
Xuneng
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Marco Nenciarini | 2026-03-21 10:37:40 | Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery |
| Previous Message | Xuneng Zhou | 2026-03-21 10:07:16 | Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery |