Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes
Date: 2026-05-09 08:36:34
Message-ID: D373B03F-4560-4168-9CFF-5CE59F0A8E8D@gmail.com
Lists: pgsql-hackers

> On May 9, 2026, at 01:47, Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:
>
> On Fri, 8 May 2026 at 09:22, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>
>> Hi,
>>
>> While testing UPDATE FOR PORTION OF, I started wondering whether REPACK supports temporal tables. In theory, it should, because temporal WITHOUT OVERLAPS indexes can be used as replica identity indexes. So I created a test script, repack_temporal.spec, which is included in the attached patch, and it failed.
>>
>> I found that REPACK hard-codes BTEqualStrategyNumber when calling get_opfamily_member(). That seems wrong, because build_replindex_scan_key() uses IndexAmTranslateCompareType() to get the equality strategy for COMPARE_EQ.
>>
>> After fixing the hard-coded BTEqualStrategyNumber, the temporal test passed. Then I added another test for multirange, repack_temporal_multirange.spec, which also failed. The reason is that find_target_tuple() uses the identity index to find the first tuple and returns it directly, but a lossy index scan may return false positives and require recheck.
>>
>> Please see the attached patch for the fix details and test scripts.
>>
>> Best regards,
>> --
>> Chao Li (Evan)
>> HighGo Software Co., Ltd.
>> https://www.highgo.com/
>>
>
> your analysis appears correct to me

Hi Kirill, thanks for your review.

>
>> + while (index_getnext_slot(scan, ForwardScanDirection, retrieved))
>> + {
>> + if (scan->xs_recheck && !identity_key_equal(chgcxt, locator, retrieved))
>> + continue;
>> +
>> + retval = true;
>> + break;
>> + }
>
> Should we add CFI() ?
>

Oh, I didn’t consider that at all, because I expected only a few candidate rows to need a recheck. I’m fine with adding it.
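
To illustrate the loop shape under discussion, here is a minimal standalone C model. It is not PostgreSQL code: `Candidate`, `check_for_interrupts()`, `key_equal()` and `find_target()` are hypothetical stand-ins for the scan slot, CHECK_FOR_INTERRUPTS(), identity_key_equal() and find_target_tuple(). One deliberate simplification is that the model returns a sentinel on interrupt, whereas the real macro raises an error instead of returning.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one tuple returned by a possibly lossy index scan. */
typedef struct Candidate
{
	int		key_lo;		/* lower bound of the stored identity key */
	int		key_hi;		/* upper bound of the stored identity key */
	bool	recheck;	/* models scan->xs_recheck for this tuple */
} Candidate;

/* Set asynchronously (e.g. by a signal handler) to request cancellation. */
static volatile sig_atomic_t interrupt_pending = 0;

/* Counts how often the loop polled for interrupts; for illustration only. */
static int	interrupt_checks = 0;

/* Stand-in for CHECK_FOR_INTERRUPTS(): reports instead of raising an error. */
static bool
check_for_interrupts(void)
{
	interrupt_checks++;
	return interrupt_pending != 0;
}

/* Stand-in for identity_key_equal(): exact comparison of the key columns. */
static bool
key_equal(const Candidate *c, int lo, int hi)
{
	return c->key_lo == lo && c->key_hi == hi;
}

/*
 * Return the position of the first candidate that really matches (lo, hi),
 * -1 if no candidate matches, or -2 if the search was interrupted.
 */
static int
find_target(const Candidate *cands, size_t n, int lo, int hi)
{
	assert(cands != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
	{
		/* Poll once per candidate so that a long run of false positives stays cancellable. */
		if (check_for_interrupts())
			return -2;

		/* A lossy scan may return false positives: skip those failing the recheck. */
		if (cands[i].recheck && !key_equal(&cands[i], lo, hi))
			continue;

		return (int) i;
	}
	return -1;
}
```

In this model the poll costs one flag test per candidate, so it is negligible when few rows need a recheck and only matters in the unexpected case where many do.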

>
> Also, do we really need isolation tests and inj points here?

I think so. Without the injection point, the first phase, which copies the data into a new heap, would finish very quickly, so it would be hard to run an update from the second session while it is in progress. I think that’s why the repack code intentionally added an injection point before the first round of replay:

```
/*
 * During testing, wait for another backend to perform concurrent data
 * changes which we will process below.
 */
INJECTION_POINT("repack-concurrently-before-lock", NULL);
```

> Doesn't a
> simple regression test for REPACK execute the same code?
>

It seems we intentionally avoid running REPACK tests in the regular regression tests; see [1] and [2].

Please find attached v2, which adds the CFI as Kirill suggested.

[1] https://postgr.es/m/769631.1777575242@sss.pgh.pa.us
[2] https://git.postgresql.org/cgit/postgresql.git/commit/?id=2fd787d0aac1cb00a42ebce92ebb1d7534035ee3

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Attachment: v2-0001-Fix-REPACK-with-WITHOUT-OVERLAPS-replica-identity.patch (application/octet-stream, 16.4 KB)
