| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | Kirill Reshke <reshkekirill(at)gmail(dot)com> |
| Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes |
| Date: | 2026-05-09 08:36:34 |
| Message-ID: | D373B03F-4560-4168-9CFF-5CE59F0A8E8D@gmail.com |
| Lists: | pgsql-hackers |
> On May 9, 2026, at 01:47, Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:
>
> On Fri, 8 May 2026 at 09:22, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>
>> Hi,
>>
>> While testing UPDATE FOR PORTION OF, I started wondering whether REPACK supports temporal tables. In theory, it should, because temporal WITHOUT OVERLAPS indexes can be used as replica identity indexes. So I created a test script, repack_temporal.spec, which is included in the attached patch, and it failed.
>>
>> I found that REPACK hard-codes BTEqualStrategyNumber when calling get_opfamily_member(). That seems wrong, because build_replindex_scan_key() uses IndexAmTranslateCompareType() to get the equality strategy for COMPARE_EQ.
>>
>> After fixing the hard-coded BTEqualStrategyNumber, the temporal test passed. Then I added another test for multirange, repack_temporal_multirange.spec, which also failed. The reason is that find_target_tuple() uses the identity index to find the first tuple and returns it directly, but a lossy index scan may return false positives and require recheck.
>>
>> Please see the attached patch for the fix details and test scripts.
>>
>> Best regards,
>> --
>> Chao Li (Evan)
>> HighGo Software Co., Ltd.
>> https://www.highgo.com/
>>
>
> your analysis appears correct to me
Hi Kirill, thanks for your review.
>
>> + while (index_getnext_slot(scan, ForwardScanDirection, retrieved))
>> + {
>> + if (scan->xs_recheck && !identity_key_equal(chgcxt, locator, retrieved))
>> + continue;
>> +
>> + retval = true;
>> + break;
>> + }
>
> Should we add CFI() ?
>
Oh, I hadn’t considered that, because I expected only a few candidate rows to need a recheck. I’m fine with adding it.
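To make the shape of the loop concrete, here is a minimal standalone sketch of "skip lossy false positives, and check for interrupts on every candidate". It is not the patch code and does not use backend headers: Candidate, check_for_interrupts() and find_target() are hypothetical stand-ins for the scanned slot, CHECK_FOR_INTERRUPTS() and find_target_tuple(), and the integer comparison stands in for identity_key_equal().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one tuple returned by the index scan. */
typedef struct Candidate
{
    int  key;      /* identity key value stored in the tuple */
    bool recheck;  /* true if the index match is lossy (like scan->xs_recheck) */
} Candidate;

/* Counts calls so the behaviour can be observed outside the backend. */
static int interrupt_checks = 0;

/* Hypothetical stand-in for CHECK_FOR_INTERRUPTS(). */
static void
check_for_interrupts(void)
{
    interrupt_checks++;
}

/*
 * Return the position of the first candidate that really matches "wanted",
 * or -1 if there is none. A candidate flagged for recheck is accepted only
 * if its key compares equal; an exact (non-lossy) match is accepted as is.
 */
static int
find_target(const Candidate *cands, size_t n, int wanted)
{
    assert(cands != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        /* Stay cancellable even if many false positives must be skipped. */
        check_for_interrupts();

        if (cands[i].recheck && cands[i].key != wanted)
            continue;           /* lossy false positive, keep scanning */

        return (int) i;
    }

    return -1;
}
```

In this toy model, a scan that returns a lossy mismatch first and the real row second skips the first candidate and performs one interrupt check per candidate visited. Where exactly the call sits in the real loop is a detail of the patch, not of this sketch.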
>
> Also, do we really need isolation tests and inj points here?
I think so. Without the injection point, the first phase, copying into the new heap, would finish very quickly, so it would be hard to get an update from the second session in before it completes. I think that’s why the REPACK code intentionally has an injection point before the first round of replay:
```
/*
* During testing, wait for another backend to perform concurrent data
* changes which we will process below.
*/
INJECTION_POINT("repack-concurrently-before-lock", NULL);
```
> Doesn't a
> simple regression test for REPACK execute the same code?
>
It seems we intentionally avoid running REPACK tests in the main regression suite; see [1] and [2].
PFA v2, which adds the CHECK_FOR_INTERRUPTS() call as Kirill suggested.
[1] https://postgr.es/m/769631.1777575242@sss.pgh.pa.us
[2] https://git.postgresql.org/cgit/postgresql.git/commit/?id=2fd787d0aac1cb00a42ebce92ebb1d7534035ee3
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-Fix-REPACK-with-WITHOUT-OVERLAPS-replica-identity.patch | application/octet-stream | 16.4 KB |