repack: fix a bug to reject deferrable primary key fallback for concurrent mode

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: repack: fix a bug to reject deferrable primary key fallback for concurrent mode
Date: 2026-04-17 03:35:14
Message-ID: 10DD5E13-B45D-44F1-BE08-C63E00ABCAC0@gmail.com
Lists: pgsql-hackers

Hi,

I am continuing to test REPACK, and I found another issue.

In check_concurrent_repack_requirements(), if a table has no replica identity index, the code falls back to using the primary key if one exists. The problem is that a deferrable primary key cannot be used for this purpose. WAL generation does not consider a deferrable primary key to be a replica identity, so concurrent mode may not receive enough old tuple information to replay concurrent changes.

I tested this with the following procedure.

1 - Create a table
```
create table t (id int, v text, primary key (id) deferrable initially deferred);
insert into t values (1, 'a');
```

2 - Attach a debugger to session 1's backend process. I used vscode. Add a breakpoint at the first process_concurrent_changes() call. This blocks the REPACK process and gives session 2 time to issue a DELETE.

3 - In session 1, issue a repack. It will stop at the breakpoint.
```
repack (concurrently) t;
```

4 - In session 2
```
delete from t where id=1;
```

5 - Detach session 1 from the debugger, so that repack continues and tries to re-apply the delete from session 2.

6 - repack fails with:
```
evantest=# repack (concurrently) t;
ERROR: incomplete delete info
CONTEXT: slot "repack_96468", output plugin "pgrepack", in the change callback, associated LSN 0/2A5717F0
REPACK decoding worker
```

The error comes from this code in pgrepack.c:
```
case REORDER_BUFFER_CHANGE_DELETE:
    {
        HeapTuple oldtuple;

        oldtuple = change->data.tp.oldtuple;

        if (oldtuple == NULL)
            elog(ERROR, "incomplete delete info");

        repack_store_change(ctx, relation, CHANGE_DELETE, oldtuple);
    }
    break;
```

The root cause is that repack.c assumes rel->rd_pkindex is usable as an identity index, but logical decoding does not treat a deferrable primary key as a replica identity. As a result, the decoding worker may not get the old tuple needed to re-apply the delete.
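
To make the mismatch concrete, here is a small standalone C model of the two decisions. It is only a sketch and not PostgreSQL code: the ModelRel struct and the model_* functions are names made up for illustration, and the model assumes the default replica identity setting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a table's identity-related state. */
typedef struct ModelRel
{
    bool has_replident_index;   /* an explicit replica identity index exists */
    bool has_pk;                /* the table has a primary key */
    bool pk_deferrable;         /* that primary key is DEFERRABLE */
} ModelRel;

/*
 * WAL side (model): the old key is logged for UPDATE/DELETE only when there
 * is a replica identity index or a non-deferrable primary key.
 */
static bool
model_wal_logs_old_key(const ModelRel *rel)
{
    if (rel->has_replident_index)
        return true;
    return rel->has_pk && !rel->pk_deferrable;
}

/*
 * REPACK side before the fix (model): any primary key is accepted as the
 * fallback identity, deferrable or not.
 */
static bool
model_repack_accepts_unpatched(const ModelRel *rel)
{
    return rel->has_replident_index || rel->has_pk;
}

/* The bug: REPACK accepts the table, but no old key will be decoded. */
static bool
model_is_mismatch(const ModelRel *rel)
{
    return model_repack_accepts_unpatched(rel) && !model_wal_logs_old_key(rel);
}
```

In this model, the table t from the test (a deferrable primary key and no replica identity index) is the only combination where model_is_mismatch() returns true, which corresponds to the "incomplete delete info" failure.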

To fix the problem, I think we should simply not fall back to a deferrable primary key in the first place. See the attached patch.

With this patch, repack fails quickly for the test:
```
evantest=# repack (concurrently) t;
ERROR: cannot process relation "t"
HINT: Relation "t" has a deferrable primary key.
```
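
The intended behavior can be sketched as a standalone C function. This is a minimal illustration under my own assumptions, not the code in the attached patch: the enum, its values, and sketch_check_identity() are hypothetical names, and only the deferrable-primary-key outcome corresponds to the HINT shown above.

```c
#include <stdbool.h>

/* Hypothetical result codes for the identity check (illustration only). */
typedef enum SketchIdentityResult
{
    SKETCH_IDENTITY_OK,             /* a usable identity index exists */
    SKETCH_IDENTITY_MISSING,        /* no identity index and no primary key */
    SKETCH_IDENTITY_DEFERRABLE_PK   /* only a deferrable primary key exists */
} SketchIdentityResult;

/*
 * Sketch of the check with the fix applied: prefer the replica identity
 * index, fall back to the primary key only if it is not deferrable, and
 * otherwise reject the relation up front.
 */
static SketchIdentityResult
sketch_check_identity(bool has_replident_index, bool has_pk,
                      bool pk_deferrable)
{
    if (has_replident_index)
        return SKETCH_IDENTITY_OK;
    if (!has_pk)
        return SKETCH_IDENTITY_MISSING;
    if (pk_deferrable)
        return SKETCH_IDENTITY_DEFERRABLE_PK;   /* reject before decoding starts */
    return SKETCH_IDENTITY_OK;
}
```

Rejecting the relation at this point means the user gets an immediate, explainable error instead of a failure in the decoding worker after concurrent changes have already been made.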

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Attachment Content-Type Size
v1-0001-Reject-deferrable-primary-key-fallback-in-REPACK-.patch application/octet-stream 2.1 KB
