| From: | Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com> |
|---|---|
| To: | exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: BUG #19519: REPACK can fail due to missing chunk for toast value |
| Date: | 2026-06-17 17:03:21 |
| Message-ID: | CAFC+b6rLNvDzzYxrfLdq7CWbbouHcuOxi-mjEOW1r3qy2uAPMQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-bugs |
Hi Alexander,
On Sun, Jun 14, 2026 at 10:37 PM PG Bug reporting form <
noreply(at)postgresql(dot)org> wrote:
> The following bug has been logged on the website:
>
> The following script:
> createdb db1
> createdb db2
>
> cat << EOF | psql db1
> SET default_statistics_target = 1000;
> CREATE TABLE t(i int, t text);
> INSERT INTO t SELECT 1, g::text FROM generate_series(1, 50000) g;
> ANALYZE t;
> EOF
>
> cat << EOF | psql db2 &
> BEGIN;
> CREATE TABLE t(i int);
> SELECT pg_sleep(3);
> EOF
>
> sleep 1
>
> cat << EOF | psql db1
> DROP TABLE t;
> VACUUM pg_toast.pg_toast_2619;
> REPACK;
> EOF
> wait
>
> triggers:
> ERROR: missing chunk number 0 for toast value 16393 in pg_toast_2619
>
Thanks for the report and clean reproducer , I can also reproduce this.
I looked into this. It is not REPACK-specific, it reproduces with
plain VACUUM FULL on back branches too, so it predates
the REPACK command. AFAIU The root cause is a disagreement about
the removal horizon between a table rewrite and an independent vacuum
of that table's TOAST relation.
A rewrite (cluster_rel -> copy_table_data, copy_heap_data) preserves live
and RECENTLY_DEAD tuples, and detoasts a preserved tuple's external
values to move them into the new TOAST table.Its removal cutoff comes
from GetOldestNonRemovableTransactionId(), i.e. ComputeXidHorizons(),
which includes the rewriting backend's own snapshot xmin.
The rewrite holds an ordinary MVCC snapshot (taken so index expressions
can be evaluated) and deliberately does not set PROC_IN_VACUUM. So
The rewrite backend's snapshot xmin is the cluster-wide oldest
running xid, which the db2 transaction pins. Since that backend is
in db1 and lacks PROC_IN_VACUUM, its own xmin folds into the
relation's data/catalog horizon -> OldestXmin sits behind the
DROP, so the dead pg_statistic tuple is RECENTLY_DEAD and gets
copied.
The lazy vacuum of pg_toast_2619 sets PROC_IN_VACUUM, which excludes
its own xmin from the horizon, and the db2 transaction, being in another
database,
does not constrain a non-shared relation's horizon. Its cutoff advances
past the DROP
and the chunks are removed.The two operations thus disagree about the same
tuple,
and the rewrite ends up fetching TOAST the lazy vacuum already reclaimed.
Attached Fix:
The rewrite is simply too conservative,AFAIK it's snapshot is only there to
evaluate index expressions against other relations, and it is not a
reader of the rewritten relation's historical rows, so its own xmin
should not hold back that relation's removal horizon. Excluding it
yields exactly the horizon the lazy vacuum uses.
We cannot set PROC_IN_VACUUM on the rewrite, that would broadcast
"ignore my xmin" to every backend and let other vacuums remove tuples in
other relations that the index expressions may need (the documented
reason VACUUM FULL does not set it). The exclusion has to be local to
the rewrite's own horizon computation, so the patch adds
GetOldestNonRemovableTransactionIdForRewrite(): ComputeXidHorizons()
gains a flag to skip the calling backend (and to not publish the
resulting, more-aggressive horizons into the shared GlobalVisState), and
copy_table_data() uses it for OldestXmin.
--
Thanks :)
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Fix-missing-chunk-errors-during-heap-rewrites-by-ign.patch | application/octet-stream | 9.4 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrey Borodin | 2026-06-17 17:39:04 | Re: Possible G2-item at SERIALIZABLE |
| Previous Message | Сергей Дм. Апойченко | 2026-06-17 16:49:22 | Re: BUG #19521: After a minor PostgreSQL update from 14.22 to 14.23, the database goes into an infinite loop. |