| From: | Antonin Houska <ah(at)cybertec(dot)at> |
|---|---|
| To: | Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com> |
| Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net> |
| Subject: | Re: Adding REPACK [concurrently] |
| Date: | 2026-02-06 16:29:58 |
| Message-ID: | 27597.1770395398@localhost |
| Lists: | pgsql-hackers |
Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com> wrote:
> > I think it *is* related. My earlier patch version, which used the
> > PROC_IN_VACUUM flag improperly [1] was also causing visibility issues. Please
> > let me know if you manage to reproduce the issue with v32.
>
> Will try. Just to highlight - first error happened on v31 *without* PROC_IN_REPACK.
I spent some time running the test with your branch (based on v32 as you told
me off-list), but couldn't reproduce the problem.
The related code is in heap_inplace_lock():
    /* no known way this can happen */
    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
             errmsg_internal("attempted to overwrite invisible tuple")));
The only path by which REPACK (CONCURRENTLY) performs an in-place update seems to be:
cluster.c:build_new_indexes() -> index_create_copy() -> index_create() ->
index_build() -> index_update_stats() -> systable_inplace_update_begin()
However, I have no idea how this can be related to REPACK. Since the new
index is not visible to other transactions until REPACK is done, VACUUM should
be the only process able to change the tuple before
heap_inplace_lock(). Indeed, the server log seems to indicate a relationship to
VACUUM:
2026-02-01 16:44:58.878 UTC autovacuum worker[22589] LOG: automatic vacuum of table "postgres.pg_catalog.pg_class": index scans: 1
...
2026-02-01 16:44:58.884 UTC client backend[12737] 008_repack_concurrently.pl LOG: statement: COMMIT;
2026-02-01 16:44:58.884 UTC client backend[12737] 008_repack_concurrently.pl LOG: statement: SELECT pg_try_advisory_lock(42)::integer AS gotlock
2026-02-01 16:44:58.884 UTC client backend[12737] 008_repack_concurrently.pl LOG: statement: SELECT pg_advisory_lock(43);
2026-02-01 16:44:58.884 UTC client backend[12737] 008_repack_concurrently.pl LOG: statement: BEGIN;
2026-02-01 16:44:58.884 UTC client backend[12737] 008_repack_concurrently.pl LOG: statement: INSERT INTO tbl(j) VALUES (nextval('last_j')) RETURNING j
2026-02-01 16:44:58.885 UTC client backend[12727] 008_repack_concurrently.pl LOG: statement: SELECT COUNT(*) AS count FROM tbl WHERE j <= 14148
2026-02-01 16:44:58.885 UTC client backend[12734] 008_repack_concurrently.pl LOG: statement: SELECT COUNT(*) AS count FROM tbl WHERE j <= 14145
2026-02-01 16:44:58.885 UTC client backend[12737] 008_repack_concurrently.pl LOG: statement: COMMIT;
2026-02-01 16:44:58.885 UTC client backend[12737] 008_repack_concurrently.pl LOG: statement: SELECT pg_advisory_unlock(43);
2026-02-01 16:44:58.887 UTC client backend[12737] 008_repack_concurrently.pl LOG: statement: BEGIN
--TRANSACTION ISOLATION LEVEL REPEATABLE READ
;
2026-02-01 16:44:58.887 UTC client backend[12737] 008_repack_concurrently.pl LOG: statement: SELECT 1;
2026-02-01 16:44:58.891 UTC REPACK decoding worker[22621] FATAL: terminating background worker "REPACK decoding worker" due to administrator command
2026-02-01 16:44:58.896 UTC client backend[12740] 008_repack_concurrently.pl LOG: statement: SELECT COUNT(*) AS count FROM tbl WHERE j <= 14146
2026-02-01 16:44:58.896 UTC client backend[12722] 008_repack_concurrently.pl ERROR: attempted to overwrite invisible tuple
2026-02-01 16:44:58.896 UTC client backend[12722] 008_repack_concurrently.pl STATEMENT: REPACK (CONCURRENTLY) tbl USING INDEX tbl_pkey;
However, VACUUM should not touch the tuple because the scan in
systable_inplace_update_begin() should leave the containing buffer pinned. I
wonder if you managed to hit another existing bug.
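The pin argument above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: ToyBuffer and the toy_* functions are made-up names, and the model assumes the usual rule that a cleanup lock (which VACUUM needs before it can remove tuples from a page) is granted only when the requester holds the sole pin on the buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer header; NOT PostgreSQL's BufferDesc. */
typedef struct ToyBuffer
{
    int pin_count;      /* number of pins currently held on the buffer */
} ToyBuffer;

/* Take one pin on the buffer. */
static void
toy_pin(ToyBuffer *buf)
{
    buf->pin_count++;
}

/* Release one pin; releasing a pin that was never taken is a bug. */
static void
toy_unpin(ToyBuffer *buf)
{
    assert(buf->pin_count > 0);
    buf->pin_count--;
}

/*
 * Toy analogue of a conditional cleanup lock. The caller must already hold
 * its own pin; the lock is granted only if that pin is the only one.
 */
static bool
toy_try_cleanup_lock(const ToyBuffer *buf)
{
    return buf->pin_count == 1;
}
```

In this model, if the in-place updater calls toy_pin() first and the vacuum side then pins the same buffer, toy_try_cleanup_lock() returns false until the updater calls toy_unpin(). That is the behavior the paragraph above relies on; the sketch says nothing about whether the real code path actually keeps the pin in the failing scenario.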
--
Antonin Houska
Web: https://www.cybertec-postgresql.com