| From: | Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com> |
|---|---|
| To: | Ewan Young <kdbase(dot)hack(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, ah(at)cybertec(dot)at, mihailnikalayeu(at)gmail(dot)com, alvherre(at)kurilemu(dot)de |
| Subject: | Re: REPACK CONCURRENTLY fails on tables with generated columns |
| Date: | 2026-06-14 03:56:48 |
| Message-ID: | CAFC+b6pbdKLp_NP2MZaRFxmKmZK88janhpsYmkW4wY+XpwOOrw@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Ewan,
On Fri, Jun 12, 2026 at 2:10 PM Ewan Young <kdbase(dot)hack(at)gmail(dot)com> wrote:
> Hi,
>
> REPACK (CONCURRENTLY) aborts with an internal error on any table that has
> a STORED generated column, if a concurrent UPDATE that requires index
> maintenance is applied during the catch-up phase:
>
> ERROR: no generation expression found for column number 3 of table "pg_temp_16396"
>
> Plain (non-concurrent) REPACK on such a table works fine, and so does
> REPACK (CONCURRENTLY) as long as no qualifying concurrent change is
> applied -- so the problem is specific to the concurrent-change apply path.
>
> The attached patch adds an isolation test, but here is the manual
> sequence (server built with --enable-injection-points):
>
> CREATE EXTENSION injection_points;
> CREATE TABLE t (i int PRIMARY KEY, v int,
> g int GENERATED ALWAYS AS (v * 10) STORED);
> CREATE INDEX ON t (v); -- makes UPDATE of v non-HOT
> INSERT INTO t(i, v) VALUES (1, 1);
>
> -- session 1:
> SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
> REPACK (CONCURRENTLY) t; -- blocks at the injection point
>
> -- session 2, once session 1 is waiting:
> UPDATE t SET v = v + 1 WHERE i = 1;
> SELECT injection_points_wakeup('repack-concurrently-before-lock');
>
> -- session 1 then fails with the ERROR above.
> Without injection points this is a race: the concurrent UPDATE has to be
> decoded and applied during catch-up, and it has to be a non-HOT update
> (one that goes through index maintenance). It is reliably hit on a busy
> table with a generated column.
>
I was able to reproduce this.
>
> The transient heap built by make_new_heap() is intentionally created
> without the old table's defaults and constraints, so it has no generation
> expressions for its generated columns, even though the tuple descriptor
> still has attgenerated set.
>
> When apply_concurrent_update() replays a non-HOT update, it calls
> ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass
> the "indexUnchanged" hint, that calls index_unchanged_by_update() ->
> ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the
> generation expression of each generated column via build_column_default()
> and errors out when it finds none on the transient heap.
>
> The apply path does not need to recompute generated columns at all: the
> decoded tuple already carries the correct value, and it is only inserted.
> Note also that ExecGetUpdatedCols() already returns an empty set for this
> ResultRelInfo, because it is not part of any range table -- so the
> indexUnchanged determination here is already approximate.
>
Makes sense. I have reviewed the patch, and it LGTM.
--
Thanks :)
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/