| From: | Antonin Houska <ah(at)cybertec(dot)at> |
|---|---|
| To: | Ewan Young <kdbase(dot)hack(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, mihailnikalayeu(at)gmail(dot)com, alvherre(at)kurilemu(dot)de |
| Subject: | Re: REPACK CONCURRENTLY fails on tables with generated columns |
| Date: | 2026-07-01 13:57:02 |
| Message-ID: | 60966.1782914222@localhost |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Ewan Young <kdbase(dot)hack(at)gmail(dot)com> wrote:
> On Mon, Jun 22, 2026 at 7:12 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >
> > Ewan Young <kdbase(dot)hack(at)gmail(dot)com> wrote:
> >
> > > The transient heap built by make_new_heap() is intentionally created
> > > without the old table's defaults and constraints, so it has no generation
> > > expressions for its generated columns, even though the tuple descriptor
> > > still has attgenerated set.
> > >
> > > When apply_concurrent_update() replays a non-HOT update, it calls
> > > ExecInsertIndexTuples() with EIIT_IS_UPDATE. To decide whether to pass
> > > the "indexUnchanged" hint, that calls index_unchanged_by_update() ->
> > > ExecGetExtraUpdatedCols() -> ExecInitGenerated(), which looks up the
> > > generation expression of each generated column via build_column_default()
> > > and errors out when it finds none on the transient heap.
> > >
> > > The apply path does not need to recompute generated columns at all: the
> > > decoded tuple already carries the correct value, and it is only inserted.
> > > Note also that ExecGetUpdatedCols() already returns an empty set for this
> > > ResultRelInfo, because it is not part of any range table -- so the
> > > indexUnchanged determination here is already approximate.
> >
> > I'm sorry for the confusion, but the fact that ExecGetUpdatedCols() returns an
> > empty set is an omission rather than deliberate choice. Assuming we fix that,
> > the result of ExecGetExtraUpdatedCols() does matter. Thus we should copy the
> > related pg_attrdef entries, as I suggest in this patch.
> >
> > Another question is how serious problem it is that ExecGetUpdatedCols()
> > returns empty set. AFAICS, "indexUnchanged" does not affect correctness - it's
> > is only a hint that helps the btree AM decide whent the bottom-up deletion and
> > de-duplication techniques should (not) be used. I'm not sure it's easy to
> > update the set for individual UPDATEs: the UPDATE commands REPACK replays
> > originate from different SQL queries and the logical decoding does not
> > transfer this information.
> >
> > Even then, I think it'd be "less bad" to have ExecGetUpdatedCols() return a
> > set containing all the attributes rather than empty set. That is, avoid using
> > the btree optimizations altogether rather than allow them them when not
> > appropriate. However, per index_unchanged_by_update(), if ExecGetUpdatedCols()
> > tells that all columns are updated, the result of ExecGetExtraUpdatedCols()
> > does not matter. Nevertheless, I'd still slightly prefer copying the
> > pg_attrdef entries to hacking the executor.
>
> Agreed, thanks for the correction. Relying on the empty ExecGetUpdatedCols()
> set was the weak point of my version -- it's an omission, not something to
> build a second approximation on. Copying the pg_attrdef entries fixes the
> inconsistency at the root, so let's go with your approach.
>
> I applied the patch and ran it through an injection-point reproducer
> (cassert). Without the fix the bug reproduces (ERROR: no generation
> expression found for column number 3 ...); with it, REPACK CONCURRENTLY
> succeeds under a concurrent non-HOT UPDATE for a STORED generated column, an
> index directly on the generated column, and a VIRTUAL column, with correct
> values afterwards. Your repack.spec change passes.
>
> The approach is right and I've confirmed it fixes the bug, so +1 from me in
> this direction.
Thanks for checking. Here, 0002 tries to fix the problem of empty updatedCols,
as I proposed above.
0001 fixes two minor coding issues that I found when writing 0002.
--
Antonin Houska
Web: https://www.cybertec-postgresql.com
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-Minor-cleanup-in-initialize_change_context.patch | text/x-diff | 1.1 KB |
| 0002-Provide-the-executor-with-information-on-updated-col.patch | text/x-diff | 3.6 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Akshay Joshi | 2026-07-01 14:12:31 | Re: [PATCH] Add pg_get_table_ddl() to reconstruct CREATE TABLE statements |
| Previous Message | jian he | 2026-07-01 13:44:37 | Re: implement CAST(expr AS type FORMAT 'template') |