| From: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | "Anton A(dot) Melnikov" <a(dot)melnikov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org> |
| Subject: | Re: [BUG] Excessive memory usage with update on STORED generated columns. |
| Date: | 2026-03-30 20:14:25 |
| Message-ID: | CAEze2WhrYY6J=0X54EQbTH0POWPTSnWeAwNJd01vcoWoXhtiig@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Mon, 30 Mar 2026 at 21:14, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> writes:
> > The actual issue is that ExecComputeStoredGenerated uses
> > ri_GeneratedExprsU's NULL-ness to check whether the generated columns'
> > expressions have been initialized, whilst for UPDATE ResultRelInfos
> > the initialized-indicator is stored in ri_extraUpdatedCols_valid.
>
> Sorry, I missed that you'd already responded.
That's allright.
> I don't like using ri_extraUpdatedCols_valid here: it requires callers
> to know more than they should about how ExecInitGenerated works, and
> it does not fix the comparable performance problem that probably
> exists in the INSERT path.
I'm not sure which comparable performance problem you're referring to;
I don't see one mentioned, and INSERT doesn't have the same issue
because we never call into ExecInitGenerated for inserts unless 1.)
there are any generated stored columns, and 2.) it hasn't been called
already for this ResultRelInfo (*) by checking
nonnull-after-initialization ri_GeneratedExprsI.
(*) the issue here, of course, being that we *do* call
ExecInitGenerated many times in the same query for the same RRI when
UPDATE only changes columns that aren't referenced in generated
columns, this caused due to an incorrect check which checks the wrong
field.
> I think the right fix is to have three
> booleans specifically reflecting the validity of ri_GeneratedExprsU,
> ri_GeneratedExprsI, and ri_extraUpdatedCols.
I'll defer to Peter as primary author of this code, but personally I
think that it isn't needed:
In the insert case (ri_GeneratedExprsI) the field is always non-null
once the generated columns' exprstates are initialized, whilst in the
update case the current boolean is indicative of the fields having
been populated. Yes, it might benefit from better naming, but the
boolean itself is already sufficient to indicate that you can rely on
the update-related fields to be populated by ExecInitGenerated.
> There's at least three bytes free after ri_extraUpdatedCols_valid,
> so we could cram two more bools in there without an ABI break.
> Admittedly, it's not pretty that those bools wouldn't be close
> to the fields they describe. But we could improve that in HEAD
> with some more-substantial reordering of the contents of
> ResultRelInfo; it's not like the current ordering has much rhyme
> or reason to it.
I personally try to avoid adding new fields in backbranches if that's
possible (even in alignment gaps), so that if we actually must add
data to the struct we still have space to pick from.
Kind regards,
Matthias van de Meent
Databricks (https://www.databricks.com)
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Heikki Linnakangas | 2026-03-30 20:15:44 | Re: Better shared data structure management and resizable shared data structures |
| Previous Message | Corey Huinker | 2026-03-30 20:05:43 | Re: Import Statistics in postgres_fdw before resorting to sampling. |