| From: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
|---|---|
| To: | "Anton A(dot) Melnikov" <a(dot)melnikov(at)postgrespro(dot)ru> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: [BUG] Excessive memory usage with update on STORED generated columns. |
| Date: | 2026-03-30 15:26:35 |
| Message-ID: | CAEze2Wh+_C8LtmiMRb58df=A1PrBVmMnYMOfbBUk9c=m99CN+w@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Mon, 30 Mar 2026 at 16:25, Anton A. Melnikov
<a(dot)melnikov(at)postgrespro(dot)ru> wrote:
>
> Hi!
>
>
> My colleagues found that a queries like that:
[...]
> lead to excessive memory consumption up to 10Gb in this example and
> query execution time up to ~1,5min.
>
> Bisect shows that the problem appeared after commit 83ea6c540
> (Virtual generated columns).
Yep, that looks about accurate. Thanks for the report and initial triage!
> This indicates that generated expressions are reparsed multiple times,
> once per row to be updated instead of being reused.
[...]
> I would like to propose a fix that add a caching of the the parsed
> expression trees (Node *) in ResultRelInfo, so that build_column_default()
> and stringToNode() are executed at most once per attribute per query.
>
> With this fix, the query execution time
> and memory consumption return to normal:
I think that fixes the wrong issue. Specifically, ExecInitGenerated
mentions that it expects to be called just once per ResultRelInfo, and
your code adds code to allow it to be called more than once without
causing more problems. While it does solve the complaint, it's still
executing more work than it should. Additionally, though it's not a
failing of the patch per se, it changes the layout of an existing
struct, and so it isn't backportable.
The actual issue is that ExecComputeStoredGenerated uses
ri_GeneratedExprsU's NULL-ness to check whether the generated columns'
expressions have been initialized, whilst for UPDATE ResultRelInfos
the initialized-indicator is stored in ri_extraUpdatedCols_valid.
So, I think the attached patch is a more appropriate fix, it avoids
calling into ExecInitGenerated at all when no column included in
expressions was updated. It also adds an assertion that the function
isn't called again once the field has been initialized. It also has
the benefit of being backportable.
Kind regards,
Matthias van de Meent
Databricks (https://www.databricks.com)
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-nodeModifyTable-fix-generated-tables-memory-leak-.patch | application/octet-stream | 1.4 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Heikki Linnakangas | 2026-03-30 15:28:58 | Re: Shared hash table allocations |
| Previous Message | Ashutosh Bapat | 2026-03-30 15:21:40 | Re: [PATCH] Report column-level error when lacking privilege |