From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Álvaro Herrera <alvherre(at)kurilemu(dot)de>, James Hunter <james(dot)hunter(dot)pg(at)gmail(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators
Date: 2025-08-05 17:40:22
Message-ID: 9800a3044cf2f2d948def3e5b68f8eda61a82d22.camel@j-davis.com
Lists: pgsql-hackers
On Tue, 2025-08-05 at 14:15 +0200, Álvaro Herrera wrote:
> Here's a rebased version of this patch. I didn't review it or touch
> it in any way, just fixed conflicts from current master.
James,
Patch 0001 is doing too much.
For a first step, I think it would be useful to create a new field for
each Plan node, and use that to enforce the execution-time memory
limit. There's a related discussion here about prepared statements:
https://www.postgresql.org/message-id/83fbc36b66077e6ed0ad3a1c18fff3a7d2b22d36.camel@j-davis.com
so we may need to force replans if work_mem changes.
That first step wouldn't affect how memory usage is enforced (aside
from the prepared statement issue), because that field would just be a
copy of work_mem anyway. But once it's in place, extensions could
experiment by tweaking the work memory of individual plan nodes with a
planner_hook.
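
As a minimal sketch of what such an extension might look like (note:
"plan_work_mem" is the hypothetical new Plan field proposed above, and
halving work_mem is just a toy policy to show the mechanism):

/* workmem_tweak.c -- illustrative sketch only */
#include "postgres.h"
#include "fmgr.h"
#include "miscadmin.h"
#include "optimizer/planner.h"

PG_MODULE_MAGIC;

static planner_hook_type prev_planner_hook = NULL;

/* Recursively set the hypothetical per-node work-memory limit. */
static void
set_plan_work_mem(Plan *plan, int limit_kb)
{
    if (plan == NULL)
        return;

    plan->plan_work_mem = limit_kb;     /* the proposed new field */

    set_plan_work_mem(plan->lefttree, limit_kb);
    set_plan_work_mem(plan->righttree, limit_kb);

    /*
     * A real implementation would also need to recurse into initPlans
     * and node-specific children (Append, SubqueryScan, etc.).
     */
}

static PlannedStmt *
workmem_planner(Query *parse, const char *query_string,
                int cursorOptions, ParamListInfo boundParams)
{
    PlannedStmt *result;

    if (prev_planner_hook)
        result = prev_planner_hook(parse, query_string,
                                   cursorOptions, boundParams);
    else
        result = standard_planner(parse, query_string,
                                  cursorOptions, boundParams);

    /* Toy policy: give every node half of work_mem. */
    set_plan_work_mem(result->planTree, work_mem / 2);

    return result;
}

void
_PG_init(void)
{
    prev_planner_hook = planner_hook;
    planner_hook = workmem_planner;
}

Loaded via shared_preload_libraries, something like this could try out
different per-node policies without any further core changes.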
There's still a lot of work to do, but that work could be broken into
the following mostly-independent efforts:
* You point out that it's hard for an extension to handle subplans
without walking the expression tree. I haven't looked at that problem
in detail, but it can be handled as a separate change.
* Save the planner's estimates as well, which would enable an
extension to be smarter about setting limits.
* We could consider attaching the limits to paths first, before
copying them to the plan nodes. That would enable extensions to use
set_rel_pathlist_hook to affect the structure of the plan before it's
generated, rather than just the per-node enforced limits after
planning is done. For this to be useful, we'd also need some
additional infrastructure to keep higher-cost, lower-memory paths
around, and some practical way to prevent path explosion.
* We should be more consistent about tracking memory usage by using
MemoryContextMemAllocated() where it makes sense (see the sketch after
this list).
* Consider enforcing the limit across all significant data structures
used within a node -- rather than the current behavior, where a single
node can use work_mem several times over by using multiple data
structures.
* We should free the memory from a node when execution is complete, not
wait until ExecutorEnd(). What really matters is the maximum
*concurrent* memory usage.
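
To illustrate the MemoryContextMemAllocated() point, a check at
execution time might look roughly like this (sketch only; "limit_kb"
stands in for the proposed per-node field):

#include "postgres.h"
#include "utils/memutils.h"

/*
 * Has this node exceeded its work-memory limit?  Passing recurse=true
 * counts all child contexts together, which also addresses the
 * multiple-data-structures point above.
 */
static bool
node_over_limit(MemoryContext nodeCxt, int limit_kb)
{
    return MemoryContextMemAllocated(nodeCxt, true) >
        (Size) limit_kb * 1024;
}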
Regards,
Jeff Davis