| From: | Tomas Vondra <tomas(at)vondra(dot)me> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Subquery pull-up increases jointree search space |
| Date: | 2026-06-09 10:43:04 |
| Message-ID: | 8b2c8e08-b02f-4418-952f-be8eab31e887@vondra.me |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
I kept poking at the patch releasing some transient sublists, trying to
make it release even more memory. Attached is a couple hacky patches
releasing memory in a couple more places:
0001 - Tom's patch releasing a couple transient lists
0002 - releases a couple more transient lists in nearby functions
0003 - release yet more transient lists in equivclass.c
0004 - release transient lists in estimate_num_groups
0005 - free lists used by mergejoin paths
0006 - free lists used by hahsjoin paths
I think 0002-0004 are a relatively straightforward extensions of the
0001 patch, releasing lists that are not used outside a function.
0005+0006 are a bit more invasive, but also save much more memory.
The places constructing mergejoin/hashjoin paths build lists for
inner/outer keys, merge/hash clauses etc. But then those are leaked,
because (a) it's not clear if a join using those lists was actually
created (or if add_path threw it away), and (b) add_path may keep it and
discard it sometime later. But that releases just the path, not the lists.
So 0005+0006 change the contract a bit:
(a) create_hashjoin_path/create_mergejoin_path always copy these list
(b) add_path frees these list when discarding a mergejoin/hashjoin path
(c) the places creating join paths free the lists too
This is a bit ugly, because sometimes the lists may be the same etc. But
it does work well enough for a hacky PoC (passes make check).
With these patches, the memory usage (per log_planner_stats) drops to
~500MB. That's a nice improvement, from the original 2.6GB. The larger
query that used ~17GB now needs ~3GB of memory.
I was wondering how this impacts planning time, so I ran the original
test with 2-11 tables, with 1000 joins for each join size. Attached are
two charts showing the average plan time and memory usage. The patched
build seems consistently faster - for smaller joins (up to ~8 tables)
the differences are small, not really visible in these charts. As the
joins grow it's getting much more visible, and the differences increase.
I redid the memory tracing for the "smaller" query (which now needs
~500MB), and it looks like this
allocation_group | pg_size_pretty
--------------------------------------+----------------
create_memoize_path | 92 MB
get_joinrel_parampathinfo | 65 MB
estimate_num_groups | 47 MB
add_paths_to_joinrel | 38 MB
generate_join_implied_equalities | 24 MB
| 15 MB
create_material_path | 15 MB
calc_nestloop_required_outer | 5224 kB
find_mergeclauses_for_outer_pathkeys | 0 bytes
make_inner_pathkeys_for_merge | 0 bytes
(10 rows)
It shouldn't be hard to get rid of estimate_num_groups entirely, by
freeing the GroupVarInfo entries (and not just the lists).
But after that, it's going to be harder. I'm not sure what to do about
the memoize/material paths - we're leaking these paths because various
places in joinpath.c try to "inject" these paths below a join. But this
way the logic in add_path may not free the path - we may not even get to
add_path, and even if we do, it operates on the join, not this new made
up memoize/material path. The callers have no idea if the path happens
to be used or not, so can't free it either.
I'm not sure what to do about this. I suppose it'd be good to have a
better idea if the path got used, in some way.
regards
--
Tomas Vondra
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-Tom-s-patch.patch | text/x-patch | 2.4 KB |
| v2-0002-release-more-lists.patch | text/x-patch | 1.5 KB |
| v2-0003-free-lists-in-equivclasses.patch | text/x-patch | 1.3 KB |
| v2-0004-fix-estimate_num_groups.patch | text/x-patch | 1.4 KB |
| v2-0005-free-mergejoin-keys.patch | text/x-patch | 5.1 KB |
| v2-0006-free-hashjoin-keys.patch | text/x-patch | 3.0 KB |
|
image/png | 12.1 KB |
|
image/png | 11.1 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | solai v | 2026-06-09 11:00:21 | Re: Use pg_current_xact_id() instead of deprecated txid_current() |
| Previous Message | Jakub Wartak | 2026-06-09 10:32:31 | Re: Heads Up: cirrus-ci is shutting down June 1st |