Re: Subquery pull-up increases jointree search space

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Subquery pull-up increases jointree search space
Date: 2026-06-09 10:43:04
Message-ID: 8b2c8e08-b02f-4418-952f-be8eab31e887@vondra.me
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I kept poking at the patch releasing some transient sublists, trying to
make it release even more memory. Attached is a couple hacky patches
releasing memory in a couple more places:

0001 - Tom's patch releasing a couple transient lists

0002 - releases a couple more transient lists in nearby functions

0003 - release yet more transient lists in equivclass.c

0004 - release transient lists in estimate_num_groups

0005 - free lists used by mergejoin paths

0006 - free lists used by hahsjoin paths

I think 0002-0004 are a relatively straightforward extensions of the
0001 patch, releasing lists that are not used outside a function.

0005+0006 are a bit more invasive, but also save much more memory.

The places constructing mergejoin/hashjoin paths build lists for
inner/outer keys, merge/hash clauses etc. But then those are leaked,
because (a) it's not clear if a join using those lists was actually
created (or if add_path threw it away), and (b) add_path may keep it and
discard it sometime later. But that releases just the path, not the lists.

So 0005+0006 change the contract a bit:

(a) create_hashjoin_path/create_mergejoin_path always copy these list

(b) add_path frees these list when discarding a mergejoin/hashjoin path

(c) the places creating join paths free the lists too

This is a bit ugly, because sometimes the lists may be the same etc. But
it does work well enough for a hacky PoC (passes make check).

With these patches, the memory usage (per log_planner_stats) drops to
~500MB. That's a nice improvement, from the original 2.6GB. The larger
query that used ~17GB now needs ~3GB of memory.

I was wondering how this impacts planning time, so I ran the original
test with 2-11 tables, with 1000 joins for each join size. Attached are
two charts showing the average plan time and memory usage. The patched
build seems consistently faster - for smaller joins (up to ~8 tables)
the differences are small, not really visible in these charts. As the
joins grow it's getting much more visible, and the differences increase.

I redid the memory tracing for the "smaller" query (which now needs
~500MB), and it looks like this

allocation_group | pg_size_pretty
--------------------------------------+----------------
create_memoize_path | 92 MB
get_joinrel_parampathinfo | 65 MB
estimate_num_groups | 47 MB
add_paths_to_joinrel | 38 MB
generate_join_implied_equalities | 24 MB
| 15 MB
create_material_path | 15 MB
calc_nestloop_required_outer | 5224 kB
find_mergeclauses_for_outer_pathkeys | 0 bytes
make_inner_pathkeys_for_merge | 0 bytes
(10 rows)

It shouldn't be hard to get rid of estimate_num_groups entirely, by
freeing the GroupVarInfo entries (and not just the lists).

But after that, it's going to be harder. I'm not sure what to do about
the memoize/material paths - we're leaking these paths because various
places in joinpath.c try to "inject" these paths below a join. But this
way the logic in add_path may not free the path - we may not even get to
add_path, and even if we do, it operates on the join, not this new made
up memoize/material path. The callers have no idea if the path happens
to be used or not, so can't free it either.

I'm not sure what to do about this. I suppose it'd be good to have a
better idea if the path got used, in some way.

regards

--
Tomas Vondra

Attachment Content-Type Size
v2-0001-Tom-s-patch.patch text/x-patch 2.4 KB
v2-0002-release-more-lists.patch text/x-patch 1.5 KB
v2-0003-free-lists-in-equivclasses.patch text/x-patch 1.3 KB
v2-0004-fix-estimate_num_groups.patch text/x-patch 1.4 KB
v2-0005-free-mergejoin-keys.patch text/x-patch 5.1 KB
v2-0006-free-hashjoin-keys.patch text/x-patch 3.0 KB
image/png 12.1 KB
image/png 11.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message solai v 2026-06-09 11:00:21 Re: Use pg_current_xact_id() instead of deprecated txid_current()
Previous Message Jakub Wartak 2026-06-09 10:32:31 Re: Heads Up: cirrus-ci is shutting down June 1st