| From: | Tomas Vondra <tomas(at)vondra(dot)me> |
|---|---|
| To: | Andrei Lepikhov <lepihov(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Subquery pull-up increases jointree search space |
| Date: | 2026-06-08 15:27:13 |
| Message-ID: | 7008398d-1e85-4843-9a64-8bc53e875cab@vondra.me |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 6/8/26 15:32, Andrei Lepikhov wrote:
> On 07/06/2026 20:10, Tomas Vondra wrote:
>> On 6/5/26 12:43, Tomas Vondra wrote:
>> After classifying the allocations a bit, I see this:
>>
>> allocation | size
>> --------------------------------------+----------------
>> find_mergeclauses_for_outer_pathkeys | 671 MB
>> generate_join_implied_equalities | 660 MB
>> add_paths_to_joinrel | 242 MB
>> make_inner_pathkeys_for_merge | 195 MB
>> estimate_num_groups | 92 MB
>> get_joinrel_parampathinfo | 90 MB
>> create_memoize_path | 48 MB
>> | 15 MB
>> calc_nestloop_required_outer | 5226 kB
>> (9 rows)
>
> It is fun to use the benchmark script from the previous research on this topic
> [1], conducted in 2009. Query, consuming a lot of memory and planning time [2],
> now shows the following numbers:
>
> join_collapse_limit = 12: Memory: used=0.8GB, Time: 2.5s
> join_collapse_limit = 14: Memory: used=1.1GB, Time: 5.5s
> join_collapse_limit = 18: Memory: used=9.3GB, Time: 50.6s
>
> It's interesting to compare these results with your memory consumption profile.
> I tried using the memtrace patch and the Python script directly, but they gave
> unusual output on my machine, so I couldn't rely on them without more detailed
> instructions.
>
Yeah, the memtrace patch is very hacky, and I haven't given enough
instructions how to use it. The .patch adds elog(LOG) with the MEMTRACE
information (context, action, chunk, size), and the memtrace.py
aggregates it like this:
grep MEMTRACE pg.log | ./memtrace.py > memtrace.log
But the heaptrack seems more convenient, of course.
> Instead, I used the heaptrack tool, which gave me the following
profiles:
> bytes pct function
> collapse limit = 14:
> 354.48 MB 29.3% get_relation_foreign_keys
> 109.07 MB 9.0% generate_join_implied_equalities_normal
> 75.51 MB 6.2% get_eclass_indexes_for_relids
> 67.12 MB 5.5% find_mergeclauses_for_outer_pathkeys
> 67.12 MB 5.5% create_nestloop_path
> 58.73 MB 4.9% make_inner_pathkeys_for_merge
> 38.27 MB 3.2% expression_tree_mutator_impl
> 33.56 MB 2.8% hash_inner_and_outer
> collapse limit = 18:
> 1.23 GB 12.8% generate_join_implied_equalities_normal
> 1.11 GB 11.5% get_joinrel_parampathinfo
> 989.90 MB 10.3% get_eclass_indexes_for_relids
> 755.04 MB 7.9% find_mergeclauses_for_outer_pathkeys
> 721.44 MB 7.5% calc_nestloop_required_outer
> 721.43 MB 7.5% get_param_path_clause_serials
> 629.15 MB 6.5% bms_intersect
> 578.81 MB 6.0% have_unsafe_outer_join_ref
> 411.08 MB 4.3% make_inner_pathkeys_for_merge
> 354.48 MB 3.7% get_relation_foreign_keys
> 335.58 MB 3.5% create_nestloop_path
> 301.99 MB 3.1% generate_join_implied_equalities
>
Hmm, how does heaptrack deal with out memory pools? I was worried
existing memory profilers (like heaptrack) would get confused by our
memory contexts, attributing the whole block to the palloc that just
happens to allocate a new block. But that's not really right.
I see the heaptrack README claims it can work with memory pools after
annotating the code in some way. But there's not much details about
that. Also, it suggests valgrind/massif can already do that.
> Overall, the results for generate_join_implied_equalities and
> find_mergeclauses_for_outer_pathkeys are consistent. This test also highlights
> other sources of memory allocations, such as parameterised paths. The memory
> profile changes as the number of joins in the 'join problem' increases.
>
Yes. I did actually mention get_joinrel_parampathinfo.
regards
--
Tomas Vondra
| From | Date | Subject | |
|---|---|---|---|
| Next Message | ZizhuanLiu X-MAN | 2026-06-08 15:31:21 | Re: [PATCH] Doc: Mention OFF as an alias for EXPLAIN SERIALIZE NONE |
| Previous Message | Andres Freund | 2026-06-08 15:10:31 | Re: Upload only the failed tests logs to the Postgres CI (Cirrus CI) |