Re: Use merge-based matching for MCVs in eqjoinsel

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use merge-based matching for MCVs in eqjoinsel
Date: 2025-09-03 16:53:51
Message-ID: 6db3a5c9-eeb0-4549-bbf5-da8c649851f0@tantorlabs.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Following up on my previous messages about optimizing eqjoinsel() and
eqjoinsel_semi() for Var1 = Var2 clauses, I’d like to share detailed
profiling results showing the effect of the patch on JOB for different
values of default_statistics_target.

The first table shows the total planner time (summed over all 113
queries) before and after applying the patch, along with the speedup
achieved:

default_statistics_target | Planner Speedup (×) | Planner Before (ms) |
Planner After (ms)
--------------------------+---------------------+---------------------+--------------------
                     100  | *1.00x*       | 1828.433     |        1820.556
                    1000  | *1.12x*       | 2194.282     |        1963.110
                    2500  | *2.15x*       | 4606.705     |        2140.126
                    5000  | *6.37x*       | 16661.581     |        2616.109
                    7500  | *11.76x*       | 35988.569     |       
3061.161
                   10000  | *19.01x*       | 66616.620     |       
3504.144

The second table shows the profiling of eqjoinsel() using *perf*,
demonstrating that the function, which dominates planning at high
statistics targets, becomes essentially negligible after the patch:

default_statistics_target | eqjoinsel() Before (perf) | eqjoinsel()
After (perf)
--------------------------+---------------------------+--------------------------
                     100  |                     0.01%
|                     0.04%
                    1000  |                     6.23%
|                     0.06%
                    2500  |                    35.45%
|                     0.23%
                    5000  |                    66.14%
|                     0.53%
                    7500  |                    72.70%
|                     0.97%
                   10000  |                    75.42%
|                     1.25%

I’ve attached v3 of the patch. This version adds a check for NULL values
when comparing MCV entries, ensuring correctness in edge cases.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com

Attachment Content-Type Size
v3-0001-Optimize-selectivity-estimation-for-Var-Var-clauses.patch text/x-patch 16.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2025-09-03 17:14:20 Re: Generate GUC tables from .dat file
Previous Message Srirama Kucherlapati 2025-09-03 16:44:00 RE: AIX support