Re: Use merge-based matching for MCVs in eqjoinsel

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: David Geier <geidav(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Use merge-based matching for MCVs in eqjoinsel
Date: 2025-11-19 15:52:08
Message-ID: abd67dfc-8505-4821-937b-0c003e9b5e28@tantorlabs.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 19.11.2025 18:38, David Geier wrote:
> On 19.11.2025 03:19, Tom Lane wrote:
>
>> I spent some effort on actually measuring timings of the v6 patch,
>> and concluded that this is all splitting hairs that we don't need
>> to split. The actual crossover between hash-loses and hash-wins
>> is more than what my theoretical argument suggested, but still
>> probably less than 100 MCVs on each side. I think we should go with
>>
>> (sslot1.nvalues + sslot2.nvalues) >= 200
>>
>> and call it good.
>>
>> To arrive at this result, I built the v6 patchset with
>> EQJOINSEL_MCV_HASH_THRESHOLD changed to either 0 (to force hashing)
>> or 1000000 (to prevent it). I then ran the attached scripts with
>> different values of "nstats" and collected timings from the postmaster
>> log output produced by the 0001 patch.
> Thanks for working out the details!
>
> I've ran your script on my development machine with 1000, 100 and 50
> MCVs with the following results. As the runtimes had quite some variance
> I didn't bother trying more variations. I think your proposal to go with
> 200 is fine.
>
> nstats | off INT | off TEXT | on INT | on TEXT
> -------------------------------------|------
> 1000 | 697 | 8907 | 14 | 2417
> 100 | 13.7 | 213 | 2.3 | 239
> 50 | 1.4 | 7.6 | 1.5 | 49
>
> The results suggest that the hash function for the non-deterministic
> collation is really slow. If we could properly include the operator
> cost, we could enable the optimization earlier in the case of simple
> data types such as INT. That can be future work.

LGTM

For simple types (integer columns), both algorithms finish in a couple
milliseconds when the MCV counts are under 100, and the difference
between them is very small (JOB results show the same trend). For text
types, the planning time shifts gradually from one algorithm to the
other around that range, without any sharp transition. And it seems to
me that the current criterion is a reasonable compromise, without
requiring us to complicate the threshold any further.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2025-11-19 16:02:08 Re: pgsql: doc: remove verbiage about "receiving" data from rep. slots
Previous Message Tom Lane 2025-11-19 15:48:06 Re: Update timezone to C99