From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, Vaibhav Jain <jainva(at)google(dot)com>, pgsql-hackers(at)postgresql(dot)org, Madhukar <madhukarprasad(at)google(dot)com>, Sangeetha Seshadri <sangsesh(at)google(dot)com>
Subject: Re: Fix overflow of nbatch
Date: 2025-10-09 14:30:06
Message-ID: 3258ad4c-4fbf-4557-9757-3d286a69ae42@vondra.me
Lists: pgsql-hackers
On 10/9/25 16:16, Melanie Plageman wrote:
> On Wed, Oct 8, 2025 at 4:46 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>
>> On 10/8/25 19:37, Melanie Plageman wrote:
>>
>>> Yes, this was wrong. I forgot an extra / sizeof(HashJoinTuple). I meant:
>>>
>>> hash_table_bytes / sizeof(HashJoinTuple) > MaxAllocSize /
>>> sizeof(HashJoinTuple) / 2
>>
>> But the hash table is not allocated as a single chunk of memory, so I
>> think MaxAllocSize would be the wrong thing to use here, no? Hash table
>> is a collection of tuples (allocated one by one), that's why
>> get_hash_memory_limit() uses SIZE_MAX. MaxAllocSize matters for
>> nbuckets, because that indeed is allocated as a contiguous array, ofc.
>>
>> Also, why bother with /sizeof(HashJoinTuple) on both sides? Without it
>> we get
>>
>> hash_table_bytes > MaxAllocSize / 2
>>
>> but again, that doesn't make much sense - the hash table can be larger,
>> it's not a single palloc.
>
> It came from the earlier clamping of nbuckets:
>
> max_pointers = hash_table_bytes / sizeof(HashJoinTuple);
> max_pointers = Min(max_pointers, MaxAllocSize / sizeof(HashJoinTuple));
> dbuckets = Min(dbuckets, max_pointers);
>
> I don't really get why this divides hash_table_bytes by
> sizeof(HashJoinTuple) since, as you say, hash_table_bytes is supposed
> to accommodate both the bucket_bytes and inner_rel_bytes.
>
I think that's simply to allocate enough buckets for the expected
number of tuples, not more, and to cap the bucket array at
MaxAllocSize. Some of this may be redundant, I suppose.
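
To spell out my reading with comments (a simplified annotation, not the
exact nodeHash.c comments):

    /* no point in more bucket headers than tuples that fit the budget */
    max_pointers = hash_table_bytes / sizeof(HashJoinTuple);

    /*
     * The bucket array is a single contiguous palloc of HashJoinTuple
     * pointers, so it has to fit in MaxAllocSize (the tuples themselves
     * are palloc'd one by one and only bounded by SIZE_MAX).
     */
    max_pointers = Min(max_pointers, MaxAllocSize / sizeof(HashJoinTuple));

    /* and never more buckets than the estimated number of tuples */
    dbuckets = Min(dbuckets, max_pointers);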
>> And I think it should be just
>>
>> if (hash_table_bytes > SIZE_MAX / 2)
>> break;
>>
>> as a protection against hash_table_bytes overflowing SIZE_MAX (which on
>> 64-bits seems implausible, but could happen on 32-bit builds).
>
> That's roughly the check I ended up with -- except I used
> space_allowed because it will be larger than hash_table_bytes if there
> is a skew hashtable. I've updated the comment and such in attached v3.
>
Ah, thanks. I was just hacking on this too, but I'll switch to your v3.
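
For the archives, the guard I had been sketching in the nbatch-halving
loop looked roughly like this (simplified and partly hypothetical; v3
tests *space_allowed instead, since that also covers the skew
hashtable):

    while (nbatch > 0)
    {
        size_t      current_space;
        size_t      new_space;

        /*
         * Doubling hash_table_bytes below could overflow size_t, which
         * is implausible on 64-bit but possible on 32-bit builds.
         */
        if (hash_table_bytes > SIZE_MAX / 2)
            break;

        /* memory used with the current nbatch vs. with half the batches */
        current_space = hash_table_bytes + (size_t) 2 * nbatch * BLCKSZ;
        new_space = hash_table_bytes * 2 + (size_t) nbatch * BLCKSZ;

        /* stop once halving the batches no longer reduces total memory */
        if (current_space < new_space)
            break;

        nbatch /= 2;
        nbuckets *= 2;
        hash_table_bytes *= 2;
    }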
regards
--
Tomas Vondra