From: | Tomas Vondra <tomas(at)vondra(dot)me> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, Vaibhav Jain <jainva(at)google(dot)com>, pgsql-hackers(at)postgresql(dot)org, Madhukar <madhukarprasad(at)google(dot)com>, Sangeetha Seshadri <sangsesh(at)google(dot)com> |
Subject: | Re: Fix overflow of nbatch |
Date: | 2025-10-08 21:08:45 |
Message-ID: | 5e073665-a73b-42c3-a483-253ab7a6fd87@vondra.me |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 10/8/25 21:16, Melanie Plageman wrote:
> On Wed, Oct 8, 2025 at 1:37 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
>>
>> I have updated my patch to fix the mistakes above. I also noticed then
>> that I wasn't doubling space_allowed in the loop but instead setting
>> it to hash_table_bytes at the end. This doesn't produce a power of 2
>> because we subtract skew_mcvs from the hash_table_bytes. So, we have
>> to keep using space_allowed if we want a power of 2 in the end.
>>
>> I've changed my patch to do this, but this made me wonder if we want
>> to be doing this or instead take hash_table_bytes at the end and round
>> it up to a power of 2 and set space_allowed to that. If the skew
>> hashtable is large, we may be allocating way more space_allowed than
>> we need for new hash_table_bytes + skew hashtable buckets.
>
I don't think there's any promise hash_table_bytes being a power of 2.
You can make hash_table_bytes an almost arbitrary value by setting
work_mem and hash_mem_multiplier. Or am I missing something?
But you're right hash_table_bytes and space_allowed may not be equal if
useskew=true. So setting space_allowed to hash_table_bytes at the end
does not seem right. I think we don't actually need hash_table_bytes at
this point, we can just ignore it, and use/double *space_allowed.
I kept using hash_table_bytes mostly because it didn't require the
pointer dereferencing, but I failed to consider the useskew=true thing.
However, this means there's probably a bug - the loop should probably
double num_skew_mcvs too. We simply reserve SKEW_HASH_MEM_PERCENT of
space_allowed for skew hashtable, so should we adjust it the same way?
> Oh wait, that doesn't make sense because each batch could have a skew hashtable.
>
Not sure I understand. Is this the same issue I just described?
regards
--
Tomas Vondra
From | Date | Subject | |
---|---|---|---|
Next Message | Jacob Champion | 2025-10-08 21:08:49 | Re: Thoughts on a "global" client configuration? |
Previous Message | Andres Freund | 2025-10-08 20:52:17 | Re: Should we update the random_page_cost default value? |