From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, Vaibhav Jain <jainva(at)google(dot)com>, pgsql-hackers(at)postgresql(dot)org, Madhukar <madhukarprasad(at)google(dot)com>, Sangeetha Seshadri <sangsesh(at)google(dot)com> |
Subject: | Re: Fix overflow of nbatch |
Date: | 2025-10-13 16:05:11 |
Message-ID: | CAAKRu_bZVZY8XWt5pSif2RV1sJSpJhbd9pON=LUvofqj1FqJvw@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Oct 9, 2025 at 7:36 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> 1) A couple comments adjusted. It feels a bit too audacious to correct
> comments written by native speaker, but it seems cleaner to me like this.
I attached a patch with a few more suggested adjustments (0003). The
more substantive tweaks are:
I don't really like that this comment says it is about nbuckets
overflowing MaxAllocSize because overflow means something specific and
this sounds like we are saying the nbuckets variable will overflow
MaxAllocSize but what we mean is that nbuckets worth of HashJoinTuples
could overflow MaxAllocSize. You don't have to use my wording, but I'm
not sure about this wording either.
/* Check that nbuckets wont't overflow MaxAllocSize */
if (nbuckets > MaxAllocSize / sizeof(HashJoinTuple) / 2)
break;
Also, upon re-reading the comment we wrote together:
* With extremely low work_mem values, nbuckets may have been set
* higher than hash_table_bytes / sizeof(HashJoinTuple). We don't try
* to correct that here, we accept nbuckets to be oversized.
I'm not so sure it belongs above the nbuckets allocation check.
Perhaps we should move it to where we are doubling nbuckets. Or even
above where we clamp it to 1024.
I'm actually wondering if we want this comment at all. Does it
emphasize an edge case to a confusing point? I'm imagining myself
studying it in the future and having no idea what it means.
I've kept it in but moved it in 0003.
> 4) added the doubling of num_skew_mcvs, and also the overflow protection
> for that
Do we really need to check if num_skew_mcvs will overflow? Shouldn't
it always be smaller than nbuckets? Maybe it can be an assert.
> You suggested this in another message:
>
> > We do need to recalculate num_skew_mcvs once at the end before
> > returning.
>
> But I think the doubling has the same effect, right? I don't want to
> redo the whole "if (useskew) { ... }" block at the end.
Yea, it would have to be some kind of helper or something. I worried
just doubling num_skew_mcvs would drift significantly because of
integer truncation -- perhaps even a meaningful amount. But it was
just an intuition -- I didn't plug in any numbers and try.
- Melanie
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Fix-overflow-risk-in-hashtable-size-calculation.patch | text/x-patch | 7.4 KB |
v5-0002-adjustments.patch | text/x-patch | 3.1 KB |
v5-0003-more-tweaks.patch | text/x-patch | 3.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2025-10-13 16:10:56 | Re: add function argument name to substring and substr |
Previous Message | Ashutosh Bapat | 2025-10-13 15:58:09 | Re: Changing shared_buffers without restart |