Re: Fix overflow of nbatch

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Vaibhav Jain <jainva(at)google(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Madhukar <madhukarprasad(at)google(dot)com>, Sangeetha Seshadri <sangsesh(at)google(dot)com>
Subject: Re: Fix overflow of nbatch
Date: 2025-09-22 20:45:12
Message-ID: CAApHDvq-yVY42m6_KwZEhAG5LA9pAQJMEuS8J94e_=5uXfVNMQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 23 Sept 2025 at 01:57, Vaibhav Jain <jainva(at)google(dot)com> wrote:
> With a1b4f28, to compute current_space, nbatch is being multiplied
> by BLCKSZ. nbatch is int and when multiplied with BLCKSZ, it can
> easily overflow the int limit.To keep the calculation safe for
> current_space, convert nbatch to size_t.

Thanks for finding and reporting this.

I think a1b4f289b mistakenly thought that there'd be size_t arithmetic
in the following two lines because the final result is a size_t:

size_t current_space = hash_table_bytes + (2 * nbatch * BLCKSZ);
size_t new_space = hash_table_bytes * 2 + (nbatch * BLCKSZ);

I'd rather see this fixed by adding a cast, i.e.: "(size_t) nbatch" on
the above two lines. All that code is new in a1b4f289b, and if you
change the nbatch to a size_t it affects the code that existed before
a1b4f289b, and from looking over that code, it seems to ensure that
nbatch does not overflow int by doing "dbatch = Min(dbatch,
max_pointers);", where max_pointers is constrained by MaxAllocSize /
sizeof(HashJoinTuple).

Also, making nbatches size_t makes the final line of " *numbatches =
nbatch;" somewhat questionable since numbatches is an int pointer.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2025-09-22 21:19:28 Re: Thoughts on NBASE=100000000
Previous Message Jacob Champion 2025-09-22 20:30:04 Re: RFC: adding pytest as a supported test framework