Re: Small and unlikely overflow hazard in bms_next_member()

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Small and unlikely overflow hazard in bms_next_member()
Date: 2026-04-13 00:23:26
Message-ID: 5E98992A-62D3-4961-B2F1-61246DB9D814@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Apr 12, 2026, at 21:17, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Fri, 3 Apr 2026 at 15:08, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>> IMO, if we can make bitmapset.c work with INT_MAX members and get a
>> performance increase, then we should do it.
>
> Re-thinking this after a week's holiday, it seems fine to use an
> unsigned 32-bit int rather than a 64-bit int to fix this bug. I'd
> previously been uncertain if there were any guarantees in C to what
> (unsigned int) -1 would return, but going by [1] at 6.3.1.3, it says:
>
> "Otherwise, if the new type is unsigned, the value is converted by
> repeatedly adding or subtracting one more than the maximum value that
> can be represented in the new type until the value is in the range of
> the new type."
>
> So, it seems even on one's complement that -1 as an unsigned int will
> be UINT_MAX. When we add 1 to UINT_MAX, we're guaranteed to get 0, as
> it's unsigned maths and overflows are going to result in a value
> modulus the max value for the type.
>
> That leads me to the attached v2 patch. Compiler Explorer link showing
> the assembly at [2].
>
> Testing the performance, it's better than master, so I got rid of the
> size_t wordnum stuff. We're post-freeze now, so fiddling with other
> optimisations seems a bit off the table as there appears to be no
> performance regression to compensate for now, per:
>
> drowley(at)amd3990x:~$ gcc test_bms_next3.c -O2 -o test_bms_next3 &&
> ./test_bms_next3
> Benchmarking 100000000 bms_next_member iterations...
> master: 1.18330 seconds
> Patched: 1.05493 seconds
>
> Benchmarking 100000000 bms_prev_member iterations...
> master: 2.94522 seconds
> Patched: 1.86130 seconds
>
> drowley(at)amd3990x:~$ clang test_bms_next3.c -O2 -o test_bms_next3 &&
> ./test_bms_next3
> Benchmarking 100000000 bms_next_member iterations...
> master: 1.07860 seconds
> Patched: 1.07896 seconds
>
> Benchmarking 100000000 bms_prev_member iterations...
> master: 2.76550 seconds
> Patched: 2.12369 seconds
>
> David
>
> [1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
> [2] https://godbolt.org/z/xW96rxd3P
> <test_bms_next3.c><v2-0001-Fix-unlikely-overflow-bug-in-bms_next_member.patch>

I just tried test_bms_next3 on my MacBook, looks like patched bms_prev_member is much faster there. I ran about 10 times, the results were consistent.
```
chaol(at)ChaodeMacBook-Air test % gcc test_bms_next3.c -O2 -o test_bms_next3
test_bms_next3.c:281:18: warning: format specifies type 'long' but the argument has type 'int64' (aka 'long long') [-Wformat]
281 | printf("%ld\n", count);
| ~~~ ^~~~~
| %lld
1 warning generated.
chaol(at)ChaodeMacBook-Air test % ./test_bms_next3
Benchmarking 100000000 bms_next_member iterations...
master: 0.49287 seconds
Patched: 0.47315 seconds

Benchmarking 100000000 bms_prev_member iterations...
master: 1.04644 seconds
Patched: 0.58053 seconds
2800000000
```

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2026-04-13 00:37:55 test_compression, test module for low-level compression APIs (for 2b5ba2a0a141)
Previous Message Michael Paquier 2026-04-13 00:16:29 Re: Fix pgstat_database.c to honor passed database OIDs