| From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Small and unlikely overflow hazard in bms_next_member() |
| Date: | 2026-04-03 02:08:58 |
| Message-ID: | CAApHDvqTUm3Cbgz3ZLV+ad8s_HJHZYrVbrBvGyPQdxCRR-6dvA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, 3 Apr 2026 at 11:12, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Thu, 2 Apr 2026 at 17:22, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I don't think we should add cycles here for this purpose.
>
> I'm not keen on slowing things down for this either. I did do some
> experiments in [1] that sees fewer instructions from using 64-bit
> maths. I might go off and see if there are any wins there that also
> give us the INT_MAX fix. It's not great effort to reward ratio
> though...
The reduction in instructions with the patched version got me curious
to see if it would translate into a performance increase. I tested on
an AMD Zen2 machine, and it's a decent amount faster than master. I
tested with gcc and clang.
I also scanned over the remaining parts of bitmapset.c and didn't find
anywhere else that has overflow risk aside from what you pointed out
in bms_prev_member().
The attached patch contains the benchmark function I added to the
test_bitmapset module. It should apply to master with a bit of noise.
CREATE EXTENSION test_bitmapset;
SELECT
generate_series(1,3) AS run,
bench_bms_next_member('(b 1 2 3 4 5 6 7 8 64)', 1000000)/1000 AS
bms_next_member_us,
bench_bms_prev_member('(b 1 2 3 4 5 6 7 8 64)', 1000000)/1000 AS
bms_prev_member_us;
master (gcc)
run | bms_next_member_us | bms_prev_member_us
-----+--------------------+--------------------
1 | 26473 | 40404
2 | 26218 | 40413
3 | 26209 | 40387
patched (gcc)
run | bms_next_member_us | bms_prev_member_us
-----+--------------------+--------------------
1 | 25409 | 29705
2 | 24905 | 29693
3 | 24870 | 29707
Times are in microseconds to do 1 million bms_*_member() loops over
the entire set.
I've also attached the full results I got. I've also included the
results from Chao's version, which does slow things down decently on
clang.
IMO, if we can make bitmapset.c work with INT_MAX members and get a
performance increase, then we should do it.
David
| Attachment | Content-Type | Size |
|---|---|---|
| benchmark_results.txt | text/plain | 8.0 KB |
| bms_fixes.patch | application/octet-stream | 5.0 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Robert Haas | 2026-04-03 02:15:36 | Re: pg_plan_advice |
| Previous Message | Robert Haas | 2026-04-03 02:08:51 | Re: pg_plan_advice |