From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Greg Burd <greg(at)burd(dot)me> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Subject: | Re: [PATCH] Add tests for Bitmapset |
Date: | 2025-09-19 06:36:30 |
Message-ID: | aMz57nHgkLozddYu@paquier.xyz |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Sep 18, 2025 at 02:50:45PM -0400, Greg Burd wrote:
> Thanks for all the feedback and time spent reviewing this patch. I
> switched out the encode/decode functions to use the nodeToString() and
> stringToNode() functions and change all the SQL testing function
> signatures to TEXT from BYTEA. This exercises more code and that's good
> for testing purposes. I took out the code that checks the hash values,
> I can't think of a reason that should cause a future failure should that
> change and output different values. I re-integrated the earlier random
> testing function along with the newer code. I'm not sure this buys us
> much if anything.
>
> coverage: HEAD v7
> lines......: 87.1 90.5 +3.4%
> functions..: 100.0 100.0 +0.0%
> branches...: 63.5 75.9 +12.4%
Nathan has given me his blessing, so as I could look at this patch and
put my hands on it. I have spotted one bug, and things that could be
done a bit better. Please see below.
As far as I am concerned there is a mistake with bms_overlap_list().
This API checks a Bitmapset with a list of integers, but you have
coded things so as it checks for a list of Bitmapset. I think that
test_bms_overlap_list() should take in input a int4 array, then the
code in charge of the conversion from the array to the list needs to
be tweaked a little bit. With the current code, we could get random
failures with negative members included in a Bitmapset, depending on
what's on the stack.
I am not really convinced about the value brought by bms_values() now
that the output is generated for all the SQL functions using the
"decode" and "encode" routines (which may be actually less confusing
if renamed as "bms_to_text" and "text_to_bms", but that's a personal
take), because this results in doing an extra round-trip between text
and Bitmapset. The tests don't rely much on NULL as input to
translate that as "(b)". bmsToString() is just a direct call to what
the decode and encode routines do.
Same remark about test_bms_from_array(), that mimicks nodeRead(), in
charge of doing a int4[] -> Bitmapset conversion. Wouldn't it be
simpler to just use "(b 1 2 3)" in input and let the decode/encode
routines do the job?
Let's replace the test descriptions in the queries that we know have
an individual result by comments. For example all the "Range
operations". These descriptions are useful in the UNION queries,
where we want to rely on a set of Bitmapsets for multiple operations,
to be able to know the description of the result. For individual
queries, this bloats the output.
The numbering in the tests are not that useful to have. If the test
suite is updated, that would mean more updates required if we add a
new block in the middle.
+SELECT 'overlapping ranges' as test,
+ bms_values(test_bms_union(
+ test_bms_add_range(NULL, 50, 150),
+ test_bms_add_range(NULL, 100, 200)
The output generated by this query leads to a bitmapset with 200
members. It would be simpler and more readable to reduce these
ranges, or is there a reason for these to be large? Same remark for
"difference subtraction edge case" with its 50 members.
+SELECT 'add_range large range' as test, bms_values(test_bms_add_range(NULL, 0, 1000)) as result;
This one also is funky with its output of 1000 elements. Any reason
this is justified in terms of coverage? If yes, we could use some
substr() calls to get the end and the beginning of the output
generated, perhaps to reduce the output of the whole, or the length,
or something like that relevant enough to validate the output. There
is a second test a bit down that uses a range of [1000,1100], as well.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Florents Tselai | 2025-09-19 06:45:21 | Re: encode/decode support for base64url |
Previous Message | Fujii Masao | 2025-09-19 06:30:11 | Re: Invalid primary_slot_name triggers warnings in all processes on reload |