Re: [PATCH] Add tests for Bitmapset

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Greg Burd <greg(at)burd(dot)me>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: [PATCH] Add tests for Bitmapset
Date: 2025-09-17 03:25:09
Message-ID: aMoqFTfa-Or-rMFt@paquier.xyz
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 16, 2025 at 03:04:39PM -0400, Greg Burd wrote:
> I've re-written the set of tests as suggested (I hope!). I've not
> re-run the coverage report (yet) to ensure we're at least as well
> covered as v3, but attached is v4 for your inspection (amusement?).
> I've tried to author the SQL tests such that failures clearly indicate
> what's at gone wrong.

Thanks for the update, that's easier to follow, even if it's more
code overall. It looks like all the 29 APIs are present, plus the
extras for the array mapping routines required for the tests with the
list APIs.

+static void
+elog_bitmapset(int elevel, const char *label, const Bitmapset *bms)

This routine, that's able to translate a bytea into a readable form,
is also something that I would have added as an extra function, then
used it in some of the tests where I would care about the bitmap in
output generated after a given operation, removing the DEBUG parts
that I guess have also a risk of rotting over time. That would make
the tests for simple operations that generate a bms as a result in
easier to read, because we would call less SQL calls. For example,
take the "add_member idempotent" case, that calls twice
test_bms_make_singleton() to compare the output byteas, we could just
do a test_bms_add_member(test_bms_make_singleton(10), 10) and show the
output?

We could perhaps use the CTE style for the basic calls, to reduce the
test_bms_make_singleton() calls. What you are doing feels OK as well,
just a thought in passing.

FWIW, I tend to write the regression test queries so as these are
limited to 72-80 characters per line, to fit even in a small
terminals. That's just an old-school take, even if it means more
lines in the SQL input file..

Perhaps it would be worth documenting what's the value of
test_random_operations? It is a mean of testing add, del and
is_member with a modulo 1000 of the member number. With the other
functions that offer deterministic tests, I am not sure what's the
extra value gained in it.

+ bms1_data = PG_GETARG_BYTEA_PP(0);
+ if (VARSIZE_ANY_EXHDR(bms1_data) > 0)
+ {
+ bms1 = (Bitmapset *) palloc(VARSIZE_ANY_EXHDR(bms1_data));
+ memcpy(bms1, VARDATA_ANY(bms1_data), VARSIZE_ANY_EXHDR(bms1_data));
+ }

This pattern is repeated 9 times, if my fingers got it right. Perhaps
hide it in a macro, or invent an extra static inline routine that does
the translation?

+DO $$
+BEGIN
+ BEGIN
+ PERFORM test_bms_singleton_member(test_bms_from_array(ARRAY[1,2]));
+ RAISE NOTICE 'ERROR: singleton_member([1,2]) should have failed!';

Do we need all these DO blocks to catch failures? These cases bump on
elog(ERROR) in the internals of bitmapset.c, which are not something
the end-users would expect, but in a test module that's fine to
trigger. The function call would just fail, which is fine.

The queries with CTEs and UNION ALL are elegant, nice.

> This exercise turned into a lot more LOC than I'd expected, I hope it
> provides some value to the community for testing this important data
> structure and helps to codify the API clearly.

It does, at least that's my opinion. So thanks for caring about this
level of testing.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2025-09-17 03:30:44 Re: --with-llvm on 32-bit platforms?
Previous Message Zhijie Hou (Fujitsu) 2025-09-17 03:22:56 RE: Reword messages using "as" instead of "because"