Exploring pass-by-value for small Bitmapset sets

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Exploring pass-by-value for small Bitmapset sets
Date: 2026-03-03 02:55:04
Message-ID: CA+hUKGJdzw6tSTQrBZNe2f0tAr+GdttiSS7-b0aVwPDfi3uh3A@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I played around with the idea of supporting "immediate" Bitmapset
values that skip allocation and pointer chasing overheads and live in
registers, as long as they fit. Prototype-grade patch attached.

The main thing that had stopped me exploring this earlier is that it
has to work within the Node object model. Here I tried out a
technique from dynamically typed language implementations*: a Node
pointer can be tagged with zero bit = 1 meaning "this is an
immediate/unboxed value", since valid pointers to structs with alignof
> 1 can't hold odd addresses. That leaves 63 (or 31) free bits in
which to store a value. If more Node types wanted an immediate
representation then it seems like we could either steal at least one
more bit for a fixed- or variable-width type selector field, which
doesn't seem very scalable, or screw the Node object model down a bit
so we wouldn't need it**. I haven't looked into that much... for now
I just wanted the simplest hardcoded thing to let me try out the
bitmapset.c part.

Then I inserted a bunch of immediate value handling, for example:

bms_equal() has:

if (bms_is_immediate(a))
return a == b;
else if (bms_is_immediate(b))
return false;

bms_intersect() has:

if (bms_is_immediate(a) || bms_is_immediate(b))
{
bitmapword aw = bms_first_word(a);
bitmapword bw = bms_first_word(b);

return bms_make_from_one_word(aw & bw);
}

bms_make_from_one_word() has:

if (likely(bms_word_fits_in_immediate(aw)))
return bms_word_to_immediate(aw);

That turns into:

if (aw == 0)
return NULL;

aw <<= NODE_PTR_TAG_MASK_WIDTH;
aw |= NODE_PTR_T_Bitmapset;
return (Bitmapset *) aw;

There are still non-immediate Bitmapset objects with nwords == 1, but
only for sets containing 63 (or 31) as the highest member, since the
tag eats a bit and the value is shifted. That's why I added a bunch
of edge cases around that boundary to the tests.

Something involving a union or Datum-like thing rather than casts of a
pointer might be better C style, but that's superficial and I didn't
want to run around and change types all over the tree for this
demonstration. Another superficial aspect is that it feels like some
common operations are obvious candidates for inlining of at least
their happy paths.

The bitmapset.c API suits the resulting pass-by-value-sometimes
semantics perfectly: NULL is already a special case for the extreme
case of small sets, and all modifying operations return their result,
so callers already store them and require no change. (The list API
also has that property. Bitmapset replaced an earlier use of lists of
small integers to hold relids and attrs etc, and lists were probably
originally CONS chains with NIL as empty and outputs that share
structure with their input, hence this API style.)

I'm interested to hear whether people think this sort of thing is
worth pursuing, how much of a problem Bitmapset manipulation really
is, whether it's worth hacking on the Node system to achieve that, etc
etc.

* Modern pointer tagging systems often steal high bits instead (or as
well), knowing that virtual addresses are limited to 48-52 bits or
something like that by most architectures. They're also sometimes
interested in tagging some values that *are* pointers (imagine if our
node->tag were moved into the upper bits of the node pointer itself
and had to be masked off before dereferencing), whereas here I only
want to mark a value as *not* a pointer. The high end leads to very
slightly cheaper bitswizzling perhaps since you don't need to shift,
but I have dim recollections that that AIX broke that assumption and
couldn't run certain open source virtual machines for Javascript etc
as a result. So I went for the low end of the word, a traditional
version of the trick from the 32 bit days. Of course you could use
different bits/operations for different systems and word sizes if it's
really worth it, in a more developed version...

** I suspect that if we got rid of the fully general print() etc that
we inherited from Lisp (which was in fact using tagged pointers and
immediates and only doing the equivalent of node->tag in user code to
discriminate struct types, BTW) and always used more specific
print_expr() etc functions, as we already do when we're recursing,
then we probably wouldn't need know about pointer tags anywhere
outside eg bitmapset.c, if it were true that no place in the grammar
of valid Node tree productions permits discriminated unions of types
with an immediate representation. Perhaps that same line of thinking
would already work today in the parallel universe of Datums, if, say,
there were an immediate representation of tiny varlena to support
small text, numeric etc values. On the other hand, it doesn't seem
too bad to add a couple of cycles to the most general
node-type-dispatching functions, and it would be avoided in the common
case if we had a different nodeType() function that doesn't test for
them for the places with known possible types that doesn't include
immediates in our grammar of legal trees. I haven't studied any of
that very much, and for this I just did the easiest thing to make it
work...

Attachment Content-Type Size
v1-0001-Teach-bitmapset.c-to-pass-small-sets-by-value.patch text/x-patch 61.9 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message zhangqiang 2026-03-03 03:07:52 Re:doc: Clarify that empty COMMENT string removes the comment
Previous Message Chao Li 2026-03-03 02:46:51 Re: pg_dumpall --roles-only interact with other options