Re: Revise the Asserts added to bimapset manipulation functions

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Revise the Asserts added to bimapset manipulation functions
Date: 2024-01-16 03:08:36
Message-ID: CAApHDvoXPoaDYEjMj9e1ihZZZynCtGqdAppWgPZMaMQ222NAkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2 Jan 2024 at 20:18, Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> I think one purpose of introducing REALLOCATE_BITMAPSETS is to help find
> dangling pointers to Bitmapset's. From this point of view, I agree that
> we should call bms_copy_and_free() in bms_add_members(), because the
> bitmapset 'a' might be recycled (in-place modified, or pfreed).

I spoke to Andres about this in our weekly meeting and he explained it
in such I way that I now agree that it's useful to repalloc with all
modifications. I'd previously thought that when the comments mention
that some function "recycle their left input" that you could be
certain that the Bitmapset would be modified in-place, therefore any
other pointers pointing to this set should remain valid. As Andres
reminded me, that's not the case when the set becomes empty and
there's nothing particularly special about a set becoming empty so
making a clone of the set will help identify any pointers that don't
get updated as a result of the modification.

I've now adjusted the patch to have all modifications to Bitmapsets
return a newly allocated set. There are a few cases missing in master
that need to be fixed (items 6-10 below):

The patch now includes changes for the following:

1. Document what REALLOCATE_BITMAPSETS is for.
2. Provide a reusable function to check a set is valid and use it in
cassert builds and use it to validate sets (Richard)
3. Provide a reusable function to copy a set and pfree the original
and use that instead of duplicating that code all over bitmapset.c
4. Fix Assert duplication (Richard)
5. Make comments in bms_union, bms_intersect, bms_difference clear
that a new set is allocated. (This has annoyed me for a while)
6. Fix failure to duplicate the set in bms_add_members() when b == NULL.
7. Fix failure to duplicate the set in bms_add_range() when upper < lower
8. Fix failure to duplicate the set in bms_add_range() when the set
has enough words already.
9. Fix failure to duplicate the set in bms_del_members() when b == NULL
10. Fix failure to duplicate the set in bms_join() when a == NULL and
also fix the b == NULL case too
11. Fix hazard in bms_add_members(), bms_int_members(),
bms_del_members() and bms_join(), where the code added in 7d58f2342
could crash if a == b due to the REALLOCATE_BITMAPSETS code pfreeing
'a'. This happens in knapsack.c:93, although I propose we adjust that
code now that empty sets are always represented as NULL.

David

> According to this criterion, it seems to me that we should also call
> bms_copy_and_free() in bms_join(), since both inputs there might be
> recycled. But maybe I'm not understanding it correctly.
>
>>
>> > * Do you think we can add Assert(bms_is_valid_set(a)) for bms_free()?
>>
>> I did briefly have the Assert in bms_free(), but I removed it as I
>> didn't think it was that useful to validate the set before freeing it.
>> Certainly, there'd be an argument to do that, but I ended up not
>> putting it there. I wondered if there could be a case where we do
>> something like bms_int_members() which results in an empty set and
>> after checking for and finding the set is now empty, we call
>> bms_free(). If we did that then we'd Assert fail. In reality, we use
>> pfree() instead of bms_free() as we already know the set is not NULL,
>> so it wouldn't cause a problem for that particular case. I wondered if
>> there might be another one that I didn't spot, so felt it was best not
>> to Assert there.
>>
>> I imagine that if we found some case where the bms_free() Assert
>> failed, we'd likely just remove it rather than trying to make the set
>> valid before freeing it. So it seems pretty pointless if we'd opt to
>> remove the Assert() at the first sign of trouble.
>
>
> I think I understand your concern. In some bitmapset manipulation
> functions, like bms_int_members(), the result might be computed as
> empty. In such cases we need to free the input bitmap. If we use
> bms_free(), the Assert would fail.
>
> It seems to me that this scenario can only occur within the bitmapset
> manipulation functions. Outside of bitmapset.c, I think it should be
> quite safe to call bms_free() with this Assert. So I think it should
> not have problem to have this Assert in bms_free() as long as we are
> careful enough when calling bms_free() inside bitmapset.c.
>
> Thanks
> Richard

Attachment Content-Type Size
v3-0001-Fix-REALLOCATE_BITMAPSETS-code.patch text/plain 20.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-01-16 03:32:22 Re: Add PQsendSyncMessage() to libpq
Previous Message Kyotaro Horiguchi 2024-01-16 02:57:03 Re: Make mesage at end-of-recovery less scary.