Re: Making empty Bitmapsets always be NULL

From: Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Making empty Bitmapsets always be NULL
Date: 2023-06-15 08:56:54
Message-ID: CAJ2pMkZ0HfhfA0QNa_msknL=_-PavZmPHRWnW7yOb3_PWUoB+g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

On Tue, Jun 13, 2023 at 8:07 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I've incorporated fixes for the bugs in the attached patch. I didn't
> quite use the same approach as you did. I did the fix for 0003
> slightly differently and added two separate paths. We've no need to
> track the last non-zero word when 'a' has more words than 'b' since we
> can't truncate any zero-words off for that case. Not having to do
> that makes the do/while loop pretty tight.

I really appreciate your quick response and incorporating those fixes
into your patch. The fix for 0003 looks good to me. I believe your
change improves performance more.

> For the fix in the 0004 patch, I think we can do what you did more
> simply. I don't think there's any need to perform the loop to find
> the last non-zero word. We're only deleting a member from a single
> word here, so we only need to check if that word is the last word and
> remove it if it's become zero. If it's not the last word then we
> can't remove it as there must be some other non-zero word after it.

If my thinking is correct, the do-while loop I added is still
necessary. Consider the following code. The Assertion in this code
passes in the master but fails in the new patch.

=====
Bitmapset *x = bms_make_singleton(1000);

x = bms_del_member(x, 1000);
Assert(x == NULL);
=====

In the code above, we get a new Bitmapset by bms_make_singleton(1000).
This Bitmapset has many words. Only the last word is non-zero, and all
the rest are zero. If we call bms_del_member(x, 1000) for the
Bitmapset, all words of the result will be zero, including the last
word, so we must return NULL. However, the new patch truncates only
the last word, leading to an incorrect result. Therefore, we need to
perform the loop to find the actual non-zero word after the deletion.
Of course, I agree that if we are not modifying the last word, we
don't have to truncate anything, so we can omit the loop.

> I also made a small adjustment to bms_get_singleton_member() and
> bms_singleton_member() to have them Assert fail if result is < 0 after
> looping over the set. This should no longer happen so I thought it
> would make more compact code if that check was just removed. We'd
> likely do better if we got reports of Assert failures here than, in
> the case of bms_get_singleton_member, some code accidentally doing the
> wrong thing based on a corrupt Bitmapset.

I agree with your change. I think failing by Assertion is better than
a runtime error or unexpected behavior.

--
Best regards,
Yuya Watari

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2023-06-15 09:04:20 Re: Let's make PostgreSQL multi-threaded
Previous Message Hannu Krosing 2023-06-15 08:50:31 Re: Let's make PostgreSQL multi-threaded