Re: Missing update of all_hasnulls in BRIN opclasses

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Missing update of all_hasnulls in BRIN opclasses
Date: 2023-01-08 23:34:18
Message-ID: a18e4ec1-86d7-8250-90ee-3337a382c0bb@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks Justin! I've applied all the fixes you proposed, and (hopefully)
improved a couple other comments.

I've been working on this over the past couple days, trying to polish
and commit it over the weekend - both into master and backbranches.
Sadly, the backpatching part turned out to be a bit more complicated
than I expected, because of the BRIN reworks in PG14 (done by me, as
foundation for the new opclasses, so ... well).

Anyway, I got it done, but it's a bit uglier than I hoped for and I
don't feel like pushing this on Sunday midnight. I think it's correct,
but maybe another pass to polish it a bit more is better.

So here are two patches - one for 11-13, the other for 14-master.

There's also a separate patch with pageinspect tests, but only as a
demonstration of various (non)broken cases, not for commit. And then
also a bash script generating indexes with random data, randomized
summarization etc. - on unpatched systems this happens to fail in about
1/3 of the runs (at least for me). I haven't seen any failures with the
patches attached (on any branch).

As for the issue / fix, I don't think there's a better solution than
what the patch does - we need to distinguish empty / all-nulls ranges,
but we can't add a flag because of on-disk format / ABI. So using the
existing flags seems like the only option - I haven't heard any other
ideas so far, and I couldn't come up with any myself either.

I've also thought about alternative "encodings" into allnulls/hasnulls,
instead of treating (true,true) as "empty" - but none of that ended up
being any simpler, quite the opposite actually, as it would change what
the individual flags mean etc. So AFAICS this is the best / least
disruptive option.

I went over all the places touching these flags, to double check if any
of those needs some tweaks (similar to union_tuples, which I missed for
a long time). But I haven't found anything else, so I think this version
of the patches is complete.

As for assessing how many indexes are affected - in principle, any index
on columns with NULLs may be broken. But it only matters if the index is
used for IS NULL queries, other queries are not affected.

I also realized that this only affects insertion of individual tuples
into existing all-null summaries, not "bulk" summarization that sees all
values at once. This happens because in this case add_values_to_range
sets hasnulls=true for the first (NULL) value, and then calls the
addValue procedure for the second (non-NULL) one, which resets the
allnulls flag to false.

But when inserting individual rows, we first set hasnulls=true, but
brin_form_tuple ignores that because of allnulls=true. And then when
inserting the second row, we start with hasnulls=false again, and the
opclass quietly resets the allnulls flag.

I guess this further reduces the number of broken indexes, especially
for data sets with small null_frac, or for append-only (or -mostly)
tables where most of the summarization is bulk.

I still feel a bit uneasy about this, but I think the patch is solid.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-extra-tests.patch text/x-patch 10.2 KB
0001-Fix-handling-of-NULLs-in-BRIN-indexes-14-master.patch text/x-patch 11.5 KB
0001-Fix-handling-of-NULLs-in-BRIN-indexes-11-13.patch text/x-patch 12.8 KB
stress-test.sh application/x-shellscript 5.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-01-08 23:53:09 Re: Fixing a couple of buglets in how VACUUM sets visibility map bits
Previous Message Peter Geoghegan 2023-01-08 22:39:19 Re: Fixing a couple of buglets in how VACUUM sets visibility map bits