Re: Missing update of all_hasnulls in BRIN opclasses

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Missing update of all_hasnulls in BRIN opclasses
Date: 2023-05-19 01:04:13
Message-ID: 891550a8-4118-2a25-d644-bf82d9c83ed7@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5/18/23 20:45, Tomas Vondra wrote:
> ...
>
> 0001 fixes the issue. 0002 is the original fix, and 0003 is just the
> pageinspect changes (for master only).
>
> For the backbranches, I thought about making the code more like master
> (by moving some of the handling from opclasses to brin.c), but decided
> not to. It'd be low-risk, but it feels wrong to kinda do what the master
> does under "oi_regular_nulls" flag.
>

I've now pushed all these patches into relevant branches, after some
minor last-minute tweaks, and so far it didn't cause any buildfarm
issues. Assuming this fully fixes the NULL-handling for BRIN, this
leaves just the deadlock issue discussed in [1].

It seems rather unfortunate all these issues went unnoticed / unreported
essentially since BRIN was introduced in 9.5. To some extent it might be
explained by fairly low likelihood of actually hitting the issue (just
the right timing, concurrency with summarization, NULL values, ...). It
took me quite a bit of time and luck to (accidentally) hit these issues
while stress testing the code.

But there's also the problem of writing tests for this kind of thing. To
exercise the interesting parts (e.g. the union_tuples), it's necessary
to coordinate the order of concurrent steps - but what's a good generic
way to do that (which we could do in TAP tests)? In manual testing it's
doable by setting breakpoints on a particular lines, and step through
the concurrent processes that way.

But that doesn't seem like a particularly great solution for regression
tests. I can imagine adding some sort of "probes" into the code and then
attaching breakpoints to those, but surely we're not the first project
needing this ...

regards

[1]
https://www.postgresql.org/message-id/261e68bc-f5f5-5234-fb2c-af4f583513c0@enterprisedb.com

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message tender wang 2023-05-19 01:42:35 ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3
Previous Message Tom Lane 2023-05-18 23:57:14 Re: The documentation for READ COMMITTED may be incomplete or wrong