Re: WIP: BRIN multi-range indexes

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2021-02-11 14:51:31
Message-ID: 1388c876-2d07-c694-4f95-37032d711627@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Attached is an updated version of the patch series, addressing all the
failures on cfbot (at least I hope so). This turned out to be more fun
than I expected, as the issues went unnoticed on 64-bits and only failed
on 32-bits. That's also why I'm not entirely sure this will make cfbot
happy as that seems to be x86_64, but the issues are real so let's see.

1) I already outlined the issue in the previous message:

MAXALIGN(a * b) != MAXALIGN(a) * b

and there's an assert that we used exactly the same amount of memory we
allocated, so this caused a crash. Strange that it'd fail on 32-bits and
not 64-bits, but perhaps there's some math reason for that, or maybe it
was just pure luck.

2) The rest of the issues generally boils down to types that are byval
on 64-bits, but byref on 32-bits. Like int8 or float8 for example. The
first place causing issues were cross-type operators, i.e. the bloom
opclasses did things like this in pg_amop.dat:

{ amopfamily => 'brin/integer_bloom_ops', amoplefttype => 'int2',
amoprighttype => 'int8', amopstrategy => '1',
amopopr => '=(int2,int8)', amopmethod => 'brin' },

so it was possible to do this:

WHERE int8column = 1234::int2

in which case we used the int8 opclass, so the consistent function
thought it's working with int8, and used the hash function defined for
that opclass in pg_amproc. That's hashint8 of course, but we called that
on Datum storing int2. Clearly, dereferencing that pointer is guaranteed
to fail with a segfault.

I think there are two options to fix this. Firstly, we can remove the
cross-type operators, so that the left/right type is always the same.
That'll work fine for most cases, and it's pretty simple. It's also what
the hash_ops opclasses do, so I've done that.

An alternative would be to do something like minmax does for stategies,
and consider the subtype (i.e. type of the right argument). It's trick a
bit tricky, though, because it assumes the hash functions for the two
types are "compatible" and produce the same hash for the same value.
AFAIK that's correct for the usual cases (int2/int4/int8) and it'd be
restricted by pg_amop. But hash_ops don't do that for some reason, so I
wonder what am I missing. (The other thing is where to define these hash
functions - right now pg_amproc only tracks hash function for the "base"
data type, and there may be multiple supported subtypes, so where to
store that? Perhaps we could use the hash function from the default hash
opclass for each type.)

Anyway, I've decided to keep this simple for now, and I've ripped-out
the cross-type operators. We can add that back later, if needed.

3) There were a couple byref failures in the distance functions, which
generally used "double" internally (which I'm not sure is guaranteed to
be 64-bit types) instead of float8, and used plain "return" instead of
PG_RETURN_FLOAT8() in a couple places. Silly mistakes.

4) A particulary funny mistake was in actually calculating the hashes
for bloom filter, which is using hash_uint32_extended (so that we can
seed it). The trouble is that while hash_uint32() returns uint32,
hash_uint32_extended returns ... uint64. So we calculated a hash, but
then used the *pointer* to the uint64 value, not the value. I have to
say, the "uint32" in the function name is somewhat misleading.

This passes all my tests, including valgrind on the 32-bit rpi4 machine,
the stress test (testing both the bloom and multi-minmax opclasses etc.)

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0009-Ignore-correlation-for-new-BRIN-opclasses-20210211.patch text/x-patch 4.2 KB
0008-Define-multi-minmax-oclasses-for-types-with-20210211.patch text/x-patch 26.0 KB
0007-Remove-the-special-batch-mode-use-a-larger--20210211.patch text/x-patch 39.0 KB
0006-Batch-mode-when-building-new-BRIN-multi-min-20210211.patch text/x-patch 12.0 KB
0005-BRIN-minmax-multi-indexes-20210211.patch text/x-patch 224.9 KB
0004-BRIN-bloom-indexes-20210211.patch text/x-patch 127.6 KB
0003-Optimize-allocations-in-bringetbitmap-20210211.patch text/x-patch 4.4 KB
0002-Move-IS-NOT-NULL-handling-from-BRIN-support-20210211.patch text/x-patch 23.3 KB
0001-Pass-all-scan-keys-to-BRIN-consistent-funct-20210211.patch text/x-patch 23.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-02-11 14:52:16 Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)
Previous Message Robert Haas 2021-02-11 14:47:02 Re: [HACKERS] Custom compression methods