From: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Bug in brin_minmax_multi_distance_numeric() |
Date: | 2025-08-05 17:24:07 |
Message-ID: | d48b2a78-7ce0-46d8-b096-2ff2331aa819@eisentraut.org |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 01.08.25 19:17, Tomas Vondra wrote:
>
>
> On 7/31/25 20:35, Tom Lane wrote:
>> Tomas Vondra <tomas(at)vondra(dot)me> writes:
>>> On 7/31/25 19:33, Tom Lane wrote:
>>>> ... It is certainly broken on
>>>> 32-bit machines where the Datum result of numeric_float8 will
>>>> be a pointer, so that we will convert the numeric pointer value
>>>> to a float and return that, yielding a totally-garbage distance
>>>> value. But I think it's broken on 64-bit machines too, because
>>>> we'll be interpreting the output of numeric_float8 as a uintptr_t
>>>> and applying some unwanted conversion to make that a float.
>>
>>> Agreed it's a bug on 32-bit machines. Not sure about 64-bits.
>>
>> Yeah, I'm not 100% sure about that. It's certainly doing something
>> unexpected, but we might accidentally end up with relatively-sane
>> relative distance comparisons anyway. (I assume the outputs will
>> only be compared to other outputs of the same function, right?)
>
>
> Yes. The index accumulates values, sorts them, calculates distance
> between the points, and them "merges" the closest ones. So it only
> compares results of the same function.
>
>> I have a vague recollection that the IEEE float format was chosen with
>> an eye to making comparisons cheap, ie not too much different from
>> integer comparisons. So the sort order might be about the same
>> even after incorrectly reinterpreting the bit-pattern as an int.
>> NaNs probably mess that up, but they would anyway.
>>
>>> Actually, no - it should not cause "broken" indexes, as in "giving
>>> incorrect results". The distance functions determine in what order we
>>> merge points into ranges, and if the distances are bogus then we can
>>> build a summary that is less efficient.
>>
>> Got it. So it might be worth reindexing such indexes after the
>> fix, but it's not strictly necessary.
Do we want to make a separate commit for this issue that can be
backpatched and have some user-facing information attached to it?
From | Date | Subject | |
---|---|---|---|
Next Message | Nikhil Kumar Veldanda | 2025-08-05 17:26:03 | Re: Dead code with short varlenas in toast_save_datum() |
Previous Message | Peter Eisentraut | 2025-08-05 17:20:20 | pgaio_io_get_id() type (was Re: Datum as struct) |