Bug in GiST paring heap comparator

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Bug in GiST paring heap comparator
Date: 2019-09-02 04:54:11
Message-ID: CAPpHfdsNvNdA0DBS+wMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

Andrey Borodin noticed me that results of some KNN-GIST tests are
obviously wrong and don't match results of sort node.

SELECT * FROM point_tbl ORDER BY f1 <-> '0,1';
f1
-------------------
(10,10)
(NaN,NaN)
(0,0)
(1e-300,-1e-300)
(-3,4)
(-10,0)
(-5,-12)
(5.1,34.5)

(1e+300,Infinity)
(10 rows)

It appears to be related to implementation of comparison function in
pairing heap used as priority queue for KNN. It used plain float
comparison, which doesn't handle Inf and Nan values well. Attached
patch replaced it with float8_cmp_internal().

Also, note that with patch KNN results still don't fully match results
of sort node.

SELECT * FROM point_tbl ORDER BY f1 <-> '0,1';
f1
-------------------
(0,0)
(1e-300,-1e-300)
(-3,4)
(-10,0)
(10,10)
(-5,-12)
(5.1,34.5)
(1e+300,Infinity)

(NaN,NaN)
(10 rows)

NULL and '(NaN,NaN)' are swapped. It happens so, because we assume
distance to NULL to be Inf, while float8_cmp_internal() assumes NaN to
be greater than NULL. If even we would assume distance to NULL to be
Nan, it doesn't guarantee that NULLs would be last. It looks like we
can handle this only by introduction array of "distance is null" flags
to GISTSearchItem. But does it worth it?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
gist-pairing-heap-cmp-fix-1.patch application/octet-stream 2.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2019-09-02 04:58:42 Re: moonjelly vs tsearch
Previous Message Amit Langote 2019-09-02 04:50:42 Re: REL_12_STABLE crashing with assertion failure in ExtractReplicaIdentity