Question about hashed ScalarArrayOpExpr equality semantics

From: Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Question about hashed ScalarArrayOpExpr equality semantics
Date: 2026-05-11 04:34:38
Message-ID: CAJTYsWWGsHB+BOuDztdycXF_3641usoBAkCfUJR1k9=DD599ug@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

While looking at the hashed ScalarArrayOpExpr path, I noticed that the
linear
and hashed evaluation paths can give different answers if the comparison
function returns NULL for non-NULL inputs.

The part I am trying to understand is what assumption the hashed path is
allowed to make about equality operators used for hashing. Does declaring
an
equality operator as HASHES / putting it in a hash opclass imply that, for
non-NULL inputs, the equality result is a definite true or false value?

Reproduced on master.

Setup:

CREATE FUNCTION weird_strict_int_eq(int, int)
RETURNS bool
LANGUAGE sql IMMUTABLE STRICT
AS $$
SELECT CASE
WHEN $1 = 42 AND $2 = 42 THEN NULL
ELSE $1 = $2
END
$$;

CREATE FUNCTION weird_strict_int_ne(int, int)
RETURNS bool
LANGUAGE sql IMMUTABLE STRICT
AS $$
SELECT NOT weird_strict_int_eq($1, $2)
$$;

CREATE OPERATOR === (
LEFTARG = int,
RIGHTARG = int,
PROCEDURE = weird_strict_int_eq,
COMMUTATOR = ===,
NEGATOR = !==,
HASHES
);

CREATE OPERATOR !== (
LEFTARG = int,
RIGHTARG = int,
PROCEDURE = weird_strict_int_ne,
COMMUTATOR = !==,
NEGATOR = ===
);

CREATE OPERATOR CLASS weird_strict_int_hash_ops
FOR TYPE int USING hash AS
OPERATOR 1 ===,
FUNCTION 1 hashint4(int);

The hashed ScalarArrayOpExpr path activates once the constant array has at
least MIN_ARRAY_SIZE_FOR_HASHED_SAOP (currently 9) elements, so the
8-element
variant uses the linear evaluation and the 9-element variant uses the hash
table. I wrap the left-hand side in a non-immutable function call so the
expression is not constant-folded at plan time.

CREATE FUNCTION return_int_input(int) RETURNS int
LANGUAGE sql STABLE AS $$ SELECT $1 $$;

SELECT
return_int_input(42) === ANY (ARRAY[1,2,3,4,5,6,7,42]) AS linear_any,
return_int_input(42) === ANY (ARRAY[1,2,3,4,5,6,7,8,42]) AS hashed_any,
return_int_input(42) !== ALL (ARRAY[1,2,3,4,5,6,7,42]) AS linear_not_in,
return_int_input(42) !== ALL (ARRAY[1,2,3,4,5,6,7,8,42]) AS hashed_not_in;

The linear path preserves the NULL (UNKNOWN) result from the comparison.
The
hashed path seems to treat a NULL from the comparison as a non-match,
producing
a different result.

I realize this is a strange equality operator, and that may be the whole
point.
For a hash table lookup, it seems reasonable to need a definite answer to
"does
this stored key match the lookup key?". On the other hand,
ScalarArrayOpExpr
itself has SQL three-valued semantics, and the linear path does preserve the
comparison's NULL result.

So my question is: is the hashed path allowed to assume that hashable
equality
operators never return NULL for non-NULL inputs, or should it preserve the
same
UNKNOWN result that the linear ScalarArrayOpExpr evaluation would produce?

Regards,
Ayush

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2026-05-11 04:46:06 Re: Review - Patch for pg_bsd_indent: improve formatting of multiline comments
Previous Message Amit Kapila 2026-05-11 04:25:27 Re: DOCS - CREATE PUBLICATION ... EXCEPT has no mention on what happens after drop/create tables