| From: | Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com> |
| Subject: | Question about hashed ScalarArrayOpExpr equality semantics |
| Date: | 2026-05-11 04:34:38 |
| Message-ID: | CAJTYsWWGsHB+BOuDztdycXF_3641usoBAkCfUJR1k9=DD599ug@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
While looking at the hashed ScalarArrayOpExpr path, I noticed that the
linear
and hashed evaluation paths can give different answers if the comparison
function returns NULL for non-NULL inputs.
The part I am trying to understand is what assumption the hashed path is
allowed to make about equality operators used for hashing. Does declaring
an
equality operator as HASHES / putting it in a hash opclass imply that, for
non-NULL inputs, the equality result is a definite true or false value?
Reproduced on master.
Setup:
CREATE FUNCTION weird_strict_int_eq(int, int)
RETURNS bool
LANGUAGE sql IMMUTABLE STRICT
AS $$
SELECT CASE
WHEN $1 = 42 AND $2 = 42 THEN NULL
ELSE $1 = $2
END
$$;
CREATE FUNCTION weird_strict_int_ne(int, int)
RETURNS bool
LANGUAGE sql IMMUTABLE STRICT
AS $$
SELECT NOT weird_strict_int_eq($1, $2)
$$;
CREATE OPERATOR === (
LEFTARG = int,
RIGHTARG = int,
PROCEDURE = weird_strict_int_eq,
COMMUTATOR = ===,
NEGATOR = !==,
HASHES
);
CREATE OPERATOR !== (
LEFTARG = int,
RIGHTARG = int,
PROCEDURE = weird_strict_int_ne,
COMMUTATOR = !==,
NEGATOR = ===
);
CREATE OPERATOR CLASS weird_strict_int_hash_ops
FOR TYPE int USING hash AS
OPERATOR 1 ===,
FUNCTION 1 hashint4(int);
The hashed ScalarArrayOpExpr path activates once the constant array has at
least MIN_ARRAY_SIZE_FOR_HASHED_SAOP (currently 9) elements, so the
8-element
variant uses the linear evaluation and the 9-element variant uses the hash
table. I wrap the left-hand side in a non-immutable function call so the
expression is not constant-folded at plan time.
CREATE FUNCTION return_int_input(int) RETURNS int
LANGUAGE sql STABLE AS $$ SELECT $1 $$;
SELECT
return_int_input(42) === ANY (ARRAY[1,2,3,4,5,6,7,42]) AS linear_any,
return_int_input(42) === ANY (ARRAY[1,2,3,4,5,6,7,8,42]) AS hashed_any,
return_int_input(42) !== ALL (ARRAY[1,2,3,4,5,6,7,42]) AS linear_not_in,
return_int_input(42) !== ALL (ARRAY[1,2,3,4,5,6,7,8,42]) AS hashed_not_in;
The linear path preserves the NULL (UNKNOWN) result from the comparison.
The
hashed path seems to treat a NULL from the comparison as a non-match,
producing
a different result.
I realize this is a strange equality operator, and that may be the whole
point.
For a hash table lookup, it seems reasonable to need a definite answer to
"does
this stored key match the lookup key?". On the other hand,
ScalarArrayOpExpr
itself has SQL three-valued semantics, and the linear path does preserve the
comparison's NULL result.
So my question is: is the hashed path allowed to assume that hashable
equality
operators never return NULL for non-NULL inputs, or should it preserve the
same
UNKNOWN result that the linear ScalarArrayOpExpr evaluation would produce?
Regards,
Ayush
| From | Date | Subject | |
|---|---|---|---|
| Next Message | John Naylor | 2026-05-11 04:46:06 | Re: Review - Patch for pg_bsd_indent: improve formatting of multiline comments |
| Previous Message | Amit Kapila | 2026-05-11 04:25:27 | Re: DOCS - CREATE PUBLICATION ... EXCEPT has no mention on what happens after drop/create tables |