Re: [PATCH] Fix hashed ScalarArrayOp semantics for NULL LHS with non-strict comparators

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Chengpeng Yan <chengpeng_yan(at)outlook(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, cca5507 <cca5507(at)qq(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [PATCH] Fix hashed ScalarArrayOp semantics for NULL LHS with non-strict comparators
Date: 2026-04-22 23:33:28
Message-ID: CAApHDvo=bOVURpRH_ydBYT_2R9PU4EvzzFfez1xV4RKBPRUPVA@mail.gmail.com
Lists: pgsql-hackers

On Mon, 20 Apr 2026 at 18:17, Chengpeng Yan <chengpeng_yan(at)outlook(dot)com> wrote:
> It may also be
> possible to cache the NULL-LHS outcome once per expression, since the
> RHS array is constant in the hashed SAOP case, which might help reduce
> the cost of that fallback.

Yeah, it doesn't make sense to repeatedly perform a linear search over
the array to check if NULL matches anything in the array. Let's just
do that once when we build the hash table and reuse that cached value
whenever we see a NULL. We can skip that step with strict functions
since we'll short-circuit earlier.

A patch for that is attached.
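
Separately from the attached patch, here is a toy sketch of the caching
idea. It is not executor code: the types (TriBool, NullableInt,
HashedSaop), the functions (saop_build, saop_lookup, saop_linear) and the
fixed-size table are all made up for illustration, and it assumes the
comparator agrees with plain == on non-NULL inputs. The NULL-LHS result
is computed with one linear scan at build time and then reused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TV_FALSE, TV_TRUE, TV_NULL } TriBool;
typedef struct { bool isnull; int value; } NullableInt;
typedef TriBool (*EqFn)(NullableInt a, NullableInt b);

/* Hypothetical non-strict comparator: NULL matches only NULL. */
static TriBool eq_not_distinct(NullableInt a, NullableInt b)
{
    if (a.isnull || b.isnull)
        return (a.isnull && b.isnull) ? TV_TRUE : TV_FALSE;
    return (a.value == b.value) ? TV_TRUE : TV_FALSE;
}

/* Reference semantics of "lhs IN (arr)": three-valued OR over all elements. */
static TriBool saop_linear(NullableInt lhs, const NullableInt *arr, size_t n, EqFn eq)
{
    TriBool result = TV_FALSE;

    for (size_t i = 0; i < n; i++)
    {
        TriBool r = eq(lhs, arr[i]);

        if (r == TV_TRUE)
            return TV_TRUE;
        if (r == TV_NULL)
            result = TV_NULL;
    }
    return result;
}

#define NSLOTS 64               /* toy fixed-size open-addressing table */

typedef struct
{
    bool    used[NSLOTS];
    int     keys[NSLOTS];
    EqFn    eq;
    bool    has_null_rhs;       /* did the array contain a NULL? */
    TriBool null_lhs_result;    /* cached outcome for a NULL left-hand side */
    size_t  null_scans;         /* how many linear scans were done for NULL */
} HashedSaop;

static void saop_build(HashedSaop *h, const NullableInt *arr, size_t n, EqFn eq)
{
    const NullableInt null_lhs = { .isnull = true, .value = 0 };

    assert(n < NSLOTS);         /* keep at least one empty slot */
    *h = (HashedSaop) {0};
    h->eq = eq;
    for (size_t i = 0; i < n; i++)
    {
        size_t slot;

        if (arr[i].isnull)
        {
            h->has_null_rhs = true;
            continue;
        }
        slot = (unsigned) arr[i].value % NSLOTS;
        while (h->used[slot] && h->keys[slot] != arr[i].value)
            slot = (slot + 1) % NSLOTS;
        h->used[slot] = true;
        h->keys[slot] = arr[i].value;
    }
    /* The array is constant, so evaluate the NULL-LHS case exactly once. */
    h->null_lhs_result = saop_linear(null_lhs, arr, n, eq);
    h->null_scans++;
}

static TriBool saop_lookup(const HashedSaop *h, NullableInt lhs)
{
    const NullableInt null_rhs = { .isnull = true, .value = 0 };

    if (lhs.isnull)
        return h->null_lhs_result;      /* no per-row linear search */
    for (size_t slot = (unsigned) lhs.value % NSLOTS; h->used[slot];
         slot = (slot + 1) % NSLOTS)
        if (h->keys[slot] == lhs.value)
            return TV_TRUE;
    /* Miss: a NULL array element can still affect the result. */
    return h->has_null_rhs ? h->eq(lhs, null_rhs) : TV_FALSE;
}
```

With a strict comparator the cached value would never be consulted,
since a NULL LHS short-circuits before reaching the lookup.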

> ChangAo's example also seems to expose a separate correctness issue. If
> the comparator can return NULL even for non-NULL inputs, then a lookup
> hit seems sufficient, but a miss is no longer enough to distinguish
> FALSE for IN / TRUE for NOT IN from NULL.

IMO it's unrealistic to assume we can do anything sane with an
equality function that always returns NULL.

> A conservative fix there would again be a linear fallback after miss,
> which should recover the right semantics, but that case does seem much
> more performance-sensitive.

I really doubt it's worth troubling over that. If we did want to do
something, then it would be more efficient to probe the hash table
directly after we insert a Datum and verify we can find it again. If
we can't find a value we just inserted, mark the entire table as
broken, then check that flag at lookup time and do a linear search
instead.
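
A rough sketch of that probe-after-insert check, purely as an
illustration and not something from the patch. The names (VerifiedSet,
vs_build, vs_lookup, eq_always_null) and the fixed-size table over plain
ints are hypothetical stand-ins for Datums and the real hash table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TV_FALSE, TV_TRUE, TV_NULL } TriBool;
typedef TriBool (*EqFn)(int a, int b);

static TriBool eq_sane(int a, int b) { return a == b ? TV_TRUE : TV_FALSE; }

/* Hypothetical broken comparator: returns NULL even for equal inputs. */
static TriBool eq_always_null(int a, int b) { (void) a; (void) b; return TV_NULL; }

#define NSLOTS 64               /* toy fixed-size open-addressing table */

typedef struct
{
    bool        used[NSLOTS];
    int         keys[NSLOTS];
    const int  *arr;            /* original array, kept for the fallback */
    size_t      n;
    EqFn        eq;
    bool        broken;         /* set if an inserted key cannot be found */
} VerifiedSet;

/* Probe using the comparator; only a TRUE result counts as a match. */
static bool vs_probe(const VerifiedSet *s, int key)
{
    for (size_t i = (unsigned) key % NSLOTS; s->used[i]; i = (i + 1) % NSLOTS)
        if (s->eq(s->keys[i], key) == TV_TRUE)
            return true;
    return false;
}

static void vs_insert(VerifiedSet *s, int key)
{
    size_t i = (unsigned) key % NSLOTS;

    while (s->used[i])
    {
        if (s->eq(s->keys[i], key) == TV_TRUE)
            return;             /* already present */
        i = (i + 1) % NSLOTS;
    }
    s->used[i] = true;
    s->keys[i] = key;
}

static void vs_build(VerifiedSet *s, const int *arr, size_t n, EqFn eq)
{
    assert(n < NSLOTS);         /* keep at least one empty slot */
    *s = (VerifiedSet) {0};
    s->arr = arr;
    s->n = n;
    s->eq = eq;
    for (size_t i = 0; i < n; i++)
    {
        vs_insert(s, arr[i]);
        /* Verify we can find what we just inserted. */
        if (!vs_probe(s, arr[i]))
            s->broken = true;
    }
}

static TriBool vs_lookup(const VerifiedSet *s, int lhs)
{
    TriBool result = TV_FALSE;

    if (!s->broken)
        return vs_probe(s, lhs) ? TV_TRUE : TV_FALSE;
    /* Table is unreliable: fall back to a linear three-valued scan. */
    for (size_t i = 0; i < s->n; i++)
    {
        TriBool r = s->eq(lhs, s->arr[i]);

        if (r == TV_TRUE)
            return TV_TRUE;
        if (r == TV_NULL)
            result = TV_NULL;
    }
    return result;
}
```

The check costs one extra probe per element at build time and nothing
per lookup beyond testing the flag. It only catches comparators that
fail on the values actually inserted.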

David

Attachment Content-Type Size
v3-0001-Fix-incorrect-logic-for-hashed-IN-NOT-IN-with-non.patch application/octet-stream 20.6 KB
