Reduce planning time for large NOT IN lists containing NULL

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Reduce planning time for large NOT IN lists containing NULL
Date: 2026-02-18 14:11:56
Message-ID: eaa2598c-5356-4e1e-9ec3-5fd6eb1cd704@tantorlabs.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

In this thread [0] an interesting idea came up about avoiding
unnecessary work during selectivity estimation for x <> ALL (NULL, ...)
or x NOT IN (NULL, ...)

Semantically, if the array contains at least one NULL, the selectivity
of x NOT IN (...) is always 0.0, regardless of the other elements in the
list.

Currently, the planner still iterates over all array elements and
invokes the operator's selectivity estimator for each of them. For large
IN / ALL lists, this increases planning time.

For constant arrays I propose adding a simple pre-check before entering
the per-element loop: detect whether the array contains at least one
NULL element (e.g., via memchr() for the deconstructed array case). If
so, and we are in the ALL / NOT IN case, we can immediately return
selectivity = 0.0 and skip all further computation. This would avoid
extra per-element estimation work while preserving semantics.

In cases where array elements are not known to be constants in advance,
such a pre-check is less straightforward, because each element must
first be inspected to determine whether it is a Const and whether it is
NULL. That already requires iterating through the list, so introducing a
separate early pass would not actually reduce the amount of work.
Therefore, it like makes sense to keep the current behavior in that
situation.

Thoughts?

[0]:
https://www.postgresql.org/message-id/CAHza6qdAH99C0xa27YDTixiNVFa99j90QYquUPAxL0JwahmggA%40mail.gmail.com

--
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/

Attachment Content-Type Size
v1-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch text/x-patch 1.3 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2026-02-18 14:57:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
Previous Message Nazir Bilal Yavuz 2026-02-18 13:38:07 Re: Speed up COPY FROM text/CSV parsing using SIMD