Re: Reduce planning time for large NOT IN lists containing NULL

From: David Geier <geidav(dot)pg(at)gmail(dot)com>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Reduce planning time for large NOT IN lists containing NULL
Date: 2026-02-24 08:29:27
Message-ID: 3fc74079-f429-48c8-ad8c-688b21d4d1c1@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>> I think it would be a good idea to add a test, I think there's a
>> regression with this patch:
>>
>> CREATE TABLE notin_test AS SELECT generate_series(1, 1000) AS x;
>> ANALYZE notin_test;
>>
>> CREATE FUNCTION replace_elem(arr int[], idx int, val int)
>> RETURNS int[] AS $$
>> BEGIN
>>      arr[idx] := val;
>>      RETURN arr;
>> END;
>> $$ LANGUAGE plpgsql IMMUTABLE;
>>
>> EXPLAIN SELECT * FROM notin_test WHERE x <> ALL(ARRAY[1,99,3]);
>> -- same array, constructed from an array with a NULL
>> EXPLAIN SELECT * FROM notin_test WHERE x <>
>> ALL(replace_elem(ARRAY[1,NULL,3], 2, 99));
>> DROP TABLE notin_test;
>> DROP FUNCTION replace_elem;

Good catch. The macro name is misleading here. It should have been
called ARR_HASNULLBITMAP().

+1 on adding an explicit test that says why we care about that case.

>> ARR_HASNULL probably should be array_contains_nulls, as ARR_HASNULL
>> simply checks for the existence of a NULL bitmap.

Using array_contains_nulls() seems fine. In case the IN list doesn't
contain NULL, the function can immediately bail thanks to the
!ARR_HASNULL() check in the beginning.

It only needs to iterate over the NULL-bitmap, if it exists. This is the
case if there's actually a NULL element in the array, or if the array
initially contained NULL and all NULLs got removed subsequently.

If we ever find the latter case to matter we could remove the
NULL-bitmap in array_set_element() / array_set_element_expanded(), when
the last NULL element got removed.

> Could you clarify what exactly this additional test meant to verify?

Zsolt's test case creates an array that initially contains NULL. The
NULL element is subsequently replaced by a non-NULL value but
array_set_element_expanded() keeps the NULL-bitmap around. With that,
your ARR_ISNULL() check bails and causes the selectivity estimation to
incorrectly return 0.

> I attached this thread to commitfest: https://commitfest.postgresql.org/
> patch/6519/

I'll assign myself as reviewer.

--
David Geier

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2026-02-24 08:39:27 Re: Add errdetail() with PID and UID about source of termination signal
Previous Message Ilia Evdokimov 2026-02-24 08:15:54 Re: Reduce planning time for large NOT IN lists containing NULL