| From: | Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com> |
|---|---|
| To: | David Geier <geidav(dot)pg(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Reduce planning time for large NOT IN lists containing NULL |
| Date: | 2026-02-20 09:06:55 |
| Message-ID: | 7791ec74-369d-48c7-969c-cf4ad18024f2@tantorlabs.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi David,
Thanks for review
On 2/19/26 18:38, David Geier wrote:
> +1 on the general idea.
>
>> For constant arrays I propose adding a simple pre-check before entering
>> the per-element loop: detect whether the array contains at least one
>> NULL element (e.g., via memchr() for the deconstructed array case). If
>> so, and we are in the ALL / NOT IN case, we can immediately return
>> selectivity = 0.0 and skip all further computation. This would avoid
>> extra per-element estimation work while preserving semantics.
> How much overhead does the memchr() call add? It seems like this
> approach optimizes the edge case at the expense of the common case,
> which doesn't seem like a good trade-off.
>
> How about instead adding a flag to ArrayType which indicates if the
> array contains NULL or not. This flag could be set in
> construct_md_array() which already iterates over all elements. The flag
> would need to be kept up-to-date, e.g. in array_set_element() and
> possibly other functions modifying the array.
It seems we might reinventing the wheel.
There is already ARR_HASNULL() which allows us to detect the presence of
NULL in ArrayType.
>> In cases where array elements are not known to be constants in advance,
>> such a pre-check is less straightforward, because each element must
>> first be inspected to determine whether it is a Const and whether it is
>> NULL. That already requires iterating through the list, so introducing a
>> separate early pass would not actually reduce the amount of work.
>> Therefore, it like makes sense to keep the current behavior in that
>> situation.
> Agreed.
After thinking about this more, is seems reasonable to short-circuit еру
loop when we detect a NULL element by checking whether the element is a
Const and NULL.
I've attached v2 patch.
--
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch | text/x-patch | 1.7 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | vignesh C | 2026-02-20 09:08:03 | Re: Skipping schema changes in publication |
| Previous Message | Maxim Orlov | 2026-02-20 08:47:12 | Re: POC: make mxidoff 64 bits |