| From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
|---|---|
| To: | David Geier <geidav(dot)pg(at)gmail(dot)com> |
| Cc: | Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Convert NOT IN sublinks to anti-joins when safe |
| Date: | 2026-02-05 06:09:17 |
| Message-ID: | CAMbWs49nvNcBaUXTw5_euodb7ONADwDULJ4Cxw5qurDXdurc+Q@mail.gmail.com |
| Lists: | pgsql-hackers |
On Wed, Feb 4, 2026 at 11:59 PM David Geier <geidav(dot)pg(at)gmail(dot)com> wrote:
> If the sub-select can yield NULLs, the rewrite can be fixed by adding an
> OR t2.c1 IS NULL clause, such as:
>
> SELECT t1.c1 FROM t1 WHERE
> NOT EXISTS (SELECT 1 FROM t2 WHERE t1.c1 = t2.c1 OR t2.c1 IS NULL)
I'm not sure this rewrite results in a better plan. The OR clause is
neither hashable nor mergejoinable, so it would force a nested loop
join, which could be much slower than a hashed-subplan plan.
> If the outer expression can yield NULLs, the rewrite can be fixed by
> adding a t1.c1 IS NOT NULL clause, such as:
>
> SELECT t1.c1 FROM T1 WHERE
> t1.c1 IS NOT NULL AND
> NOT EXISTS (SELECT 1 FROM t2 WHERE t1.c1 = t2.c1)
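This rewrite doesn't seem correct to me. If t2 is empty, NOT IN is
true for every row of t1, including rows where t1.c1 is NULL, so the
added t1.c1 IS NOT NULL clause would incorrectly drop the NULL rows
from t1 in the final result.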
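The empty-t2 case can be checked with a small script. This is only a sketch: it assumes SQLite (through Python's sqlite3 module) as a stand-in for PostgreSQL, on the basis that both follow the SQL standard's three-valued logic for NOT IN, and the table contents are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; only the NOT IN semantics
# are being illustrated, not the planner's behavior.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (c1 INTEGER);
    CREATE TABLE t2 (c1 INTEGER);      -- deliberately left empty
    INSERT INTO t1 VALUES (1), (NULL); -- hypothetical sample rows
""")

# Original query: NOT IN against an empty sub-select.
not_in_rows = conn.execute(
    "SELECT c1 FROM t1 WHERE c1 NOT IN (SELECT c1 FROM t2)"
).fetchall()

# Proposed rewrite: IS NOT NULL filter plus NOT EXISTS.
rewrite_rows = conn.execute("""
    SELECT t1.c1 FROM t1
    WHERE t1.c1 IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM t2 WHERE t1.c1 = t2.c1)
""").fetchall()

print("NOT IN :", not_in_rows)
print("rewrite:", rewrite_rows)
```

The NOT IN query returns both rows of t1, including the NULL one, while the rewrite returns only the row with c1 = 1, so the two are not equivalent when t2 is empty.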
> What's our today's take on doing more involved transformations inside
> the planner to support such cases? It would greatly open up the scope of
> the optimization.
As mentioned in my initial email, the goal of this patch is not to
handle every possible case, but rather only to handle the basic form
where both sides of NOT IN are provably non-nullable. This keeps the
code complexity to a minimum, and I believe this would cover the most
common use cases in the real world.
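For the basic form described above, a quick semantic check might look like the following. It is a sketch under the same assumption that SQLite can stand in for PostgreSQL's NOT IN semantics; the NOT NULL constraints and sample values are hypothetical, and the script says nothing about how the patch proves non-nullability or which plan is chosen.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; both columns are declared
# NOT NULL so neither side of NOT IN can yield NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (c1 INTEGER NOT NULL);
    CREATE TABLE t2 (c1 INTEGER NOT NULL);
    INSERT INTO t1 VALUES (1), (2), (3);  -- hypothetical sample rows
    INSERT INTO t2 VALUES (2);
""")

# NOT IN form, as a user would write it.
not_in_safe = conn.execute(
    "SELECT c1 FROM t1 WHERE c1 NOT IN (SELECT c1 FROM t2) ORDER BY c1"
).fetchall()

# Anti-join form, written as NOT EXISTS.
anti_join_safe = conn.execute("""
    SELECT t1.c1 FROM t1
    WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t1.c1 = t2.c1)
    ORDER BY t1.c1
""").fetchall()

print(not_in_safe == anti_join_safe, not_in_safe)
```

With no NULLs on either side, both queries return the same rows, which is the property that makes the conversion safe in this restricted case.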
- Richard