Quick Links

Re: Query plan for "id IS NULL" on PK

From:	"Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>
To:	pgsql-general(at)lists(dot)postgresql(dot)org
Subject:	Re: Query plan for "id IS NULL" on PK
Date:	2023-02-14 22:39:17
Message-ID:	20230214223917.xlkhp4yspwhsxvgu@hjp.at
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On 2023-02-14 17:04:51 -0500, Ben Chrobot wrote:
> We have a large table (~470 million rows) with integer primary key id (not
> null) on a Postgres 14.5 cluster. A third-party tool is attempting to perform a
> SELECT-based full table copy in preparation for log-based sync with a query
> like the following:
>
> SELECT "id", "other_column_a", "other_column_b", "created_at", "updated_at"
> FROM "public"."my_large_table"
> WHERE (("id" > ? OR "id" IS NULL)) AND (("id" <= ?))
> ORDER BY "id" LIMIT 50000;

That doesn't make sense. ("id" <= ?) implies that ("id" IS NULL) is
FALSE. So the where clause can be simplified to
WHERE (("id" > ? OR FALSE)) AND (("id" <= ?))
and then
WHERE (("id" > ?)) AND (("id" <= ?))
even without the knowledge that "id" is a primary key (and therefore can
never be null).

Even if the column could contain NULL values, those would never be
selected.

It could therefore be argued that the query as written is broken and
should be fixed.

OTOH it could also be argued that the optimizer should be able to
perform the same simplifications as I did above and produce the same
code for WHERE (("id" > ? OR "id" IS NULL)) AND (("id" <= ?))
as for WHERE (("id" > ?)) AND (("id" <= ?)).

In response to

Query plan for "id IS NULL" on PK at 2023-02-14 22:04:51 from Ben Chrobot

Responses

Re: Query plan for "id IS NULL" on PK at 2023-02-15 00:21:05 from David Rowley

Browse pgsql-general by date

	From	Date	Subject
Next Message	Peter J. Holzer	2023-02-14 22:43:04	Re: Query plan for "id IS NULL" on PK
Previous Message	Rob Sargent	2023-02-14 22:36:32	Re: Query plan for "id IS NULL" on PK