Re: bug in query planning?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: DeJuan Jackson <djackson(at)speedfc(dot)com>
Cc: "Steven D(dot)Arnold" <stevena(at)neosynapse(dot)net>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: bug in query planning?
Date: 2003-12-24 05:45:11
Message-ID: 12673.1072244711@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

DeJuan Jackson <djackson(at)speedfc(dot)com> writes:
> Query 1:
> SELECT COUNT(message_id)
> FROM messages m
> LEFT JOIN accounts a
> ON m.account_id::bigint = a.account_id::bigint
> WHERE a.email = 'stevena(at)neosynapse(dot)net';

> Query 2:
> SELECT COUNT(message_id)
> FROM accounts a
> LEFT JOIN messages m
> ON a.account_id::bigint = m.account_id::bigint
> WHERE a.email = 'stevena(at)neosynapse(dot)net';

> Query 3:
> SELECT COUNT(message_id)
> FROM messages m, accounts a
> WHERE m.account_id::bigint = a.account_id::bigint
> AND a.email = 'stevena(at)neosynapse(dot)net';

> From what I can see they are not the same query and therefore shouldn't
> use the same plan.

Actually, queries 1 and 3 are equivalent, and I believe PG 7.4 will
recognize them as such. The reason is that the WHERE clause "a.email =
'something'" cannot succeed when a.email is NULL; therefore, there is no
point in the JOIN being a LEFT JOIN --- any null-extended rows added by
the left join will be thrown away again by the WHERE clause. We may as
well reduce the LEFT JOIN to a plain inner JOIN, whereupon query 1 is
obviously the same as query 3. PG 7.4's optimizer can make exactly this
sequence of deductions. The bit of knowledge it needs for this is that
the '=' operator involved is STRICT, ie, yields NULL for NULL input.
All the standard '=' operators are strict and are so marked in the
catalogs. (If you are defining a user-defined type, don't forget to
mark your operators strict where applicable.)

I believe that query 2 is really equivalent to the others as well, but
proving it is more subtle. The reason is that COUNT(message_id) does
not count rows where message_id is NULL, and so any null-extended rows
added by the LEFT JOIN won't be counted, and so we might as well reduce
the LEFT JOIN to a plain inner JOIN. PG's optimizer will not recognize
this, however. Possibly it could if anyone wanted to figure out how.
Right now we make very few assumptions about the behavior of aggregate
functions, but I think you could prove that this is safe based on the
behavior of nodeAgg.c for strict transition functions. Next question
is whether the case would come up often enough to be worth testing
for ...

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Karam Chand 2003-12-24 06:13:47 Where do I get Windows libpq and header files?
Previous Message Ezra Epstein 2003-12-24 05:09:02 Re: Triggers for FK on Views - can they be made deferrable?