"WHERE col NOT IN" yields falsely empty result.

From: pgsql-bugs(at)postgresql(dot)org
To: pgsql-bugs(at)postgresql(dot)org
Subject: "WHERE col NOT IN" yields falsely empty result.
Date: 2001-06-13 12:35:47
Message-ID: 200106131235.f5DCZlj28167@hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Dr M J Carter (Martin(dot)J(dot)Carter(at)nottingham(dot)ac(dot)uk) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
"WHERE col NOT IN" yields falsely empty result.

Long Description
Consider

SELECT * FROM bar WHERE col2 NOT IN (SELECT col1 FROM foo)

where tables foo and bar overlap but neither is a subset of
the other. Since the difference set bar - foo is nonempty, the above
should yield one or more rows. However, the presence of a null
in foo.col1 (tests 3 and 8 below) yields zero rows, even where
the difference set has rows with no null entries (see test 2).
Am I missing something obvious? eg in one of the FAQs?

Two possible user-level workarounds (using EXCEPT with IN (tests 5
and 9), and using a correlated join (tests 6 and 10)) are shown
below, which work as expected if the null is in the difference
between the tables. A null in the intersection, as in tests 9
and 10, causes its row to appear in the output; on reflection this
is be expected, and can be fixed (tests 11 and 12) once the possibility is recognised. I can't (yet) see how to show null-entry
rows in the difference set without also showing nulls from the
intersection, but I'm an SQL newbie so this time I *know* I'm
missing something. (Disable blather mode.)

The tables need not have the same column types; that just makes testing a touch easier, by changing the columns being examined (as
in test 4). Apologies for the extensiveness of my tests, of which
you say "don't waste your time" in your Web page: I had to convince
myself that I wasn't just being stupid.

Binaries used are from the stock Debian distributions, not recompiled
by me. The older set is on my home box (Debian 2.2, PostgreSQL-6.5.3,
built with gcc 2.95.2), and the newer one at work (Deb 2.x (testing): psql yields "PostgreSQL 7.1 on i686-pc-linux-gnu, compiled by GCC
2.95.4"; libc6 is v2.2.3-5; processor AMD K6/2 .... anything else?).

Sample Code
Feed the following through "psql foo -f foo.sql | less":

---- snip ----
DROP TABLE foo;
CREATE TABLE "foo" (
"ipnum" inet,
"ipname" text);

DROP TABLE bar;
CREATE TABLE "bar" (
"ipnum" inet,
"ipname" text);

COPY "foo" FROM stdin;
192.168.187.10 theory
192.168.187.11 junior
192.168.40.43 granby
192.168.40.48 gotham
192.168.21.16 marian
192.168.20.6 marian
192.168.1.1 jips-gw
\.

COPY "bar" FROM stdin;
192.168.187.10 theory
192.168.187.11 junior
192.168.40.43 granby
192.168.40.48 gotham
192.168.21.16 marian
192.168.20.6 marian
192.168.3.8 real-gw
\.

SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 1: Shows intersection (6 rows).

SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo);
-- 2: Shows expected single row from bar.

INSERT INTO foo (ipnum) VALUES ('192.168.104.42');
-- Add row to foo with null ipname value.

SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo);
-- 3: Yields empty set. Is this a bug? it's certainly unexpected; -- what I'd expected was the same one-line reply as for test 2.

SELECT * FROM bar WHERE ipnum NOT IN (SELECT ipnum FROM foo);
-- 4: Yields expected single row again, by examining other column.

SELECT * FROM bar EXCEPT
SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 5: Yields expected row from bar.

SELECT * FROM bar WHERE NOT EXISTS (
SELECT * FROM foo WHERE foo.ipname = bar.ipname);
-- 6: Inspired by answer 4.23 in general PostgreSQL FAQ: yields -- expected row, as well as being somewhat faster for large tables.

INSERT INTO bar (ipnum) VALUES ('192.168.104.42');
-- Move row to intersection.

SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 7: Shows non-null rows of intersection (6 rows).

SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo);
-- 8: Yields empty set.

SELECT * FROM bar EXCEPT
SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 9: Yields expected row from bar, plus empty-field row from the -- intersection.

SELECT * FROM bar WHERE NOT EXISTS (
SELECT * FROM foo WHERE foo.ipname = bar.ipname);
-- 10: Ditto.

SELECT * FROM bar WHERE ipname IS NOT NULL
EXCEPT
SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 11: Yields expected row from bar only.

SELECT * FROM bar WHERE ipname IS NOT NULL AND NOT EXISTS (
SELECT * FROM foo WHERE foo.ipname = bar.ipname);
-- 12: Ditto.

No file was uploaded with this report

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2001-06-13 13:30:38 Re: coalesce in execute crashes backend
Previous Message Mister X 2001-06-13 06:39:10 ODBC Interface Bug in version 07_01_0005