Group By, NULL values and inconsistent behaviour.

From: darrenk(at)insightdist(dot)com (Darren King)
To: pgsql-hackers(at)postgresql(dot)org
Subject: Group By, NULL values and inconsistent behaviour.
Date: 1998-01-17 21:27:26
Message-ID: 9801172127.AA42656@ceodev
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


While looking thru the nodeGroup code, I noticed the following
that I'm not sure is correct.

-- Using 01-09 snapshot
create table t1 (a int4, b char(2), c char(2));
CREATE
insert into t1 (a,c) values (1,'x');
INSERT 149419 1
insert into t1 (a,c) values (2,'x');
INSERT 149420 1
insert into t1 (a,c) values (3,'z');
INSERT 149421 1
insert into t1 (a,c) values (2,'x');
INSERT 149422 1
select * from t1;
a|b|c
-+-+--
1| |x
2| |x
3| |z
2| |x
(4 rows)

select b,c,sum(a) from t1 group by b,c;
b|c |sum
-+--+---
|x | 3
|z | 3
|x | 2
(3 rows)

select b,c,sum(a) from t1 group by b,c order by c;
b|c |sum
-+--+---
|x | 3
|x | 2
|z | 3
(3 rows)

In the second query, the first two rows have been grouped, but shouldn't
they not be since b is NULL? I thought that NULL != NULL?

If so, is the third query wrong? The first two rows are different, but
only because of the aggregated column that is the source of the group by.
According to the logic from the second query, these should have been
grouped, no?

What does the standard say about comparing two NULL values?

The fixes for these inconsistencies appear to be simple. To cause a new
group to be started if NULL != NULL, simply change the "continue;" in the
sameGroup function in nodeGroup.c to "return FALSE;" Ignoring aggregated
columns would also then be added to sameGroup().

darrenk

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message The Hermit Hacker 1998-01-17 23:06:21 Re: [QUESTIONS] Business cases
Previous Message Bruce Momjian 1998-01-17 16:00:25 Re: [HACKERS] subselects coding started