Re: [SQL] Oddities with NULL and GROUP BY

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: secret <secret(at)kearneydev(dot)com>
Cc: "[Jos_] Soares" <jose(at)sferacarta(dot)com>, "Jackson, DeJuan" <djackson(at)cpsgroup(dot)com>, PG-SQL <pgsql-sql(at)postgreSQL(dot)org>
Subject: Re: [SQL] Oddities with NULL and GROUP BY
Date: 1999-07-07 18:23:55
Message-ID: 199907071823.OAA05656@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

Looks like this is fixed in 6.5 too.

a|b|sum
-+-+---
1|1| 3
2| | 3
(2 rows)

[Charset iso-8859-1 unsupported, filtering to ASCII...]
> Jos_ Soares wrote:
>
> > secret ha scritto:
> >
> >> "Jackson, DeJuan" wrote:
> >>
> >> > The behavior is valid, if you define NULL as meaning undefined.
> >> > In other words when you define something as NULL you're saying, "I
> >> don't
> >> > know what it is. It could be equal or not."
> >> > -DEJ
> >> >
> >> > > -----Original Message-----
> >> > > From: secret [SMTP:secret(at)kearneydev(dot)com]
> >> > > Sent: Friday, May 14, 1999 11:58 AM
> >> > > To: PG-SQL
> >> > > Subject: [SQL] Oddities with NULL and GROUP BY
> >> > >
> >> > > Maybe there is something I don't know about how GROUP BY
> >> should
> >> > > work, but if I have a table like:
> >> > > a,b,c
> >> > > 1,1,1
> >> > > 1,1,2
> >> > > 1,1,3
> >> > > 1,2,1
> >> > > 1,3,1
> >> > >
> >> > > And I say SELECT a,b,sum(c) FROm .. GROUP BY a,b I get
> >> > > 1,1,6
> >> > > 1,2,1
> >> > > 1,3,1
> >> > >
> >> > > So whenever a or b changes we get a new summed row, well if I
> >> have rows
> >> > > where a or b are null, this doesn't happen, infact I seem to get
> >> all
> >> > > those rows individually... Like if:
> >> > > 1,1,1
> >> > > 1,1,3
> >> > > 1,NULL,10
> >> > > 1,NULL,20
> >> > > 1,2,3
> >> > >
> >> > > I get:
> >> > > 1,1,4
> >> > > 1,NULL,10
> >> > > 1,NULL,20
> >> > > 1,2,3
> >> > >
> >> > > Shouldn't I get 1,NULL,30? Ie shouldn't NULL be treated like
> >> any other
> >> > > value? Or is there some bit of information I'm missing? I can
> >> set
> >> > > everything from NULL to 0 if need be, but I'd rather not...
> >> > >
> >> > > David Secret
> >> > > MIS Director
> >> > > Kearney Development Co., Inc.
> >> > >
> >>
> >> IBM's DB/2 Disagrees, so does Oracle8!
> >>
> >> Here is a cut & paste from Oracle SQL+:
> >>
> >> SQL> select * from z;
> >>
> >> A B
> >> --------- ---------
> >> 1 1
> >> 1 2
> >> 5
> >> 10
> >>
> >> SQL> select a,sum(b) from z group by a;
> >>
> >> A SUM(B)
> >> --------- ---------
> >> 1 3
> >> 15
> >>
> >> SQL>
> >>
> >> I'm going to report this as a bug now that I've verified 2 major
> >> database
> >> vendors perform the task as I would expect them to, and PostgreSQL
> >> does it
> >> very differently. The question is really is NULL=NULL, which I
> >> would say it
> >> should be.
> >
> >
> > I tried it in PostgreSQL 6.5beta1 with the same result:
> >
> > select * from z;
> > a| b
> > -+--
> > 1| 1
> > 1| 2
> > | 5
> > |10
> > (4 rows)
> >
> > select a,sum(b) from z group by a;
> > a|sum
> > -+---
> > 1| 3
> > | 15
> > (2 rows)
> >
> > The Pratical SQL Handbook at page 171 says:
> > Since nulls represent "the great unknown", there is no way to know
> > whether one null is equal to any other null. Each unknown value
> > may or may not be different from another.
> > However, if the grouping column contains more than one null,
> > all of them are put into a single group.
> >
> > Thus: NULL!=NULL but on GROUP BY it is considered as NULL=NULL.
> >
> > Jos_
> >
> >
> >
> >
> > --
> > ______________________________________________________________
> > PostgreSQL 6.5.0 on i586-pc-linux-gnu, compiled by gcc 2.7.2.3
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > Jose'
> >
>
> Wonderful, that's as I expected. However please try this in 6.5
> Beta1,
> CREATE TABLE z(a int4,b int4, c int4);
> INSERT INTO z VALUES (1,1,1);
> INSERT INTO z VALUES (1,1,2);
> INSERT INTO z(a,c) VALUES (2,1);
> INSERT INTO z(a,c) VALUES (2,2);
>
> SELECT a,b,sum(c) FROM z GROUP BY a,b
>
> GROUPing in PostgreSQL w/NULLs works just fine when there is only 1
> column, however when one throws 2 in, the 2nd one having NULLs it starts
> failing. Your example demonstrates the right answer for 1 group by
> column, try it with 2 and I expect 6.5beta1 will fail as 6.4.2 does.
>
> As to NULL=NULL or NULL!=NULL, evadentally my estimation of why the
> problem is occuring was wrong. :) But from the SQL handbook we
> definately have a bug here.
>
> David Secret
> MIS Director
> Kearney Development Co., Inc.
>
>
>

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 1999-07-07 18:25:26 Re: [HACKERS] Broken select on regular expression !!!
Previous Message Bruce Momjian 1999-07-07 18:20:35 Re: [HACKERS] spinlock freeze again

Browse pgsql-sql by date

  From Date Subject
Next Message Bruce Momjian 1999-07-07 19:09:11 Re: [SQL] Good Optimization
Previous Message Bruce Momjian 1999-07-07 18:20:35 Re: [HACKERS] spinlock freeze again