Skip site navigation (1) Skip section navigation (2)

Re: [HACKERS] No: implied sort with group by

From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Darren King <darrenk(at)insightdist(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] No: implied sort with group by
Date: 1998-01-28 07:33:40
Message-ID: 34CEDF54.85CBAAB@alumni.caltech.edu (view raw or flat)
Thread:
Lists: pgsql-hackers
> > > > This is what I think is missing or broken right now.
> > > >
> > > > > > select * from t1;
> > > > >          a b  c
> > > > >          1    x
> > > > >          2    x
> > > > >          3    z
> > > > >          2    x
> > > > >
> > > > > 4 row(s) retrieved.
> > > > > > select b,c,sum(a) from t1 group by b,c;
> > > > > b  c             (sum)
> > > > >
> > > > >    x                 5
> > > > >    z                 3
> > > > >> 2 row(s) retrieved.
> > >
> > > Sorry, I've lost the thread. What is broken? I get this same result, and
> > > (assuming that column "b" is full of nulls) I think this the correct result.
> >
> > At one point, it was thought that NULLs shouldn't be grouped, but I
> > backed out the patch.  There is a problem with GROUP BY on large
> > datasets, and Vadim knows the cause, and will work on it later.
>
> Different from the grouping by NULLs issue...
>
> The above results are from Sybase.  If these same four rows are inserted into
> postgres, the second query will return three rows.  Something like...
>
> b|c|sum(a)
>  |x|3
>  |z|3
>  |x|2
>
> It does this not because of the null values of column b, but because the data is
> not sorted before getting to the group by node if the user does not explicitly put
> an order by in the query.  IMHO, postgres should put an arbitrary sort node in the
> tree so that the data can be properly grouped as the group node iterates over it.
>
> And even if I put an "order by c" clause in there, I still get three rows, they're
> just properly sorted. :)

Not necessarily true; as I said, I get the same result as above (with the 980112
source tree; have things changed since??). Perhaps you are running into the sorting
problem which seemed to be present on larger tables only?

                                                    - Tom

postgres=> select b,c,sum(a) from t1 group by b,c;
b|c|sum
-+-+---
 |x|  5
 |z|  3
(2 rows)

postgres=> select * from t1;
a|b|c
-+-+-
1| |x
2| |x
2| |x
3| |z
(4 rows)


In response to

pgsql-hackers by date

Next:From: Thomas G. LockhartDate: 1998-01-28 07:36:41
Subject: Re: [HACKERS] functions with same name, different args
Previous:From: Bruce MomjianDate: 1998-01-28 03:37:03
Subject: Re: [HACKERS] postmaster crash and .s.pgsql file

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group