Re: must appear in GROUP by clause issue

From: George Woodring <george(dot)woodring(at)iglass(dot)net>
To: Lee Hachadoorian <Lee(dot)Hachadoorian+L(at)gmail(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: must appear in GROUP by clause issue
Date: 2012-07-02 13:01:12
Message-ID: CACi+J=Rm8q+JK-9X_MPZje4_tRx_6CqGfCyBZ2OeFyeEGNYntQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

This was helpful, thank you. The issue was that one of the tables had a
"name" column added. Changing the alias name in the query so that it was
unique was the solution.

Thanks for the help,
George

On Sat, Jun 30, 2012 at 12:34 AM, Lee Hachadoorian <
Lee(dot)Hachadoorian+L(at)gmail(dot)com> wrote:

> On Fri, Jun 29, 2012 at 4:02 PM, George Woodring
> <george(dot)woodring(at)iglass(dot)net> wrote:
> > I have 2 (8.4.11) servers that I am testing the following query:
> >
> > SELECT count(*),
> > maptrunc(cpeloc.lat, 4.5)::text || maptrunc(cpeloc.long,
> > 4.5)::text AS name,
> > AVG(cpeloc.lat) AS lt,
> > AVG(cpeloc.long) AS lng
> > FROM cable_billing JOIN cpeloc USING(mac) LEFT JOIN davic USING(mac)
> > WHERE (status = 0 OR status is null) AND (cpeloc.lat is not null AND
> > cpeloc.lat !=0 AND cpeloc.long is not null AND cpeloc.long != 0)
> > GROUP BY name ORDER BY name;
> >
> > On the first server the query works,
> >
> > count | name | lt | lng
> > -------+-----------------+---------------------+----------------------
> > 1 | 43.0425-94.2295 | 43.0429410000000000 | -94.2299740000000000
> > 1 | 43.0525-94.260 | 43.0526200000000000 | -94.2603800000000000
> > 1 | 43.054-94.224 | 43.0543150000000000 | -94.2244750000000000
> > (51 rows)
> >
> > On the second server I get an error.
> >
> > ERROR: column "cpeloc.lat" must appear in the GROUP BY clause or be
> used in
> > an aggregate function
> > LINE 1: select count(*), maptrunc(cpeloc.lat, 4.5)::text || maptrunc...
> >
> > I was thinking the one server that works the maptrunc function was
> thought
> > of as an agg function, but they are both defined the same
> >
> > \df maptrunc
> > List of functions
> > Schema | Name | Result data type | Argument data types | Type
> > --------+----------+------------------+---------------------+--------
> > public | maptrunc | numeric | numeric, numeric | normal
> > (1 row)
> >
> > The only difference between the 2 servers are that the one that works
> > returns 51 rows and the one that does not I would expect to return 12000
> > rows.
> >
> > I can make the one that does not work functional by changing
> >
> > maptrunc(cpeloc.lat)::text --> max(maptrunc(cpeloc.lat))::text
> > maptrunc(cpeloc.long)::text --> max(maptrunc(cpeloc.long))::text
> >
> > however, that one breaks on the first server with the error
> >
> > ERROR: aggregates not allowed in GROUP BY clause
> > LINE 1: select count(*), max(maptrunc(cpeloc.lat, 4.5))::text || max...
> >
> > Any suggestions would be appreciated.
> >
> > George Woodring
> > --
> > iGLASS Networks
> > www.iglass.net
>
> Are you sure the queries are the same on both servers? I'm not sure how
>
> SELECT max(…) || max (…) AS name
> …
> GROUP BY name
>
> could ever work without leading to the "aggregates not allowed in
> GROUP BY clause".
>
> I *think* this could also happen if for some reason a "name" column
> was added to one of the tables on Server2 but not on Server1. Server1
> groups by name as an output column. On Server2, name is interpreted as
> an input column. Once you add max(…), cpeloc is now part of an
> aggregate expression, but the GROUP BY actually operates over the
> input column. Switch back to Server1, if name is not in the table
> definition it is interpreted as an output column and GROUP BY max(…)
> fails.
>
> All of this is kind of spitting in the dark without table defs and
> maybe the definition of maptrunc(), although at a guess I would say
> that it looks like you are doing is you are trying to join locations
> within a tolerance. maptrunc() truncates your latititude and longitude
> measurement to within some acceptable error, and you are averaging all
> candidate lat/longs to come up with "the" location.
>
> Assuming it's not just differences in table defs between the servers,
> I would suggest simplifying the query to debug it. Since lat and long
> are both in cpelong, and the only other column mentioned is status, at
> least one of the other tables (cable_billing and davic) is
> unnecessary. Replace GROUP BY name ORDER BY name with GROUP BY 2 ORDER
> BY 2. Consider generating name within a subquery before aggregating,
> i.e.:
>
> SELECT count(*),
> name,
> AVG(cpeloc.lat) AS lt,
> AVG(cpeloc.long) AS lng
> FROM (
> SELECT
> maptrunc(cpeloc.lat, 4.5)::text || maptrunc(cpeloc.long,
> 4.5)::text AS name,
> cpeloc.lat,
> cpeloc.long
> FROM
> cable_billing JOIN cpeloc USING(mac) LEFT JOIN davic
> USING(mac)
> WHERE COALESCE(status, 0) = 0
> AND COALESCE(cpeloc.lat, 0) !=0
> AND COALESCE(cpeloc.long, 0) != 0
> )
> GROUP BY name ORDER BY name;
>
> I would also inspect the subquery, ordered by name, to see if anything
> weird leaps out at you in terms of the adjacent rows that *should* be
> grouped. As an aside, I replaced the (x = 0 OR x IS NULL) and converse
> constructions with COALESCE() functions, which IMO are a bit easier to
> read.
>
> Hope this is at all helpful.
>
> --Lee
>
> --
> Lee Hachadoorian
> PhD, Earth & Environmental Sciences (Geography)
> Research Associate, CUNY Center for Urban Research
> http://freecity.commons.gc.cuny.edu/
>

--
iGLASS Networks
www.iglass.net

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Chris Preston 2012-07-10 15:28:55 Simple Upgrade from PostgreSQL version 8.1.11 (With schemas)
Previous Message Lee Hachadoorian 2012-06-30 04:34:45 Re: must appear in GROUP by clause issue