*** syntax.sgml.orig Sun Jan 21 22:17:17 2001
--- syntax.sgml.modi.5 Mon Jan 22 01:24:10 2001
***************
*** 1467,1475 ****
-
-
-
GROUP BY and HAVING clauses
--- 1467,1472 ----
***************
*** 1482,1540 ****
clause.)
! In standard SQL, the GROUP BY clause takes a list of column names,
! that specify a subrow, from the derived input table produced by
! the previous WHERE or FROM clause and partitions the table into
! groups with duplicate subrows such that within a column of the
! subrow, no column value is distinct from other column values. The
! resulting derived input table is a special type of table, called a
! grouped table, which still contains all columns but only
! references to columns of the grouped subrow, and group aggregates,
! derived from any of the columns, may appear in derived column
! value expressions in the query select list. When deriving an
! output table from a query using a grouped input table, each output
! row is derived from a corresponding group/partition of the grouped
! table. Aggregates computed in a derived output column are
! aggregates on the current partition/group of the grouped input
! table being processed. Only one output table row results per
! group/partition of the grouped input table.
! Postgres has extended the GROUP BY clause to allow some
! non-standard, but useful behavior. Derived output columns, given
! names using an AS clause in the query select list, may appear in
! the GROUP BY clause in combination with, or instead of, the input
! table column names. Tables may also be grouped by arbitrary
! expressions. If output table column names appear in the GROUP BY
! list, then the input table is augmented with additional columns of
! the output table columns listed in the GROUP BY clause. The value
! for each row in the additional columns is computed from the value
! expression that defines the output column in the query select
! list. The augmented input table is grouped by the column names
! listed in the GROUP BY clause. The resulting grouped augmented
! input table is then treated according standard SQL GROUP BY
! semantics. Only the columns of the unaugmented input table in the
! grouped subrow (if any), and group aggregates, derived from any of
! the columns of the unaugmented input table, may be referenced in
! the value expressions of the derived output columns of the
! query. Output columns derived with an aggregate expression cannot
! be named in the GROUP BY clause.
! A HAVING clause may optionally follow a GROUP BY clause. The
! HAVING clause selects or eliminates, depending on which
! perspective is taken, groups from the grouped table derived in the
! GROUP BY clause that precedes it. The search condition is the
! same type of expression allowed in a WHERE clause and may
! reference any of the input table column names in the grouped
! subrow, but may not reference any others or any named output
! columns. When the search condition results in TRUE the group is
! retained, otherwise the group is eliminated.
ORDER BY and LIMIT clauses
--- 1479,1590 ----
clause.)
+
+ GROUP BY grouping_column_reference , grouping_column_reference...
+
+
+
+ The GROUP BY clause is used to group together rows in a table
+ that share the same values in all the columns listed. The order
+ in which the columns are listed doesn't matter as it does in
+ an ORDER BY clause. The purpose is to reduce each group of rows
+ sharing common values into one group row that is representative
+ of all rows in the group. This is done to eliminate redundancy
+ in the output and/or obtain aggregates that apply to these groups.
+
+
+
+ Once a table
+ is grouped, columns that are not included in
+ the grouping can't be referenced, except in aggregate, since a specific value
+ in those columns is ambiguous - which row in the group should it come from?
+ The grouped-by columns can be referenced in select list column expressions
+ since they have a known constant-value
+ per group. Aggregate functions on the ungrouped columns
+ provide aggregates that span the rows of a group, not
+ of the whole table. For instance, a sum(sales) on a grouped table
+ by product code gives the total sales for each product, not the
+ total sales on all products. The aggregates of the ungrouped
+ columns are representative of the group, whereas their individual
+ values may not be.
+
+
+
+ In strict SQL, GROUP BY can only group by columns of the source
+ table but Postgres extends this to also allow GROUP BY to group
+ by target columns in the query select list. Grouping by
+ value expressions instead of simple column names is also allowed.
+
+
+
+ Example:
+
+
+
+
+ SELECT pid, p.name, (sum(s.units) * p.price) AS sales
+ FROM products p LEFT JOIN sales s USING ( pid )
+ GROUP BY pid, p.name, p.price;
+
+
+
+
+ In this example, the columns pid, p.name, and p.price
+ must be in the GROUP BY clause since they are
+ referenced in the query select list. The column
+ s.units doesn't have to be in the GROUP BY list since
+ it is only used in an aggregate, which represents
+ the group of sales of a product. For each
+ product, a summary row is returned about all sales
+ of the product.
+
+
+
+
+ HAVING search_condition
+
+
+
! If a table has been grouped using a GROUP BY clause, but then
! only certain groups are of interest, then the HAVING clause
! can be used, much like a WHERE clause, to select out the
! groups of interest from a grouped table. For some queries,
! Postgres allows a HAVING clause to be used without a GROUP BY
! and then it acts just like another WHERE clause, but the point
! in using HAVING that way is not clear. Since HAVING operates
! on groups, only grouped columns can be listed in the HAVING
! clause. If selection based on some ungrouped column is
! desired, it should be expressed in the WHERE clause.
! Example:
!
! SELECT pid AS "Products",
! p.name AS "Over 5000",
! (sum(s.units) * (p.price - p.cost))
! AS "Past Month Profit"
! FROM products p LEFT JOIN sales s USING ( pid )
! WHERE p.date > CURRENT_DATE - INTERVAL '4 week'
! GROUP BY pid, p.name, p.price, p.cost
! HAVING p.price > 5000;
!
!
!
!
! In the example above, the WHERE clause is selecting rows by a
! column that is not grouped, while the HAVING clause
! is selecting groups with a price greater than 5000.
+
+
+
+
ORDER BY and LIMIT clauses