PostgreSQL 9.3.12 Documentation | ||||
---|---|---|---|---|

Prev | Up | Chapter 9. Functions and Operators | Next |

*Aggregate functions* compute a single
result from a set of input values. The built-in aggregate
functions are listed in Table
9-47 and Table
9-48. The special syntax considerations for aggregate
functions are explained in Section 4.2.7.
Consult Section 2.7 for
additional introductory information.

Table 9-47. General-Purpose Aggregate Functions

Function | Argument Type(s) | Return Type | Description |
---|---|---|---|

`array_agg(` |
any | array of the argument type | input values, including nulls, concatenated into an array |

`avg(` |
smallint, int, bigint, real, double precision,
numeric, or interval |
numeric for any integer-type
argument, double precision for a
floating-point argument, otherwise the same as the
argument data type |
the average (arithmetic mean) of all input values |

`bit_and(` |
smallint, int, bigint, or
bit |
same as argument data type | the bitwise AND of all non-null input values, or null if none |

`bit_or(` |
smallint, int, bigint, or
bit |
same as argument data type | the bitwise OR of all non-null input values, or null if none |

`bool_and(` |
bool |
bool |
true if all input values are true, otherwise false |

`bool_or(` |
bool |
bool |
true if at least one input value is true, otherwise false |

`count(*)` |
bigint |
number of input rows | |

`count(` |
any | bigint |
number of input rows for which the value of
expression is not
null |

`every(` |
bool |
bool |
equivalent to `bool_and` |

`json_agg(` |
any |
json |
aggregates values as a JSON array |

`max(` |
any array, numeric, string, or date/time type | same as argument type | maximum value of expression across all input
values |

`min(` |
any array, numeric, string, or date/time type | same as argument type | minimum value of expression across all input
values |

`string_agg(` |
(text, text) or (bytea,
bytea) |
same as argument types | input values concatenated into a string, separated by delimiter |

`sum(` |
smallint, int, bigint, real, double precision,
numeric, interval, or money |
bigint for smallint or int
arguments, numeric for bigint arguments, otherwise the same as the
argument data type |
sum of expression
across all input values |

`xmlagg(` |
xml |
xml |
concatenation of XML values (see also Section 9.14.1.7) |

It should be noted that except for `count`

, these functions return a null value when
no rows are selected. In particular, `sum`

of no rows returns null, not zero as one
might expect, and `array_agg`

returns
null rather than an empty array when there are no input rows. The
`coalesce`

function can be used to
substitute zero or an empty array for null when necessary.

Note:Boolean aggregates`bool_and`

and`bool_or`

correspond to standard SQL aggregates`every`

and`any`

or`some`

. As for`any`

and`some`

, it seems that there is an ambiguity built into the standard syntax:SELECT b1 = ANY((SELECT b2 FROM t2 ...)) FROM t1 ...;Here

`ANY`

can be considered either as introducing a subquery, or as being an aggregate function, if the subquery returns one row with a Boolean value. Thus the standard name cannot be given to these aggregates.

Note:Users accustomed to working with other SQL database management systems might be disappointed by the performance of the`count`

aggregate when it is applied to the entire table. A query like:SELECT count(*) FROM sometable;will require effort proportional to the size of the table: PostgreSQL will need to scan either the entire table or the entirety of an index which includes all rows in the table.

The aggregate functions `array_agg`

, `json_agg`

, `string_agg`

, and `xmlagg`

, as well as similar user-defined
aggregate functions, produce meaningfully different result values
depending on the order of the input values. This ordering is
unspecified by default, but can be controlled by writing an
`ORDER BY` clause within the aggregate
call, as shown in Section 4.2.7.
Alternatively, supplying the input values from a sorted subquery
will usually work. For example:

SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;

But this syntax is not allowed in the SQL standard, and is not portable to other database systems.

Table
9-48 shows aggregate functions typically used in statistical
analysis. (These are separated out merely to avoid cluttering the
listing of more-commonly-used aggregates.) Where the description
mentions `N`, it means the number
of input rows for which all the input expressions are non-null.
In all cases, null is returned if the computation is meaningless,
for example when `N` is zero.

Table 9-48. Aggregate Functions for Statistics

Function | Argument Type | Return Type | Description |
---|---|---|---|

`corr(` |
double precision |
double precision |
correlation coefficient |

`covar_pop(` |
double precision |
double precision |
population covariance |

`covar_samp(` |
double precision |
double precision |
sample covariance |

`regr_avgx(` |
double precision |
double precision |
average of the independent variable (sum()X)/N |

`regr_avgy(` |
double precision |
double precision |
average of the dependent variable (sum()Y)/N |

`regr_count(` |
double precision |
bigint |
number of input rows in which both expressions are nonnull |

`regr_intercept(` |
double precision |
double precision |
y-intercept of the least-squares-fit linear equation
determined by the (X,
Y) pairs |

`regr_r2(` |
double precision |
double precision |
square of the correlation coefficient |

`regr_slope(` |
double precision |
double precision |
slope of the least-squares-fit linear equation
determined by the (X,
Y) pairs |

`regr_sxx(` |
double precision |
double precision |
sum( ("sum of
squares" of the independent variable)X^2) - sum(X)^2/N |

`regr_sxy(` |
double precision |
double precision |
sum( ("sum of
products" of independent times dependent
variable)X*Y) - sum(X) * sum(Y)/N |

`regr_syy(` |
double precision |
double precision |
sum( ("sum of
squares" of the dependent variable)Y^2) - sum(Y)^2/N |

`stddev(` |
smallint, int, bigint, real, double precision,
or numeric |
double precision for
floating-point arguments, otherwise numeric |
historical alias for `stddev_samp` |

`stddev_pop(` |
smallint, int, bigint, real, double precision,
or numeric |
double precision for
floating-point arguments, otherwise numeric |
population standard deviation of the input values |

`stddev_samp(` |
smallint, int, bigint, real, double precision,
or numeric |
double precision for
floating-point arguments, otherwise numeric |
sample standard deviation of the input values |

`variance` (expression) |
smallint, int, bigint, real, double precision,
or numeric |
double precision for
floating-point arguments, otherwise numeric |
historical alias for `var_samp` |

`var_pop` (expression) |
smallint, int, bigint, real, double precision,
or numeric |
double precision for
floating-point arguments, otherwise numeric |
population variance of the input values (square of the population standard deviation) |

`var_samp` (expression) |
smallint, int, bigint, real, double precision,
or numeric |
double precision for
floating-point arguments, otherwise numeric |
sample variance of the input values (square of the sample standard deviation) |