Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

From: David Fetter <david(at)fetter(dot)org>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-05 17:23:37
Message-ID: 20130705172337.GB29050@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 01, 2013 at 05:30:38PM +0100, Dean Rasheed wrote:
> On 1 July 2013 01:44, David Fetter <david(at)fetter(dot)org> wrote:
> > On Fri, Jun 28, 2013 at 09:22:52PM +0100, Dean Rasheed wrote:
> >> On 21 June 2013 06:16, David Fetter <david(at)fetter(dot)org> wrote:
> >> > Please find attached a patch which allows subqueries in the FILTER
> >> > clause and adds regression testing for same.
> >> >
> >>
> >> This needs re-basing/merging following Robert's recent commit to make
> >> OVER unreserved.
> >
> > Please find attached. Thanks, Andrew Gierth! In this one, FILTER is
> > no longer a reserved word.
> >
>
> Looking at this patch again, it appears to be in pretty good shape.
>
> - Applies cleanly to head.
> - Compiles with no warnings.
> - Includes regression test cases and doc updates.
> - Compatible with the relevant part of T612, "Advanced OLAP operations".
> - Includes pg_dump support.
> - Code changes all look reasonable, and I can't find any corner cases
> that have been missed.
> - Appears to work as expected. I tested everything I could think of
> and couldn't break it.
>
> AFAICT all the bases have been covered. As mentioned upthread, I would
> have preferred a few more regression test cases, and a couple of the
> tests don't appear to return anything interesting, but I'll leave that
> for the committer to decide whether they're sufficient for regression
> tests.
>
> I have a few suggestions to improve the docs:
>
> 1). In syntax.sgml: "The aggregate_name can also be suffixed with
> FILTER as described below". It's not really a suffix to the aggregate
> name, since it follows the function arguments and optional order by
> clause. Perhaps it would be more consistent with the surrounding text
> to say something like
>
> <replaceable>expression</replaceable> is
> any value expression that does not itself contain an aggregate
> expression or a window function call, and
> ! <replaceable>order_by_clause</replaceable> and
> ! <replaceable>filter_clause</replaceable> are optional
> ! <literal>ORDER BY</> and <literal>FILTER</> clauses as described below.
>
> 2). In syntax.sgml: "... or when a FILTER clause is present, each row
> matching same". In the context of that paragraph this suggests that
> the filter clause only applies to the first form, since that paragraph
> is a description of the 4 forms of the aggregate function. I don't
> think it's worth mentioning FILTER in this paragraph at all --- it's
> adequately described below that.
>
> 3). In syntax.sgml: "Adding a FILTER clause to an aggregate specifies
> which values of the expression being aggregated to evaluate". How
> about something a little more specific, along the lines of
>
> If <literal>FILTER</> is specified, then only input rows for which
> the <replaceable>filter_clause</replaceable> evaluates to true are
> fed to the aggregate function; input rows for which the
> <replaceable>filter_clause</replaceable> evaluates to false or the
> null value are discarded. For example...
>
> 4). In select.sgml: "In the absence of a FILTER clause, aggregate
> functions...". It doesn't seem right to refer to the FILTER clause at
> the top level here because it's not part of the SELECT syntax being
> described on this page. Also I think this should include a
> cross-reference to the aggregate function syntax section, perhaps
> something like:
>
> Aggregate functions, if any are used, are computed across all rows
> making up each group, producing a separate value for each group
> (whereas without <literal>GROUP BY</literal>, an aggregate
> produces a single value computed across all the selected rows).
> + The set of rows fed to the aggregate function can be further filtered
> + by attaching a <literal>FILTER</literal> clause to the aggregate
> + function call, see <xref ...> for more information.
> When <literal>GROUP BY</literal> is present, it is not valid for
> the <command>SELECT</command> list expressions to refer to
> ungrouped columns except within aggregate functions or if the
> ungrouped column is functionally dependent on the grouped columns,
> since there would otherwise be more than one possible value to
> return for an ungrouped column. A functional dependency exists if
> the grouped columns (or a subset thereof) are the primary key of
> the table containing the ungrouped column.

Please find attached changes based on the above.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
filter_008.diff text/plain 38.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message ivan babrou 2013-07-05 17:28:59 Millisecond-precision connect_timeout for libpq
Previous Message Bruce Momjian 2013-07-05 17:04:12 Re: strange IS NULL behaviour