Re: WITHIN GROUP patch

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: WITHIN GROUP patch
Date: 2013-12-23 17:34:29
Message-ID: 15825.1387820069@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Or, really, why don't we just do the same thing I'm advocating for
> the plain-ordered-set case? That is, if there's a single collation
> applying to all the collatable inputs, that's the collation to use
> for the aggregate; otherwise it has no determinate collation, and
> it'll throw an error at runtime if it needs one.

I went and tried to implement this, and realized that it would take some
pretty significant rewriting of parse_collate.c, because of examples
like this:

rank(a,b) within group (order by c collate "foo", d collate "bar")

In the current parse_collate logic, it would throw an error immediately
upon being told to merge the two explicit-COLLATE results. We'd
need a way to postpone that error and instead just decide that the
rank aggregate's collation is indeterminate. While that's perhaps
just a SMOP, it would mean that ordered-set aggregates don't resolve
collation the same way as other functions, which pretty much destroys
the argument for this approach.

What's more, the same problem applies to non-hypothetical ordered-set
aggregates, if they've got more than one sortable input column.
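
To illustrate the merge behavior described above, here is a toy model in Python. It is only a sketch and not the actual parse_collate.c code: the names Derivation, CollationConflict and merge_derivations are hypothetical, and non-collatable inputs are not modeled. It assumes the usual rule that two different explicit COLLATE clauses conflict immediately, while two different implicit collations merely leave the result indeterminate.

```python
from dataclasses import dataclass
from typing import Optional


class CollationConflict(Exception):
    """Raised when two different explicit collations must be merged."""


@dataclass(frozen=True)
class Derivation:
    collation: Optional[str]  # None stands for "no determinate collation"
    explicit: bool            # True if it came from a COLLATE clause


def merge_derivations(a: Derivation, b: Derivation) -> Derivation:
    """Toy merge of two collation derivations (hypothetical helper)."""
    if a.explicit and b.explicit:
        if a.collation != b.collation:
            # This is the immediate error the message talks about.
            raise CollationConflict(
                f'explicit collations "{a.collation}" and "{b.collation}" conflict'
            )
        return a
    if a.explicit:
        return a
    if b.explicit:
        return b
    if a.collation == b.collation:
        return a
    # Differing implicit collations: indeterminate, but not an error yet.
    return Derivation(None, False)
```

In this model, merging the derivations for c collate "foo" and d collate "bar" raises at once, so there is no opportunity to fall back to an indeterminate collation for the aggregate.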

What I'm now thinking we want to do is:

1. Non-hypothetical direct args always contribute to determining the
agg's collation.

2. Hypothetical and aggregated args contribute to the agg's collation
only if the agg is designed so that there is always exactly one
aggregated arg (i.e., it's non-variadic with one aggregated arg).
Otherwise we assign their collations per-sort-column and don't merge
them into the aggregate's collation.

This specification ensures that a variadic aggregate doesn't change
behavior depending on how many sort columns there happen to be.
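
The two rules above can be sketched as a small Python function. This is an illustration under stated assumptions rather than the patch itself: Arg, split_collation_inputs and the "kind" labels are hypothetical names, and the function only decides which inputs would feed the aggregate's collation, not how they are then merged.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Arg:
    kind: str                 # "direct", "hypothetical", or "aggregated"
    collation: Optional[str]  # None for a non-collatable input


def split_collation_inputs(
    args: List[Arg], variadic: bool, declared_aggregated_args: int
) -> Tuple[List[str], List[Optional[str]]]:
    """Return (collations merged into the aggregate, per-sort-column collations)."""
    # Rule 2 precondition: the aggregate always has exactly one aggregated arg.
    single_sort_column = (not variadic) and declared_aggregated_args == 1

    merged: List[Optional[str]] = []
    per_column: List[Optional[str]] = []
    for arg in args:
        if arg.kind == "direct":
            merged.append(arg.collation)       # rule 1: always contributes
        elif single_sort_column:
            merged.append(arg.collation)       # rule 2: contributes
        elif arg.kind == "aggregated":
            per_column.append(arg.collation)   # assigned per sort column
        # Hypothetical args in the multi-column case are not modeled here.

    return [c for c in merged if c is not None], per_column


# rank(a,b) within group (order by c collate "foo", d collate "bar"),
# treating rank as variadic with a and b as hypothetical args.
rank_args = [
    Arg("hypothetical", None),
    Arg("hypothetical", None),
    Arg("aggregated", "foo"),
    Arg("aggregated", "bar"),
]
rank_result = split_collation_inputs(rank_args, variadic=True, declared_aggregated_args=0)
```

With these inputs nothing is merged into the aggregate's collation and the two sort columns keep "foo" and "bar" separately, so the conflicting explicit COLLATE clauses never have to be merged. A non-variadic aggregate with a single aggregated arg would instead have that arg's collation merged into the aggregate's.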

regards, tom lane
