From: | Vik Fearing <vik(at)postgresfriends(dot)org> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: ANY_VALUE aggregate |
Date: | 2022-12-08 05:00:54 |
Message-ID: | e66347b4-670f-eed6-8809-efd9184e0dd1@postgresfriends.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12/7/22 04:22, David G. Johnston wrote:
> On Mon, Dec 5, 2022 at 10:40 PM Vik Fearing <vik(at)postgresfriends(dot)org> wrote:
>
>> On 12/6/22 05:57, David G. Johnston wrote:
>>> On Mon, Dec 5, 2022 at 9:48 PM Vik Fearing <vik(at)postgresfriends(dot)org>
>> wrote:
>>>
>>>> I can imagine an optimization that would remove an ORDER BY clause
>>>> because it isn't needed for any other aggregate.
>>>
>>>
>>> I'm referring to the query:
>>>
>>> select any_value(v order by v) from (values (2),(1),(3)) as vals (v);
>>> // produces 1, per the documented implementation-defined behavior.
>>
>> Implementation-dependent. It is NOT implementation-defined, per spec.
>
> I really don't care all that much about the spec here given that ORDER BY
> in an aggregate call is non-spec.
Well, this is demonstrably wrong.
<array aggregate function> ::=
ARRAY_AGG <left paren>
<value expression>
[ ORDER BY <sort specification list> ]
<right paren>
>> We often loosen the spec rules when they don't make technical sense to
>> us, but I don't know of any example of when we have tightened them.
>
> The function has to choose some row from among its inputs,
True.
> and the system has to obey an order by specification added to the function call.
False.
> You are de-facto creating a first_value aggregate (which is by definition
> non-standard) whether you like it or not.
I am de jure creating an any_value aggregate (which is by definition
standard) whether you like it or not.
> I'm just saying to be upfront
> and honest about it - our users do want such a capability so maybe accept
> that there is a first time for everything. Not that picking an
> advantageous "implementation-dependent" implementation should be considered
> deviating from the spec.
>
>
>>> Someone writing:
>>>
>>> select any_value(v) from (values (2),(1),(3)) as vals (v) order by v;
>>>
>>> Is not presently, nor am I saying, promised the value 1.
>>>
>>> I'm assuming you are thinking of the second query form, while the
>> guarantee
>>> only needs to apply to the first.
>>
>> I am saying that a theoretical pg_aggregate.aggorderdoesnotmatter could
>> bestow upon ANY_VALUE the ability to make those two queries equivalent.
>>
>
> That theoretical idea should not be entertained. Removing a user's
> explicitly added ORDER BY should be off-limits. Any approach at
> optimization here should simply look at whether an ORDER BY is specified
> and pass that information to the function. If the function itself really
> believes that ordering matters it can emit its own runtime exception
> stating that fact and the user can fix their query.
It absolutely should be entertained, and I plan on doing so in an
upcoming thread. Whether it errors or ignores is something that should
be discussed on that thread.
>> If you care about which value you get back, use something else.
>
> There isn't a "something else" to use so that isn't presently an option.
The query
SELECT proposed_first_value(x ORDER BY y) FROM ...
is equivalent to
SELECT (ARRAY_AGG(x ORDER BY y))[1] FROM ...
so I am not very sympathetic to your claim of "no other option".
> I suppose it comes down to what level of belief and care you have that
> people will simply mis-use this function if it is added in its current form
> to get the desired first_value effect that it produces.
People who rely on explicitly undefined behavior get what they deserve
when the implementation changes.
--
Vik Fearing
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2022-12-08 05:07:15 | Re: Support logical replication of DDLs |
Previous Message | Amit Kapila | 2022-12-08 04:53:40 | Re: Avoid streaming the transaction which are skipped (in corner cases) |