Re: ANY_VALUE aggregate

From: Vik Fearing <vik(at)postgresfriends(dot)org>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: ANY_VALUE aggregate
Date: 2023-02-09 09:42:16
Message-ID: 15c31fa6-b84a-8705-f5a2-987d165a89db@postgresfriends.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/23/23 08:50, David Rowley wrote:
> On Thu, 19 Jan 2023 at 06:01, Vik Fearing <vik(at)postgresfriends(dot)org> wrote:
>> Thank you for the review. Attached is a new version rebased to d540a02a72.
>
> I've only a bunch of nit-picks, personal preferences and random
> thoughts to offer as a review:
>
> 1. I'd be inclined *not* to mention the possible future optimisation in:
>
> + * Currently this just returns the first value, but in the future it might be
> + * able to signal to the aggregate that it does not need to be called anymore.
>
> I think it's unlikely that the transfn would "signal" such a thing. It
> seems more likely if we did anything about it that nodeAgg.c would
> maybe have some additional knowledge not to call that function if the
> agg state already has a value. Just so we're not preempting how we
> might do such a thing in the future, it seems best just to remove the
> mention of it. I don't really think it serves as a good reminder that
> we might want to do this one day anyway.

Modified. My logic in having the transition function signal that it is
finished is to one day allow something like:

array_agg(x order by y limit z)

> 2. +any_value_trans(PG_FUNCTION_ARGS)
>
> Many of transition function names end in "transfn", not "trans". I
> think it's better to follow the existing (loosely followed) naming
> pattern that a few aggregates seem to follow rather than invent a new
> one.

Renamed.

> 3. I tend to try to copy the capitalisation of keywords from the
> surrounding regression tests. I see the following breaks that.
>
> +SELECT any_value(v) FILTER (WHERE v > 2) FROM (VALUES (1), (2), (3)) AS v (v);
>
> (obviously, ideally, we'd always just follow the same capitalisation
> of keywords everywhere in each .sql file, but we've long broken that
> and the best way can do is be consistent with surrounding tests)

Downcased.

> 4. I think I'd use the word "Returns" instead of "Chooses" in:
>
> + Chooses a non-deterministic value from the non-null input values.

Done.

> 5. I've not managed to find a copy of the 2023 draft, so I'm assuming
> you've got the ignoring of NULLs correct.

Yes, I do. This is part of <computational operation>, so SQL:2016 10.9
GR 7.a applies.

> 6. Is it worth adding a WindowFunc test somewhere in window.sql with
> an any_value(...) over (...)? Is what any_value() returns as a
> WindowFunc equally as non-deterministic as when it's used as an
> Aggref? Can we assume there's no guarantee that it'll return the same
> value for each partition in each row? Does the spec mention anything
> about that?

This is governed by SQL:2016 10.9 GR 1.d and 1.e which defines the
source rows for the aggregate: either a group or a window frame. There
is no difference in behavior. I don't think a windowed test is useful
here unless I were to implement moving transitions. I think that might
be overkill for this function.

> 7. I wondered if it's worth adding a
> SupportRequestOptimizeWindowClause support function for this
> aggregate. I'm thinking that it might not be as likely people would
> use something more specific like first_value/nth_value/last_value
> instead of using any_value as a WindowFunc. Also, I'm currently
> thinking that a SupportRequestWFuncMonotonic for any_value() is not
> worth the dozen or so lines of code it would take to write it. I'm
> assuming it would always be a MONOTONICFUNC_BOTH function. It seems
> unlikely that someone would have a subquery with a WHERE clause in the
> upper-level query referencing the any_value() aggregate. Thought I'd
> mention both of these things anyway as someone else might think of
> some good reason we should add them that I didn't think of.

I thought about this for a while and decided that it was not worthwhile.

v4 attached. I am putting this back to Needs Review in the commitfest
app, but these changes were editorial so it is probably RfC like Peter
had set it. I will let you be the judge of that.
--
Vik Fearing

Attachment Content-Type Size
v4-0001-Implement-ANY_VALUE-aggregate.patch text/x-patch 6.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ajin Cherian 2023-02-09 09:55:17 Re: Support logical replication of DDLs
Previous Message Alvaro Herrera 2023-02-09 09:40:13 Re: Improve logging when using Huge Pages