Re: ANY_VALUE aggregate

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Vik Fearing <vik(at)postgresfriends(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: ANY_VALUE aggregate
Date: 2022-12-06 04:57:05
Message-ID: CAKFQuwZio6maUZ_SxzudV9vP+DF8+fYWNAn0M-fxh=iKnAnSYA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 5, 2022 at 9:48 PM Vik Fearing <vik(at)postgresfriends(dot)org> wrote:

> On 12/6/22 05:22, David G. Johnston wrote:
> > On Mon, Dec 5, 2022 at 8:46 PM Vik Fearing <vik(at)postgresfriends(dot)org>
> wrote:
> >
> >> On 12/5/22 18:56, David G. Johnston wrote:
> >>> Also, maybe we should have any_value do something like compute a 50/50
> >>> chance that any new value seen replaces the existing chosen value,
> >> instead
> >>> of simply returning the first value all the time. Maybe even prohibit
> >> the
> >>> first value from being chosen so long as a second value appears.
> >>
> >> The spec says the result is implementation-dependent meaning we don't
> >> even need to document how it is obtained, but surely behavior like this
> >> would preclude future optimizations like the ones I mentioned?
> >>
> >
> > So, given the fact that we don't actually want to name a function
> > first_value (because some users are readily confused as to when the
> concept
> > of first is actually valid or not) but some users do actually wish for
> this
> > functionality - and you are proposing to implement it here anyway - how
> > about we actually do document that we promise to return the first
> non-null
> > value encountered by the aggregate. We can then direct people to this
> > function and just let them know to pretend the function is really named
> > first_value in the case where they specify an order by. (last_value comes
> > for basically free with descending sorting).
>
> I can imagine an optimization that would remove an ORDER BY clause
> because it isn't needed for any other aggregate.

I'm referring to the query:

select any_value(v order by v) from (values (2),(1),(3)) as vals (v);
// produces 1, per the documented implementation-defined behavior.

Someone writing:

select any_value(v) from (values (2),(1),(3)) as vals (v) order by v;

Is not presently, nor am I saying, promised the value 1.

I'm assuming you are thinking of the second query form, while the guarantee
only needs to apply to the first.

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-12-06 04:57:20 Re: move some bitmapset.c macros to bitmapset.h
Previous Message Vik Fearing 2022-12-06 04:48:15 Re: ANY_VALUE aggregate