Re: MIN/MAX functions for a record

From: Viliam Ďurina <viliam(dot)durina(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: MIN/MAX functions for a record
Date: 2024-03-22 15:50:01
Message-ID: CAO=iB8Lx6R5mYdL4wfcx-0h6RfPD7SjR_sfu8nB3ubbXPcKELA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Exactly Tom, I see no fundamental problem for it not to be implemented,
since comparison operator is already implemented. In fact, MIN/MAX should
work for all types for which comparison operator is defined.

Regarding index support, there should not be an issue if the index is
defined for the record (e.g. `CREATE INDEX ON my_table(ROW(field_a,
field_b))`). However such indexes seem not to be supported. Whether a
composite index is compatible with a record created on the indexed fields
in every edge case I'm not sure...

Alexander, rewriting the year-month example is easy, but how would you
rewrite this query?

CREATE TABLE events(event_time TIMESTAMP, message VARCHAR, user_id VARCHAR);

You want a newest message for each user. It's easy with MAX(record):

SELECT user_id, MAX(ROW(event_time, message)).message
FROM events
GROUP BY user_id;

One option is to rewrite to a subquery with LIMIT 1

SELECT user_id, (SELECT message FROM events e2 WHERE e1.user_id=e2.user_id
ORDER BY event_time DESC LIMIT 1)
FROM events e1
GROUP BY user_id;

If your query already has multiple levels of grouping, multiple joins,
UNIONs etc., it gets much more complex. I also wonder if the optimizer
would pick the same plan as it would be if the MAX(record) is supported.

Viliam

On Fri, Mar 22, 2024 at 4:12 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Aleksander Alekseev <aleksander(at)timescale(dot)com> writes:
> >> In my queries I often need to do MIN/MAX for tuples, for example:
> >> SELECT MAX(row(year, month))
> >> FROM (VALUES(2025, 1), (2024,2)) x(year, month);
> >> This query throws:
> >> ERROR: function max(record) does not exist
> >> Was this ever discussed or is there something preventing the
> implementation?
>
> > I believe it would be challenging to implement max(record) that would
> > work reasonably well in a general case.
>
> As long as you define it as "works the same way record comparison
> does", ie base it on record_cmp(), I don't think it would be much
> more than a finger exercise [*]. And why would you want it to act
> any differently from record_cmp()? Those semantics have been
> established for a long time.
>
> regards, tom lane
>
> [*] Although conceivably there are some challenges in getting
> record_cmp's caching logic to work in the context of an aggregate.
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2024-03-22 15:50:21 Re: documentation structure
Previous Message Peter Eisentraut 2024-03-22 15:49:29 Re: SQL:2011 application time