Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!)y

From: Thorsten Glaser <tg(at)evolvis(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!)y
Date: 2023-02-28 02:54:58
Message-ID: 9fc8e12b-f943-c633-a37-254915f819a0@evolvis.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, 27 Feb 2023, David G. Johnston wrote:

>Consider this then as a jumping point to a more precise query form:
[…]
>the basic concept holds - produce single rows in subqueries then join those
>various single rows together to produce your desired json output.

Ouch. I’ll have to read up and experiment with that, I guess.
But wouldn’t I rather then do a CTA for each referenced table
that does the aggregating and GROUP BY person_id, then join that
(which has single rows for each user row) to users?

>--BAD cross joining going on with multiple one-to-many relationships
>--GOOD, only joining one-to-one relationships

Not quite. This is many-to-many created by Django…

On Mon, 27 Feb 2023, Tom Lane wrote:

>Sure, I was just trying to explain the rule.

Aaah, okay. Sorry, I misunderstood that, and the… general direction
of the replies, then.

>Well, that may be what you want, but it's not what you wrote in
>the query. Follow David's advice and do
[…]
>I'm pretty sure that this will only incur one evaluation of the
>common subexpression, so even though it's tedious to type it's not
>inefficient.

Thanks. But I fear it’s not as simple as you wrote. More like:

jsonb_build_object('opening_times',
jsonb_agg(DISTINCT jsonb_build_object(
'weekday', cot.weekday,
'from_hour', cot.from_hour,
'to_hour', cot.to_hour)
ORDER BY
jsonb_build_object(
'weekday', cot.weekday,
'from_hour', cot.from_hour,
'to_hour', cot.to_hour)->>'weekday',
jsonb_build_object(
'weekday', cot.weekday,
'from_hour', cot.from_hour,
'to_hour', cot.to_hour)->>'from_hour',
jsonb_build_object(
'weekday', cot.weekday,
'from_hour', cot.from_hour,
'to_hour', cot.to_hour)->>'to_hour')
)

Isn’t that more like it?

(Though I guess at that point I could just drop at least
the to_hour fallback sort, hoping nobody inserts overlapping
hours with indentical start times…)

Now that I see that, it sort of re-raises a question I had
during developing this but didn’t pursue.

How about, instead of creating a JSONB object here, I create
a (not-JSON) ARRAY['weekday', cot.weekday, 'from_hour',
cot.from_hour, 'to_hour', cot.to_hour] in the aggregate function.
The array is something I could naturally use to sort (its elements
already are in sort key order), and its elements *also* are in
the order jsonb_build_object expects its arguments (assuming I can
do something like passing the array to it instead of “spreading”
the arguments). Pseudocode:

… hmm, not that easy. The aggregate would return e.g. this…

ARRAY[['weekday',1,…],['weekday',2,…]]

… and there’s no array_map function that could be used to pass
each inner array, one by one, to jsonb_build_object; converting
the outer array to JSON gets us json_array_elements (same problem),
other ways I can think of also don’t seem to have matching functions
(jq has… interesting ones).

As I write this, I fear that won’t fly because nōn-JSON arrays
cannot contain mixed types (text and numbers) in PostgreSQL… at
least I ran into that at some point in the past vaguely remembering…

Oh well,
//mirabilos
--
Solange man keine schmutzigen Tricks macht, und ich meine *wirklich*
schmutzige Tricks, wie bei einer doppelt verketteten Liste beide
Pointer XORen und in nur einem Word speichern, funktioniert Boehm ganz
hervorragend. -- Andreas Bogk über boehm-gc in d.a.s.r

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message houzj.fnst@fujitsu.com 2023-02-28 04:24:30 RE: Support logical replication of DDLs
Previous Message David G. Johnston 2023-02-28 01:48:42 Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!)