Re: Suggestion for aggregate function

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Greg Stark <gsstark(at)MIT(dot)EDU>
Cc: Manfred Koizar <mkoi-pg(at)aon(dot)at>, Greg Stark <gsstark(at)mit(dot)edu>, Bruno Wolff III <bruno(at)wolff(dot)to>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Suggestion for aggregate function
Date: 2003-01-24 17:37:00
Message-ID: 878yxaphz7.fsf@stark.dyndns.tv
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg Stark <gsstark(at)MIT(dot)EDU> writes:

> Manfred Koizar <mkoi-pg(at)aon(dot)at> writes:
>
> > Greg, we already have this feature, just the syntax is a bit different :-)
> >
> > SELECT DISTINCT ON (item_id) item_id,
> > price AS lowest_price,
> > store_id AS lowest_price_store
> > FROM items_for_sale
> > WHERE item_category = ?
> > ORDER BY item_id, price
>
> Neat! I hadn't seen this.

Ok, so I still think DISTINCT ON is the neatest thing since sliced bread. But
it strikes me as a bit of an odd syntax. It's very similar to GROUP BY except
where all the fields are implicitly aggregated using a peculiar aggregate
function that grabs the first value according to the order by expression.

I'm using this already for lots of queries, it's very handy. But I'm finding
it awkward in one situation -- when I also want other aggregate values other
than the first value according to the sort.

Consider the above query if I also wanted to know the maximum and average
prices per item. Along with the store that had the maximum and minimum prices
and the total number of stores that stock the item.

With DISTINCT ON I would have to do two queries to get the maximum and minimum
along with the relevant stores, and then do a third query with GROUP BY to get
the average and total number of stores.

What would be useful is something like

SELECT item_id,
first(price) as min_price, first(store_id) as min_store,
avg(price) as avg_price,
last(price) as max_price, last(store_id) as min_store,
count(distinct store_id) as num_stores
FROM (SELECT * FROM items_for_sale ORDER BY item_id, store_id)
GROUP BY store_id

This gives the benefits of DISTINCT ON but makes it easier to combine with
GROUP BY.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mike Mascari 2003-01-24 18:44:45 Re: Odd subselect in target list behavior WRT aggregates
Previous Message Chris Smith 2003-01-24 16:56:25 JDBC drivers and streaming content