Re: Problems with PARTITION BY with count() in window-func

From: "MS (direkt)" <martin(dot)stoecker(at)stb-datenservice(dot)de>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Problems with PARTITION BY with count() in window-func
Date: 2017-11-08 14:51:05
Message-ID: 41eca9c2-e777-9084-8665-d9d6fda582a1@stb-datenservice.de
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-sql

a solution without distinct:

select * from (
select
      date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE)
AS month,
    log.activity_id,
    count(log.entity_id) over (partition by date_trunc('month',
log.start_date::TIMESTAMP WITHOUT TIME ZONE), activity_id),
    count(log.entity_id) over (partition by date_trunc('month',
log.start_date::TIMESTAMP WITHOUT TIME ZONE))
FROM
    log_entry log
order by 1,2) as x
group by 1,2,3,4
order by 1,2;

If one think of introducing new column i prefer using grouping sets and
rollup.
If you do so, an enclosing query will help to filter just the subtotals
you need.

Regards, Martin

Am 08.11.2017 um 14:50 schrieb Andreas Joseph Krogh:
> På onsdag 08. november 2017 kl. 11:17:39, skrev Andreas Joseph Krogh
> <andreas(at)visena(dot)com <mailto:andreas(at)visena(dot)com>>:
>
> Hi all.
> I'm trying to count() all log-entries per activity per month in a
> separate column using count() over(partition by ...) but get an
> error I don't understand.
> [snip]
>
> select date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE)AS month ,log.activity_id
> ,count(log.entity_id)AS num_logs
> ,count(log.entity_id)OVER total_for_month_windowAS total_for_month
> FROM log_entrylog WHERE 1 =1 AND date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE)BETWEEN '2017-01-01'::DATEAND '2017-05-01'::DATE
> GROUP BY month,log.activity_id
> WINDOW total_for_month_windowAS (PARTITION BY date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE))
>
> ORDER BY date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE)ASC ;
>
> Note the missing log.entity_id in GROUP BY.
> but this gives the error:
> [42803] column "log.entity_id" must appear in the GROUP BY clause
> or be used in an aggregate function
> Details
>
> This query gives the desired results:
> SELECT DISTINCT date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE)AS month ,log.activity_id
> ,count(log.entity_id)OVER(partition by date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE),log.activity_id)AS num_logs_per_activity
> ,count(log.entity_id)OVER (partition by date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE))AS total_for_month
> FROM log_entrylog ORDER BY date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE)ASC,log.activity_idASC ;
> month activity_id num_logs_per_activity total_for_month
> 2017-01-01 00:00:00.000000 1 2 4
> 2017-01-01 00:00:00.000000 2 2 4
> 2017-02-01 00:00:00.000000 1 5 6
> 2017-02-01 00:00:00.000000 2 1 6
>
> But I'd like a solution without the DISTINCT, if one exists?
> If I introduce a new column, logged_for, and want to list number of
> logged entries per person per activity I can use this:
> SELECT DISTINCT date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE)AS month ,log.logged_for
> ,log.activity_id
> ,count(log.entity_id)OVER(partition by date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE),log.logged_for,log.activity_id)AS num_logs_per_person_for_activity
> ,count(log.entity_id)OVER (partition by date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE),log.logged_for)AS total_for_person_for_month
> FROM log_entrylog ORDER BY date_trunc('month',log.start_date::TIMESTAMP WITHOUT TIME ZONE)ASC ;
> Which gives:
> month logged_for activity_id num_logs_per_person_for_activity
> total_for_person_for_month
> 2017-01-01 00:00:00.000000 5 1 2 4
> 2017-01-01 00:00:00.000000 5 2 2 4
> 2017-01-01 00:00:00.000000 6 1 2 4
> 2017-01-01 00:00:00.000000 6 2 2 4
> 2017-02-01 00:00:00.000000 5 1 5 6
> 2017-02-01 00:00:00.000000 5 2 1 6
> 2017-02-01 00:00:00.000000 6 1 5 6
> 2017-02-01 00:00:00.000000 6 2 1 6
>
> Is this the recommended way to do this, I mean - having "group by" in
> the "partition by" belonging to the OVER()-clause of the count-aggregates?
> Thanks.
> --
> *Andreas Joseph Krogh*
> CTO / Partner - Visena AS
> Mobile: +47 909 56 963
> andreas(at)visena(dot)com <mailto:andreas(at)visena(dot)com>
> www.visena.com <https://www.visena.com>
> <https://www.visena.com>

--
Widdersdorfer Str. 415, 50933 Köln; Tel. +49 / 221 / 9544 010
HRB Köln HRB 75439, Geschäftsführer: S. Böhland, S. Rosenbauer

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Iaam Onkara 2017-11-13 06:15:32 md5 checksum of a previous row
Previous Message Andreas Joseph Krogh 2017-11-08 13:50:20 Re: Problems with PARTITION BY with count() in window-func