Re: monthly tally of new memberships

From: brian <brian(at)zijn-digital(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: monthly tally of new memberships
Date: 2007-07-18 19:54:20
Message-ID: 469E6FEC.2030709@zijn-digital.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Alvaro Herrera wrote:
> brian wrote:
>
>>Michael Glaesemann wrote:
>>
>>>On Jul 18, 2007, at 13:29 , brian wrote:
>>>
>>>>This returns the new memberships for each day, ignoring days
>>>>without any. What i'd like to do though, is to select only the 1st
>>>>of each month, summing the new memberships or that month, eg:
>>>>month | applications 2006-05-01 | 57 2006-06-01
>>>>| 36 2006-07-01 | 72
>>>
>>>Try something like this:
>>>SELECT date_trunc('month', applied)::date AS date_applied , count(id)
>>>AS applications FROM member GROUP BY applied ORDER BY date_applied
>>>ASC;
>>>Note I remove the WHERE applied = applied, as this is just identity.
>>
>>Thanks, but that isn't it. I've tried that exact query, actually. The
>>problem with that is it doesn't give me one row for the entire month.
>>Instead, i get one row for each day there was a new membership, only the
>>date_applied column has been changed to the 1st of that particular month.
>>eg:
>
>
> Hum, you should be grouping by date_applied (also known as "group by 1"
> because you can't use column aliases in GROUP BY)
>

Right, that works, also.

I compared this to Jon Sime's suggestion:

test=# EXPLAIN ANALYZE SELECT date_trunc('month', applied)::date AS
date_applied, count(id) AS applications FROM member GROUP BY 1 ORDER BY
date_applied ASC;

QUERY PLAN
-------------------------------------------------------
Sort (cost=140.76..141.26 rows=200 width=8) (actual time=17.590..17.622
rows=18 loops=1)
Sort Key: (date_trunc('month'::text, (applied)::timestamp with time
zone))::date
-> HashAggregate (cost=129.12..133.12 rows=200 width=8) (actual
time=17.478..17.523 rows=18 loops=1)
-> Seq Scan on member (cost=0.00..123.76 rows=1072 width=8)
(actual time=0.035..10.684 rows=1072 loops=1)
Total runtime: 17.733 ms
(5 rows)

test=# EXPLAIN ANALYZE SELECT to_char(applied, 'yyyy-mm') AS
month_applied, count(id) AS applications FROM member GROUP BY
to_char(applied, 'yyyy-mm') ORDER BY 1 ASC;

QUERY PLAN
-----------------------------------------------------
Sort (cost=137.58..138.08 rows=200 width=8) (actual
time=13.415..13.458 rows=18 loops=1)
Sort Key: to_char((applied)::timestamp with time zone, 'yyyy-mm'::text)
-> HashAggregate (cost=126.44..129.94 rows=200 width=8) (actual
time=13.273..13.314 rows=18 loops=1)
-> Seq Scan on member (cost=0.00..121.08 rows=1072 width=8)
(actual time=0.042..10.525 rows=1072 loops=1)
Total runtime: 13.564 ms
(5 rows)

But, getting back to your comment, i see that this (grouping by the
alias) also works:

test=# EXPLAIN ANALYZE SELECT to_char(applied, 'yyyy-mm') AS
month_applied, count(id) AS applications FROM member GROUP BY
month_applied ORDER BY 1 ASC;

QUERY PLAN
-----------------------------------------------------
Sort (cost=137.58..138.08 rows=200 width=8) (actual
time=44.329..44.363 rows=18 loops=1)
Sort Key: to_char((applied)::timestamp with time zone, 'yyyy-mm'::text)
-> HashAggregate (cost=126.44..129.94 rows=200 width=8) (actual
time=44.190..44.229 rows=18 loops=1)
-> Seq Scan on member (cost=0.00..121.08 rows=1072 width=8)
(actual time=0.042..41.242 rows=1072 loops=1)
Total runtime: 44.477 ms
(5 rows)

But it seems to take longer at the cost of keeping the query tidy.

brian

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Pat Maddox 2007-07-18 20:16:19 Need help optimizing this query
Previous Message Greg Sabino Mullane 2007-07-18 19:34:26 Re: DBI/DBD::Pg and transactions