Re: group by week (ww), plus -S performance

From: Michael Blakeley <mike(at)blakeley(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: group by week (ww), plus -S performance
Date: 2000-05-29 06:12:48
Message-ID: p04320400b557aa32162c@blakeley.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

At 12:54 AM -0400 5/29/2000, Tom Lane wrote:
>Michael Blakeley <mike(at)blakeley(dot)com> writes:
>> SELECT to_date(date_part('year',stamp),'yyyy')-5+7*date_part('week',stamp)
>> as week,count(*) FROM EVENTS GROUP BY week ORDER BY week;
>
>Seems like the hard way. You are relying on an implicit conversion from
>the float8 output of date_part() to the text input to_date expects,
>which is kind of slow (and IMHO it's a bug in 7.0 that it will do such
>a conversion silently, anyhow). Better to use date_trunc to save the
>conversion step:
>
>SELECT date_trunc('year',stamp)-5+7*date_part('week',stamp)
> as week,count(*) FROM EVENTS GROUP BY week ORDER BY week;

Thanks - that seems to be about 10% faster. I also got a 25% boost by
recompiling PG7 with -O3.

>But the real problem is that this is going to use a start-of-week day
>that is offset five days from whatever day-of-the-week 1 January is.
>If you'd tried a few other years than 2000 you'd likely have been
>dissatisfied with the results...

Here's something that works a bit better, at the expense of falling
in line with 'dow' and starting each week on Sunday, which I don't
mind. I'd only gone with Monday in the first place because that was
what 'ww' seemed to do.

$ cal 12 1999
December 1999
S M Tu W Th F S
1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31

$ cal 1 2000
January 2000
S M Tu W Th F S
1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31

$ cal 1 2001
January 2001
S M Tu W Th F S
1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

$ D="'2000-01-01'"; psql -c "SELECT $D::date-date_part('dow',$D::date);"
?column?
------------
1999-12-26
(1 row)

$ D="'2001-01-01'"; psql -c "SELECT $D::date-date_part('dow',$D::date);"
?column?
------------
2000-12-31
(1 row)

$ D="'2000-01-15'"; psql -c "SELECT $D::date-date_part('dow',$D::date);"
?column?
------------
2000-01-09
(1 row)

$ D="'2000-01-16'"; psql -c "SELECT $D::date-date_part('dow',$D::date);"
?column?
------------
2000-01-16
(1 row)

It's also about 35% faster, and hits me with a healthy "duh!" factor
since it's so obvious and so much cleaner. The total time for my
original query has been cut by more than half - from 70 sec to 28
sec. Thanks!

>It seems like the right answer is that date_trunc('week',stamp) ought
>to do what you want, but it doesn't seem to be implemented. That's
>definitely a shortcoming --- anyone want to fix it?

You know what they say - if you want the right answer, just post the
wrong answer and wait to be corrected :-).

If you can't find fault with the query above, maybe it could be
plugged into date_trunc('week', stamp) as-is? I haven't even looked
at the source code, myself....

-- Mike

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Erich 2000-05-29 06:39:28 Re: group by week (ww), plus -S performance
Previous Message Tom Lane 2000-05-29 04:54:13 Re: group by week (ww), plus -S performance