Got one of those tricky SQL questions that someone else might be able to
come up with a really slick answer. I'll be as brief as I can be, but
it's a bit long. The actual questions are at the bottom... RTFM would
be lovely if pointers are given. I've searched the archives and docs
and come up blank...
My company provides games and interactive content to internet publishers
on an ASP basis. This means we've got the raw data to produce
interesting usage statistics.
One of the things that would be quite interesting would be to take, on
an hourly basis, various of certain tables both total and hourly, and
then graph these. I've got an acceptable little java applet that graphcs
things in a quite easy way. It's not terrifically powerful, but it's
not big, either.
A little background: when a customer puts our game on their site, it's
generally for a period of time, called a 'promotion'. This promotion is
broken into 'rounds'. Each round may have 1..n 'games' in it...and
these games can change between rounds.
So, to find out the total usage of all games for a given promotion,
group by promotion_id;
Okay so far. But perhaps we want to see the count of individual games:
group by promotion_id,game_id;
Again, good enough as it goes. Now we get tricky...we want to know the
number of times each game is played on an hourly basis.
promotion_id, game_id, date_trunc('hour',when_inserted),
group by promotion_id,game_id,date_trunc('hour',when_inserted);
Now this will produce something like this:
promotion_id | game_id | date_trunc | count
38 | 0 | 2001-11-28 09:00:00+00 | 84
38 | 0 | 2001-11-28 12:00:00+00 | 372
38 | 0 | 2001-11-28 13:00:00+00 | 320
Is the use of 'date_trunc' 3 times as in the above SQL not horribly
inefficient? Is there a way to only have to do it once in the query?
The REAL problem is in the rows returned not having all the hours
represented. Note that nobody apparently played game 0 between 10 and
11:59 that day. when the simple java graph applet I mentioned comes
along, it's not smart enough to fill in the hours that aren't there.
So, we need to provide those blank hours somehow. I can see several
ways, but was hoping there's a really slick SQL method of dealing with
Method 0 (smarts in wrong place) Change the graphic applet so that it
understand date/time stamps, and fills in blanks somehow. I don't like
this one, because the applet now has to know about the format of what
time looks like, instead of just copying the rows blindly to the page.
Method 1 (brute force) - The servlet that outputs the page gets the rows
that do exist, figures out which ones don't exist and inserts them in
the output stream with 0's and passes the lot to the applet. I don't
like this one, because the servlet now has to know about the format of
what time looks like, instead of just copying the rows blindly to the
Method 2 (overkill) - Create a table called 'time' with a row for each
hour on the clock. When creating the rows, do a left join on this table
so that hours that don't have any plays would still be represented. I'm
leery about doing left joins like this, as they are tricky, and can be
killers on the database.
Is there a slick use of 'between', 'interval' or some other SQL-ism (or
PostgreSQL-ism) that will give me in effect a left join on all the hours
from start::timestamp to end::timestamp?
Ken Corey CTO http://www.atomic-interactive.com 07720 440 731
pgsql-novice by date
|Next:||From: Rory Campbell-Lange||Date: 2002-07-22 15:20:35|
|Subject: Re: use vi mode in psql|
|Previous:||From: Nigel J. Andrews||Date: 2002-07-22 13:00:47|
|Subject: Re: URGENT! pg_dump doesn't work!|