Re: how many record versions

From: Greg Stark <gsstark(at)mit(dot)edu>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: how many record versions
Date: 2004-05-23 21:36:07
Message-ID: 87vfimq114.fsf@stark.xeocode.com
Lists: pgsql-general


David Garamond <lists(at)zara(dot)6(dot)isreserved(dot)com> writes:

> Actually, each record will be incremented probably only thousands of times a
> day. But there are many banners. Each record has a (bannerid, campaignid,
> websiteid, date, countrycode) "dimensions" and (impression, click) "measures".

In the past, when I had a very similar situation, we kept the raw impression
and click event data, i.e., one record per impression in the impression table
and one record per click in the click table.

That makes the tables insert-only, which is efficient and not prone to locking
contention. They never have to be vacuumed except after purging old data.
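
A minimal sketch of that insert-only layout. It uses Python's built-in sqlite3
module only so the example is self-contained and runnable; the setup described
here was on a real database server. The table and column names are assumptions
taken from the dimensions quoted above, not the actual schema, and
record_event() is a hypothetical helper.

```python
import sqlite3

# Hypothetical schema: one row per raw event, never updated in place.
SCHEMA = """
create table impressions (
    bannerid    integer not null,
    campaignid  integer not null,
    websiteid   integer not null,
    countrycode text    not null,
    date        text    not null   -- event timestamp, 'YYYY-MM-DD HH:MM:SS'
);
create table clicks (
    bannerid    integer not null,
    campaignid  integer not null,
    websiteid   integer not null,
    countrycode text    not null,
    date        text    not null   -- event timestamp, 'YYYY-MM-DD HH:MM:SS'
);
"""

def open_db():
    """Create an in-memory database with the two raw event tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def record_event(conn, table, bannerid, campaignid, websiteid, countrycode, date):
    """Append one raw event. Only INSERTs are issued, so no existing row
    is ever rewritten."""
    if table not in ("impressions", "clicks"):
        raise ValueError("unknown event table: " + table)
    conn.execute(
        "insert into " + table
        + " (bannerid, campaignid, websiteid, countrycode, date)"
        " values (?, ?, ?, ?, ?)",
        (bannerid, campaignid, websiteid, countrycode, date),
    )
```

Each impression or click becomes one call to record_event(); nothing ever
issues an UPDATE against these tables.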

Then, to accelerate queries, we had denormalized aggregate tables populated by
a cron job that did the equivalent of:

insert into agg_clicks (
  select count(*), bannerid
    from clicks
   where date >= ? and date < ?
   group by bannerid
)

The ?s were the start and end of an hourly period. I.e., at 12:15 it ran this
query for the 11-12 period.
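
A rough sketch of what such an hourly job could look like, again using
Python's sqlite3 module just to keep it runnable on its own. The agg_clicks
columns (period_start, bannerid, clicks) and the helper names are assumptions
for illustration. The sketch uses a half-open window (>= start, < end), a
choice made here so that a click stamped exactly on the hour is not counted in
two periods.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

def previous_hour(now):
    """Return (start, end) of the last complete hour before `now`."""
    end = now.replace(minute=0, second=0, microsecond=0)
    return end - timedelta(hours=1), end

def rollup_clicks(conn, now):
    """Aggregate the previous full hour of raw clicks into agg_clicks."""
    start, end = previous_hour(now)
    # Half-open window [start, end): the end bound itself is excluded.
    conn.execute(
        "insert into agg_clicks (period_start, bannerid, clicks) "
        "select ?, bannerid, count(*) from clicks "
        "where date >= ? and date < ? group by bannerid",
        (start.strftime(FMT), start.strftime(FMT), end.strftime(FMT)),
    )
    conn.commit()

# Tiny demonstration with made-up data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table clicks (bannerid integer, date text);
create table agg_clicks (period_start text, bannerid integer, clicks integer);
""")
conn.executemany("insert into clicks values (?, ?)", [
    (1, "2004-05-23 11:05:00"),
    (1, "2004-05-23 11:59:59"),
    (2, "2004-05-23 11:30:00"),
    (1, "2004-05-23 12:00:00"),  # falls in the 12-13 period, not 11-12
])
rollup_clicks(conn, datetime(2004, 5, 23, 12, 15))  # the "12:15" cron run
rows = conn.execute(
    "select period_start, bannerid, clicks from agg_clicks order by bannerid"
).fetchall()
```

Running the job a little after the hour, as with the 12:15 run above, leaves
time for late inserts from the closed period to land before they are counted.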

This meant we didn't have immediately up-to-date stats on banners, but we did
have data on every single impression and click, including the time and
information about the user.

This worked out very well for reporting needs. If your system is using the
data to handle serving the ads, though, it's a different kettle of fish. For
that I think you'll want something that avoids having to do a database query
for every single impression.

--
greg
