Re: how many record versions

From: David Garamond <lists(at)zara(dot)6(dot)isreserved(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: how many record versions
Date: 2004-05-24 06:03:54
Message-ID: 40B1904A.3090307@zara.6.isreserved.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Greg Stark wrote:
>>Actually, each record will be incremented probably only thousands of times a
>>day. But there are many banners. Each record has a (bannerid, campaignid,
>>websiteid, date, countrycode) "dimensions" and (impression, click) "measures".
>
> In the past when I had a very similar situation we kept the raw impression and
> click event data. Ie, one record per impression in the impression table and
> one record per click in the click data.

> That makes the tables insert-only which is efficient and not prone to locking
> contention. They would never have to be vacuumed except after purging old data.

Assuming there are 10 millions of impressions per day, the impression
table will grow at least 200-400MB per day, is that correct? What do you
do and how often do you purge old data? Do you do a mass DELETE on the
impression table itself or do you switch to another table? I've found
that deleting tens/hundreds of thousands of row, at least in InnoDB,
takes long, long time (plus it sucks CPU and slows other queries).

> Then to accelerate queries we had denormalized aggregate tables with a cron
> job that did the equivalent of
>
> insert into agg_clicks (
> select count(*),bannerid
> from clicks
> where date between ? and ?
> group by bannerid
> )

--
dave

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Marc Slemko 2004-05-24 06:23:28 Re: how many record versions
Previous Message David Garamond 2004-05-24 05:58:28 Re: how many record versions