From: | "Luke Lonergan" <llonergan(at)greenplum(dot)com>
---|---
To: | "Ulrich Wisser" <ulrich(dot)wisser(at)relevanttraffic(dot)se>, pgsql-performance(at)postgresql(dot)org
Cc: | "Nicholas E(dot) Wakefield" <nwakefield(at)KineticNetworks(dot)com>, "Barry Klawans" <bklawans(at)jaspersoft(dot)com>, "Daria Hutchinson" <dhutchinson(at)greenplum(dot)com>
Subject: | Re: Need for speed 3
Date: | 2005-09-01 16:37:53
Message-ID: | BF3C7C71.EBC5%llonergan@greenplum.com
Lists: | pgsql-performance |
Ulrich,
On 9/1/05 6:25 AM, "Ulrich Wisser" <ulrich(dot)wisser(at)relevanttraffic(dot)se> wrote:
> My application basically imports Apache log files into a Postgres
> database. Every row in the log file gets imported in one of three (raw
> data) tables. My columns are exactly as in the log file. The import is
> run approx. every five minutes. We import about two million rows a month.
Bizgres Clickstream does this job using an ETL (extract transform and load)
process to transform the weblogs into an optimized schema for reporting.
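The raw-table import Ulrich describes, one column per log field, amounts to splitting each Apache combined-log line into its fields before loading. A minimal sketch in Python (the field names and the sample line are illustrative, not taken from the original post):

```python
import re

# Apache "combined" log format: host, identity, user, [timestamp],
# "request", status, bytes (the referer/user-agent fields are omitted here).
LOG_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_line(line):
    """Split one log line into the columns of a raw-data table row."""
    m = LOG_RE.match(line)
    if m is None:
        return None  # malformed line; real imports would log and skip it
    row = m.groupdict()
    row["bytes"] = 0 if row["bytes"] == "-" else int(row["bytes"])
    return row

sample = ('127.0.0.1 - frank [10/Oct/2005:13:55:36 +0200] '
          '"GET /index.html HTTP/1.0" 200 2326')
print(parse_line(sample))
```

Each parsed dict maps directly onto one row of the raw table, so a batch of lines can be accumulated and handed to a single bulk insert every five minutes.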
> After every import the data from the current day is deleted from the
> reporting table and recalculated from the raw data table.
This is something the optimized ETL in Bizgres Clickstream also does well.
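The delete-and-recalculate step can be kept cheap by restricting both the DELETE and the rebuilding aggregate to the current day, inside one transaction so readers never see a half-built day. A sketch of the pattern, using Python's built-in sqlite3 as a stand-in for Postgres (the table and column names are hypothetical, not from the original post):

```python
import sqlite3

# Stand-in schema: a raw log table and a per-day reporting table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_hits (day TEXT, url TEXT);
CREATE TABLE report   (day TEXT, url TEXT, hits INTEGER);
""")

def refresh_day(conn, day):
    """Drop the current day's report rows and rebuild them from raw data."""
    with conn:  # one transaction: delete + rebuild commit atomically
        conn.execute("DELETE FROM report WHERE day = ?", (day,))
        conn.execute(
            "INSERT INTO report (day, url, hits) "
            "SELECT day, url, COUNT(*) FROM raw_hits "
            "WHERE day = ? GROUP BY day, url",
            (day,),
        )

conn.executemany("INSERT INTO raw_hits VALUES (?, ?)",
                 [("2005-09-01", "/a"), ("2005-09-01", "/a"),
                  ("2005-09-01", "/b")])
refresh_day(conn, "2005-09-01")
print(conn.execute("SELECT url, hits FROM report ORDER BY url").fetchall())
# → [('/a', 2), ('/b', 1)]
```

On Postgres the same shape works with `BEGIN`/`COMMIT`, and a date index on both tables keeps the per-run cost proportional to one day's data rather than the whole history.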
> What do you think of this approach? Are there better ways to do it? Is
> there some literature you recommend reading?
I recommend the Bizgres Clickstream docs; you can get them from Bizgres CVS, and a live HTML link will be on the website shortly.
Bizgres is free - it also improves COPY performance by almost 2x, among
other enhancements.
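For bulk loads like this, COPY takes tab-separated rows on stdin rather than per-row INSERTs, which is where the speedup matters. A small helper that formats Python values into COPY's text format (escaping per the PostgreSQL COPY documentation: backslash first, then tab/newline/carriage-return, with `\N` as the null marker); the function name is illustrative:

```python
def copy_text_row(values):
    """Format one row for PostgreSQL's COPY ... FROM STDIN (text format)."""
    out = []
    for v in values:
        if v is None:
            out.append(r"\N")            # COPY's default null marker
            continue
        s = str(v)
        s = s.replace("\\", "\\\\")      # escape backslashes first
        s = (s.replace("\t", "\\t")
              .replace("\n", "\\n")
              .replace("\r", "\\r"))
        out.append(s)
    return "\t".join(out) + "\n"

print(repr(copy_text_row(["127.0.0.1", None, 200])))
# → '127.0.0.1\t\\N\t200\n'
```

Feeding a batch of such lines to `COPY rawtable FROM STDIN` (e.g. via psql or a driver's copy API) replaces thousands of INSERT round trips with one streamed load.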
- Luke