From: | Max Baker <max(at)warped(dot)org> |
---|---|
To: | Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> |
Cc: | Greg Stark <gsstark(at)mit(dot)edu>, Joe Conway <mail(at)joeconway(dot)com>, Manfred Koizar <mkoi-pg(at)aon(dot)at>, PostgreSQL Performance Mailing List <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: speeding up COUNT and DISTINCT queries |
Date: | 2003-03-13 20:22:54 |
Message-ID: | 20030313202254.GD30411@warped.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Thu, Mar 13, 2003 at 03:05:30PM -0500, Robert Treat wrote:
> On Thu, 2003-03-13 at 10:42, Greg Stark wrote:
> > Max Baker <max(at)warped(dot)org> writes:
> > > On Wed, Mar 12, 2003 at 05:57:50PM -0800, Joe Conway wrote:
> > > That would explain why once a night isn't enough. Thanks.
> > > The contents of this table get refreshed every 4 hours. I'll add a
> > > vacuum after every refresh and comapre the results in a couple days.
> >
> > If it gets completely refreshed, ie, every tuple is updated or deleted and
> > re-inserted in a big batch job then VACUUM might never be enough without
> > boosting some config values a lot. You might need to do a VACUUM FULL after
> > the refresh. VACUUM FULL locks the table though which might be unfortunate.
I'm not starting with fresh data every time, I'm usually checking for
an existing record, then setting a timestamp and a boolean flag.
I've run some profiling and it's about 8000-10,000 UPDATEs every 4
hours. These are accompanied by about 800-1000 INSERTs.
> hmm... approx 35,000 records, getting updated every 4 hours. so..
>
> 35000 / (4*60) =~ 145 tuples per minute.
>
> Lets assume we want to keep any overhead at 10% or less, so we need to
> lazy vacuum every 3500 updates. so...
>
> 3500 tuples / 145 tpm =~ 25 minutes.
>
> So, set up a cron job to lazy vacuum every 20 minutes and see how that
> works for you.
I'm now having VACUUM ANALYZE run after each of these updates. The data
comes in in spurts -- a 90 minute batch job that runs every 4 hours.
thanks folks,
-m
From | Date | Subject | |
---|---|---|---|
Next Message | Hannu Krosing | 2003-03-13 22:26:20 | Re: Postgresql performance |
Previous Message | Robert Treat | 2003-03-13 20:05:30 | Re: speeding up COUNT and DISTINCT queries |