Skip site navigation (1) Skip section navigation (2)

Re: Postgres vs. DSpam

From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: Andrew McMillan <andrew(at)catalyst(dot)net(dot)nz>
Cc: Evilio del Rio <edelrio(at)cmima(dot)csic(dot)es>,dspam-users(at)networkdweebs(dot)com, pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres vs. DSpam
Date: 2004-11-29 21:50:56
Message-ID: 20041129215056.GZ41545@decibel.org (view raw or flat)
Thread:
Lists: pgsql-performance
FWIW, those queries won't be able to use an index. A better WHERE clause
would be:

AND last_hit < CURRENT_DATE - 60

On Fri, Nov 26, 2004 at 02:37:12PM +1300, Andrew McMillan wrote:
> On Wed, 2004-11-24 at 14:14 +0100, Evilio del Rio wrote:
> > Hi,
> > 
> > I have installed the dspam filter
> > (http://www.nuclearelephant.com/projects/dspam) on our mail server
> > (RedHat 7.3 Linux with sendmail 8.13 and procmail). I have ~300 users
> > with a quite low traffic of 4000 messages/day. So it's a quite common
> > platform/environment, nothing spectacular.
> 
> I am using DSpam with PostgreSQL here.  I have a daily job that cleans
> the DSpam database up, as follows:
> 
> DELETE FROM dspam_token_data
>   WHERE (innocent_hits*2) + spam_hits < 5
>   AND CURRENT_DATE - last_hit > 60;
> 
> DELETE FROM dspam_token_data
>   WHERE innocent_hits = 1
>   AND CURRENT_DATE - last_hit > 30;
> 
> DELETE FROM dspam_token_data
>   WHERE CURRENT_DATE - last_hit > 180;
> 
> DELETE FROM dspam_signature_data
>   WHERE CURRENT_DATE - created_on > 14;
> 
> VACUUM dspam_token_data;
> 
> VACUUM dspam_signature_data;
> 
> 
> 
> I also occasionally do a "VACUUM FULL ANALYZE;" on the database as well.
> 
> 
> In all honesty though, I think that MySQL is better suited to DSpam than
> PostgreSQL is.
> 
> 
> > Please, could anyone explain me this difference?
> > Is Postgres that bad?
> > Is MySQL that good?
> > Am I the only one to have observed this behavior?
> 
> I believe that what DSpam does that is not well-catered for in the way
> PostgreSQL operates, is that it does very frequent updates to rows in
> (eventually) quite large tables.  In PostgreSQL the UPDATE will result
> internally in a new record being written, with the old record being
> marked as deleted.  That old record won't be re-used until after a
> VACUUM has run, and this means that the on-disk tables will have a lot
> of dead rows in them quite quickly.
> 
> The reason that PostgreSQL operates this way, is a direct result of the
> way transactional support is implemented, and it may well change in a
> version or two.  It's got better over the last few versions, with things
> like pg_autovacuum, but that approach still doesn't suit some types of
> database updating.
> 
> Cheers,
> 					Andrew.
> -------------------------------------------------------------------------
> Andrew @ Catalyst .Net .NZ  Ltd,  PO Box 11-053, Manners St,  Wellington
> WEB: http://catalyst.net.nz/            PHYS: Level 2, 150-154 Willis St
> DDI: +64(4)803-2201      MOB: +64(272)DEBIAN      OFFICE: +64(4)499-2267
>        These PRESERVES should be FORCE-FED to PENTAGON OFFICIALS!!
> -------------------------------------------------------------------------
> 



-- 
Jim C. Nasby, Database Consultant               decibel(at)decibel(dot)org 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

In response to

pgsql-performance by date

Next:From: Dmitry KarasikDate: 2004-11-30 13:30:37
Subject: VACUUM ANALYZE downgrades performance
Previous:From: Christopher BrowneDate: 2004-11-27 18:43:15
Subject: Re: [dspam-users] Postgres vs. MySQL

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group