Re: Postgres vs. DSpam

From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: Andrew McMillan <andrew(at)catalyst(dot)net(dot)nz>
Cc: Evilio del Rio <edelrio(at)cmima(dot)csic(dot)es>, dspam-users(at)networkdweebs(dot)com, pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres vs. DSpam
Date: 2004-11-29 21:50:56
Message-ID: 20041129215056.GZ41545@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

FWIW, those queries won't be able to use an index. A better WHERE clause
would be:

AND last_hit < CURRENT_DATE - 60

On Fri, Nov 26, 2004 at 02:37:12PM +1300, Andrew McMillan wrote:
> On Wed, 2004-11-24 at 14:14 +0100, Evilio del Rio wrote:
> > Hi,
> >
> > I have installed the dspam filter
> > (http://www.nuclearelephant.com/projects/dspam) on our mail server
> > (RedHat 7.3 Linux with sendmail 8.13 and procmail). I have ~300 users
> > with a quite low traffic of 4000 messages/day. So it's a quite common
> > platform/environment, nothing spectacular.
>
> I am using DSpam with PostgreSQL here. I have a daily job that cleans
> the DSpam database up, as follows:
>
> DELETE FROM dspam_token_data
> WHERE (innocent_hits*2) + spam_hits < 5
> AND CURRENT_DATE - last_hit > 60;
>
> DELETE FROM dspam_token_data
> WHERE innocent_hits = 1
> AND CURRENT_DATE - last_hit > 30;
>
> DELETE FROM dspam_token_data
> WHERE CURRENT_DATE - last_hit > 180;
>
> DELETE FROM dspam_signature_data
> WHERE CURRENT_DATE - created_on > 14;
>
> VACUUM dspam_token_data;
>
> VACUUM dspam_signature_data;
>
>
>
> I also occasionally do a "VACUUM FULL ANALYZE;" on the database as well.
>
>
> In all honesty though, I think that MySQL is better suited to DSpam than
> PostgreSQL is.
>
>
> > Please, could anyone explain me this difference?
> > Is Postgres that bad?
> > Is MySQL that good?
> > Am I the only one to have observed this behavior?
>
> I believe that what DSpam does that is not well-catered for in the way
> PostgreSQL operates, is that it does very frequent updates to rows in
> (eventually) quite large tables. In PostgreSQL the UPDATE will result
> internally in a new record being written, with the old record being
> marked as deleted. That old record won't be re-used until after a
> VACUUM has run, and this means that the on-disk tables will have a lot
> of dead rows in them quite quickly.
>
> The reason that PostgreSQL operates this way, is a direct result of the
> way transactional support is implemented, and it may well change in a
> version or two. It's got better over the last few versions, with things
> like pg_autovacuum, but that approach still doesn't suit some types of
> database updating.
>
> Cheers,
> Andrew.
> -------------------------------------------------------------------------
> Andrew @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington
> WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St
> DDI: +64(4)803-2201 MOB: +64(272)DEBIAN OFFICE: +64(4)499-2267
> These PRESERVES should be FORCE-FED to PENTAGON OFFICIALS!!
> -------------------------------------------------------------------------
>

--
Jim C. Nasby, Database Consultant decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Dmitry Karasik 2004-11-30 13:30:37 VACUUM ANALYZE downgrades performance
Previous Message Christopher Browne 2004-11-27 18:43:15 Re: [dspam-users] Postgres vs. MySQL