Skip site navigation (1) Skip section navigation (2)

Re: Performance problems testing with Spamassassin 3.1.0

From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: Matthew Schumacher <matt(dot)s(at)aptalaska(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance problems testing with Spamassassin 3.1.0
Date: 2005-07-31 17:10:12
Message-ID: 20050731171012.GS60019@decibel.org (view raw or flat)
Thread:
Lists: pgsql-performance
On Sun, Jul 31, 2005 at 08:51:06AM -0800, Matthew Schumacher wrote:
> Ok, here is the current plan.
> 
> Change the spamassassin API to pass a hash of tokens into the storage
> module, pass the tokens to the proc as an array, start a transaction,
> load the tokens into a temp table using copy, select the tokens distinct
> into the token table for new tokens, update the token table for known
> tokens, then commit.

You might consider:
UPDATE tokens
    FROM temp_table (this updates existing records)

INSERT INTO tokens
    SELECT ...
    FROM temp_table
    WHERE NOT IN (SELECT ... FROM tokens)

This way you don't do an update to newly inserted tokens, which helps
keep vacuuming needs in check.

> This solves the following problems:
> 
> 1.  Each email is a transaction instead of each token.
> 2.  The update statement is only called when we really need an update
> which avoids all of those searches.
> 3.  The looping work is done inside the proc instead of perl calling a
> method a zillion times per email.
> 
> I'm not sure how vacuuming will be done yet, if we vacuum once per email
> that may be too often, so I may do that every 5 mins in cron.

I would suggest leaving an option to have SA vacuum every n emails,
since some people may not want to mess with cron, etc. I suspect that
pg_autovacuum would be able to keep up with things pretty well, though.
-- 
Jim C. Nasby, Database Consultant               decibel(at)decibel(dot)org 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

In response to

Responses

pgsql-performance by date

Next:From: Andreas PflugDate: 2005-07-31 18:19:11
Subject: Re: Performance problems testing with Spamassassin 3.1.0
Previous:From: Matthew SchumacherDate: 2005-07-31 16:51:06
Subject: Re: Performance problems testing with Spamassassin 3.1.0

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group