Re: Performance problems testing with Spamassassin 3.1.0

From: Matthew Schumacher <matt(dot)s(at)aptalaska(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance problems testing with Spamassassin 3.1.0
Date: 2005-07-29 00:13:19
Message-ID: 42E9749F.6000709@aptalaska.net
Lists: pgsql-performance

Karim Nassar wrote:
> On Wed, 2005-07-27 at 14:35 -0800, Matthew Schumacher wrote:
>
>
>>I put the rest of the schema up at
>>http://www.aptalaska.net/~matt.s/bayes/bayes_pg.sql in case someone
>>needs to see it too.
>
>
> Do you have sample data too?
>

Ok, I finally got some test data together so that others can test
without installing SA.

The schema and test dataset are over at
http://www.aptalaska.net/~matt.s/bayes/bayesBenchmark.tar.gz

I have a pretty fast machine with a tuned postgres, and it takes about
2 minutes 30 seconds to load the test data. Since the test data is the
bayes information on 616 spam messages, that comes out to be about 250ms
per message. While that is doable, it does add quite a bit of overhead
to the email system.
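
As a quick sanity check on that figure, the per-message cost can be
worked out from the two numbers quoted (2 minutes 30 seconds, 616
messages). The sketch below is only that arithmetic; the function and
variable names are made up for illustration and are not part of
SpamAssassin or the benchmark tarball.

```python
# Back-of-the-envelope check of the per-message load cost.
# Both inputs come from the post; the names are hypothetical.
TOTAL_LOAD_SECONDS = 2 * 60 + 30   # "2 minutes 30 seconds"
MESSAGE_COUNT = 616                # spam messages in the test dataset


def per_message_ms(total_seconds: float, messages: int) -> float:
    """Return the average load time per message, in milliseconds."""
    return total_seconds * 1000.0 / messages


avg_ms = per_message_ms(TOTAL_LOAD_SECONDS, MESSAGE_COUNT)
print(f"{avg_ms:.1f} ms per message")  # prints "243.5 ms per message"
```

So the measured average is a little under the rounded 250ms.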

Perhaps this is as fast as I can expect it to go. If that's the case I
may have to look at mysql, but I really don't want to do that.

I will be working on some other benchmarks, and reading through exactly
how bayes works, but at least there is some data to play with.

schu
