Re: Performance problems testing with Spamassassin 3.1.0

From: Gavin Sherry <swm(at)alcove(dot)com(dot)au>
To: Matthew Schumacher <matt(dot)s(at)aptalaska(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance problems testing with Spamassassin 3.1.0
Date: 2005-07-29 03:57:15
Message-ID: Pine.LNX.4.58.0507291352460.10626@linuxworld.com.au
Lists: pgsql-performance

On Thu, 28 Jul 2005, Matthew Schumacher wrote:

> Karim Nassar wrote:
> > On Wed, 2005-07-27 at 14:35 -0800, Matthew Schumacher wrote:
> >
> >
> >>I put the rest of the schema up at
> >>http://www.aptalaska.net/~matt.s/bayes/bayes_pg.sql in case someone
> >>needs to see it too.
> >
> >
> > Do you have sample data too?
> >
>
> Ok, I finally got some test data together so that others can test
> without installing SA.
>
> The schema and test dataset are over at
> http://www.aptalaska.net/~matt.s/bayes/bayesBenchmark.tar.gz
>
> I have a pretty fast machine with a tuned postgres and it takes it about
> 2 minutes 30 seconds to load the test data. Since the test data is the
> bayes information on 616 spam messages that comes out to be about 250ms
> per message. While that is doable, it does add quite a bit of overhead
> to the email system.

I had a look at your data -- thanks.

I have a question though: put_token() is invoked 120596 times in your
benchmark... for 616 messages. That's nearly 200 queries (not even
counting the 1-8 (??) inside the function itself) per message. Something
doesn't seem right there....
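
For anyone who wants to check the arithmetic, here is a rough sketch in Python using only the figures quoted in this thread (120596 calls, 616 messages, a load time of 2 minutes 30 seconds). The variable and function names are made up for illustration, and the per-call time assumes the whole load is spent in put_token(), which may well not be true.

```python
# Figures quoted in this thread; all names here are illustrative only.
PUT_TOKEN_CALLS = 120596      # put_token() invocations in the benchmark
MESSAGES = 616                # spam messages in the test data
LOAD_SECONDS = 2 * 60 + 30    # reported load time: 2 minutes 30 seconds


def per_message(total, messages=MESSAGES):
    """Average of some total quantity over the messages in the benchmark."""
    return total / messages


# Average number of put_token() calls issued for each message.
calls_per_message = per_message(PUT_TOKEN_CALLS)

# Average wall-clock time per message, in milliseconds.
ms_per_message = per_message(LOAD_SECONDS * 1000)

# Assumption: the entire load time is attributed to put_token() calls.
ms_per_call = LOAD_SECONDS * 1000 / PUT_TOKEN_CALLS

print(f"{calls_per_message:.1f} put_token() calls per message")
print(f"{ms_per_message:.1f} ms per message")
print(f"{ms_per_call:.2f} ms per put_token() call")
```

So each call is cheap on its own; it is the number of calls per message that adds up.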

Gavin
