Large Text Search Help

From: psql-mail(at)freeuk(dot)com
To: pgsql-performance(at)postgresql(dot)org
Subject: Large Text Search Help
Date: 2003-10-08 15:48:17
Message-ID: E1A7GXx-0007J5-PM@buckaroo.freeuk.net
Lists: pgsql-performance

Hi,
I am trying to design a large text search database.

It will have upwards of 6 million documents, along with metadata on
each.

I am currently looking at tsearch2 to provide fast text searching and
also playing around with different hardware configurations.

1. With tsearch2 I get very good query times until I insert more
records. For example, with 100,000 records a tsearch2 query returns in
around 6 seconds; with 200,000 records it takes just under a minute.
Is this due to the indices fitting entirely in memory with 100,000
records, but not with 200,000?

2. As well as whole-word matching, I also need to be able to do
substring matching. Is the FTI module the way to approach this?

3. I have just begun to look into distributed queries. Is there an
existing solution for distributing a PostgreSQL database among
multiple servers, so that each has the same schema but only a subset
of the total data?

Any other helpful comments or suggestions on how to improve query times
using different hardware or software techniques would be appreciated.

Thanks,

Mat
