Re: Database-based alternatives to tsearch2?

From: Richard Huxton <dev(at)archonet(dot)com>
To: Wes <wespvp(at)syntegra(dot)com>
Cc: pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database-based alternatives to tsearch2?
Date: 2006-12-12 20:24:16
Message-ID: 457F0FF0.3090205@archonet.com
Lists: pgsql-general

Wes wrote:
>
> Indexes are too fragile. Our documents will be offline, and re-indexing
> would be impossible. Additionally, as I understand it, tsearch2 doesn't
> scale to the numbers I need (hundreds of millions of documents).

Jeff's right about tsvector - sounds like it's what you're looking for.

If you're worried about reindexing costs, perhaps look at partitioning the
table, or using partial indexes (so you could have multiple indexes for
each table, based on (id mod 100) or some such).
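
As a rough sketch of the partial-index idea, the script below generates one
CREATE INDEX statement per (id mod 100) bucket. The table name "docs", the
tsvector column "fts", the index naming scheme and the choice of a GiST
index are all assumptions made for illustration, not anything from Wes's
schema.

```python
def partial_index_ddl(table="docs", column="fts", buckets=100):
    """Return one CREATE INDEX statement per (id % buckets) bucket.

    table, column and the index names are hypothetical placeholders.
    """
    statements = []
    for bucket in range(buckets):
        statements.append(
            f"CREATE INDEX {table}_{column}_b{bucket:02d} "
            f"ON {table} USING gist ({column}) "
            f"WHERE (id % {buckets}) = {bucket};"
        )
    return statements


if __name__ == "__main__":
    # Show the first few statements; feed the full list to psql to apply.
    for stmt in partial_index_ddl()[:3]:
        print(stmt)
```

Each index then covers roughly 1/100th of the rows, so any one of them can
be rebuilt on its own. Note that the planner will only consider a partial
index when the query's WHERE clause implies the index predicate, so queries
would need to carry the matching (id % 100) = N condition.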

Obviously, partitioning over multiple machines is usually quite do-able
for this sort of task too.

> Is anyone aware of any such solutions for PostgreSQL, open source or
> otherwise?

Without wishing to discourage a potential large user from PG, it might
be worth checking if Google/Yahoo/etc have a non-relational server that
meets your needs off-the-shelf.

--
Richard Huxton
Archonet Ltd
