Re: Efficient Searching of Large Text Fields

From: "Aaron Bono" <postgresql(at)aranya(dot)com>
To: "Rod Taylor" <pg(at)rbt(dot)ca>
Cc: PostgreSQL <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Efficient Searching of Large Text Fields
Date: 2006-06-13 21:04:11
Message-ID: bf05e51c0606131404r6105ccch88e206f8492a1811@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

I will look that tsearch (at . It appears their 8.1.x version is
still in development and I use PostgreSQL 8.1.3 but it is worth trying
- I'm not in a hurry for that feature anyway.

I also looked at PHPBB a little - it appears their database stores
words but the code is so difficult to dig through I was not sure about
their implementation or even what they used it for. Would it be worth
the work to save the text into a separate searchable table that kept
individual words and word counts or would that be more work and eat up
more space than it is worth? You could actually index the words that
way and get much quicker searches. Then again, as I read through
tsearch, it may make this approach unnecessary...

I have also seen what looks like people using search results tables
that, after a search is performed, save a list of the results. For
example, if I were doing a search of a forum, I could save the search
in a table like this:

forum_topic
forum_topic_id (PK)
forum_topic_name
etc...

forum_topic_search
forum_topic_search_id (PK)
forum_topic_search_dt
forum_topic_search_desc

forum_topic_search_results
forum_topic_search_results_id (PK)
forum_topic_search_id (FK)
sort_index (int to tell us the order the results are returned in)
forum_topic_id (FK)

This way you can allow users to page through the results without
having to constantly research or cache the results somewhere in
memory.

Has anyone tried an approach like this?

When do you clean these search tables out? They could get quite large
after a while.

Thanks!
Aaron

On 6/13/06, Rod Taylor <pg(at)rbt(dot)ca> wrote:
> It won't help at all. Fast partial matches against arrays is nearly
> impossible. You might take a look at tsearch though.

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Rommel the iCeMAn 2006-06-13 23:19:33 Error with "limit" clause
Previous Message PFC 2006-06-13 20:52:39 Re: Efficient Searching of Large Text Fields