Re: Tuning tsearch searching and indexing

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Eric Walstad <eric(at)ericwalstad(dot)com>
Cc: sfpug(at)postgresql(dot)org
Subject: Re: Tuning tsearch searching and indexing
Date: 2009-02-11 00:53:29
Message-ID: 49922189.5040300@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: sfpug

Eric,

> Our 'big' table is now at ~15.7M records, it holds address data with a
> few other fields, too. We were using Xapian to index the address
> data. Our users need to search the address data frequently. We want
> to try using tsearch over Xapian for various reasons but we're having
> difficulties getting the indexes built. For example, after running
> for about 30hours, the indexing operation failed when a scheduled
> vacuum started:
>
> er-ca=# UPDATE er_service_address
> er-ca-# SET search_tsv=to_tsvector(
> er-ca(# 'default',
> er-ca(# COALESCE(service_account, '') ||' '||
> er-ca(# COALESCE(customer_name, '') ||' '||
> er-ca(# COALESCE(addr1, '') ||' '||
> er-ca(# COALESCE(city, '') ||' '||
> er-ca(# COALESCE(zipcode, '') ||' '||
> er-ca(# COALESCE(service_type, '') ||' '||
> er-ca(# COALESCE(account_type, '')
> er-ca(# )
> er-ca-# ;

Well, first, you're going to become an early adopter of 8.4 because of
the multi-column TSearch indexes.

Second, I'd strongly suggest that you put the concatenated column for
the TSearch index into another table, er_service_address_fts. This both
would make the above operation much faster, and would allow you to adopt
asynchronous updating of the FTI if you find that the cost of updating
it makes updates to the primary data too slow.

--Josh

In response to

Responses

Browse sfpug by date

  From Date Subject
Next Message Eric Walstad 2009-02-11 01:54:15 Re: Tuning tsearch searching and indexing
Previous Message Steve Crawford 2009-02-10 19:00:43 Anyone need a ride?