Re: Tsearch2 Initial Search Speed

From: Howard Cole <howardnews(at)selestial(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Tsearch2 Initial Search Speed
Date: 2008-06-18 10:40:11
Message-ID: 4858E60B.5070300@selestial.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


>
> Actually, the index returns page numbers in the table on disc which
> may contain one or more rows that are relevant. Postgres has to fetch
> the whole row to find out the email_id and any other information,
> including whether the row is visible in your current transaction
> (concurrency control complicates it all). Just having a page number
> isn't much use to you!
>
> Matthew
>
Out of interest, if I could create a multicolumn index with both the
primary key and the fts key (I don't think I can create a multi-column
index using GIST with both the email_id and the fts field), would this
reduce access to the table due to the primary key being part of the index?

More importantly, are there other ways that I can improve performance on
this? I am guessing that a lot of the problem is that the email table is
so big. If I cut out some of the text fields that are not needed in the
search and put them in another table, presumably the size of the table
will be reduced to a point where it will reduce the number of disk hits
and speed the query up.

So I could split the table into two parts:

create table email_part2 (
email_id int8 references email_part1 (email_id),
fts ...,
email_directory_id ...,
)

create table email_part1(
email_id serial8 primary key,
cc text,
bcc text,
...
)

and the query will be
select email_id from email_part2 where to_tsquery('default', 'howard')
@@ fts;

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Matthew Wakeling 2008-06-18 12:38:42 Re: Tsearch2 Initial Search Speed
Previous Message Jon D 2008-06-18 01:14:58 Re: Which hardware ?