Tsearch2 Initial Search Speed

From: Howard Cole <howardnews(at)selestial(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Tsearch2 Initial Search Speed
Date: 2008-06-16 17:55:43
Message-ID: 4856A91F.1020001@selestial.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi, I am looking to improve the initial query speed for the following query:

select email_id from email, to_tsquery('default','example') as q where
q@@fts;

This is running on 8.2.4 on Windows Server 2K3.

The initial output from explain analyse is as follows.

"Nested Loop (cost=8.45..76.70 rows=18 width=8) (actual
time=5776.347..27364.248 rows=14938 loops=1)"
" -> Function Scan on q (cost=0.00..0.01 rows=1 width=32) (actual
time=0.023..0.024 rows=1 loops=1)"
" -> Bitmap Heap Scan on email (cost=8.45..76.46 rows=18 width=322)
(actual time=5776.314..27353.344 rows=14938 loops=1)"
" Filter: (q.q @@ email.fts)"
" -> Bitmap Index Scan on email_fts_index (cost=0.00..8.44
rows=18 width=0) (actual time=5763.355..5763.355 rows=15118 loops=1)"
" Index Cond: (q.q @@ email.fts)"
"Total runtime: 27369.091 ms"

Subsequent output is considerably faster. (I am guessing that is because
email_fts_index is cached.

"Nested Loop (cost=8.45..76.70 rows=18 width=8) (actual
time=29.241..264.712 rows=14938 loops=1)"
" -> Function Scan on q (cost=0.00..0.01 rows=1 width=32) (actual
time=0.008..0.010 rows=1 loops=1)"
" -> Bitmap Heap Scan on email (cost=8.45..76.46 rows=18 width=322)
(actual time=29.224..256.135 rows=14938 loops=1)"
" Filter: (q.q @@ email.fts)"
" -> Bitmap Index Scan on email_fts_index (cost=0.00..8.44
rows=18 width=0) (actual time=28.344..28.344 rows=15118 loops=1)"
" Index Cond: (q.q @@ email.fts)"
"Total runtime: 268.663 ms"

The table contains text derived from emails and therefore its contents
and the searches can vary wildly.

Table construction as follows:

CREATE TABLE email
(
email_id bigint NOT NULL DEFAULT
nextval(('public.email_email_id_seq'::text)::regclass),
send_to text NOT NULL DEFAULT ''::text,
reply_from character varying(100) NOT NULL DEFAULT ''::character varying,
cc text NOT NULL DEFAULT ''::text,
bcc text NOT NULL DEFAULT ''::text,
subject text NOT NULL DEFAULT ''::text,
"content" text NOT NULL DEFAULT ''::text,
time_tx_rx timestamp without time zone NOT NULL DEFAULT now(),
fts tsvector,
CONSTRAINT email_pkey PRIMARY KEY (email_id),
)
WITH (OIDS=FALSE);

-- Index: email_fts_index

CREATE INDEX email_fts_index
ON email
USING gist
(fts);

CREATE INDEX email_mailbox_id_idx
ON email
USING btree
(mailbox_id);

-- Trigger: fts_trigger on email
CREATE TRIGGER fts_trigger
BEFORE INSERT OR UPDATE
ON email
FOR EACH ROW
EXECUTE PROCEDURE tsearch2('fts', 'send_to', 'reply_from', 'cc',
'content', 'subject');

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Alan Hodgson 2008-06-16 18:24:57 Re: Tsearch2 Initial Search Speed
Previous Message luke.78@libero.it 2008-06-16 09:06:44 function difference(geometry,geometry) is SLOW!