Skip site navigation (1) Skip section navigation (2)

Tsearch2 Initial Search Speed

From: Howard Cole <howardnews(at)selestial(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Tsearch2 Initial Search Speed
Date: 2008-06-16 17:55:43
Message-ID: 4856A91F.1020001@selestial.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi, I am looking to improve the initial query speed for the following query:

select email_id from email, to_tsquery('default','example') as q where 
q@@fts;

This is running on 8.2.4 on Windows Server 2K3.

The initial output from explain analyse is as follows.

"Nested Loop  (cost=8.45..76.70 rows=18 width=8) (actual 
time=5776.347..27364.248 rows=14938 loops=1)"
"  ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual 
time=0.023..0.024 rows=1 loops=1)"
"  ->  Bitmap Heap Scan on email  (cost=8.45..76.46 rows=18 width=322) 
(actual time=5776.314..27353.344 rows=14938 loops=1)"
"        Filter: (q.q @@ email.fts)"
"        ->  Bitmap Index Scan on email_fts_index  (cost=0.00..8.44 
rows=18 width=0) (actual time=5763.355..5763.355 rows=15118 loops=1)"
"              Index Cond: (q.q @@ email.fts)"
"Total runtime: 27369.091 ms"

Subsequent output is considerably faster. (I am guessing that is because 
email_fts_index is cached.

"Nested Loop  (cost=8.45..76.70 rows=18 width=8) (actual 
time=29.241..264.712 rows=14938 loops=1)"
"  ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual 
time=0.008..0.010 rows=1 loops=1)"
"  ->  Bitmap Heap Scan on email  (cost=8.45..76.46 rows=18 width=322) 
(actual time=29.224..256.135 rows=14938 loops=1)"
"        Filter: (q.q @@ email.fts)"
"        ->  Bitmap Index Scan on email_fts_index  (cost=0.00..8.44 
rows=18 width=0) (actual time=28.344..28.344 rows=15118 loops=1)"
"              Index Cond: (q.q @@ email.fts)"
"Total runtime: 268.663 ms"

The table contains text derived from emails and therefore its contents 
and the searches can vary wildly.

Table construction as follows:

CREATE TABLE email
(
  email_id bigint NOT NULL DEFAULT 
nextval(('public.email_email_id_seq'::text)::regclass),
  send_to text NOT NULL DEFAULT ''::text,
  reply_from character varying(100) NOT NULL DEFAULT ''::character varying,
  cc text NOT NULL DEFAULT ''::text,
  bcc text NOT NULL DEFAULT ''::text,
  subject text NOT NULL DEFAULT ''::text,
  "content" text NOT NULL DEFAULT ''::text,
  time_tx_rx timestamp without time zone NOT NULL DEFAULT now(),
  fts tsvector,
  CONSTRAINT email_pkey PRIMARY KEY (email_id),
)
WITH (OIDS=FALSE);

-- Index: email_fts_index

CREATE INDEX email_fts_index
  ON email
  USING gist
  (fts);

CREATE INDEX email_mailbox_id_idx
  ON email
  USING btree
  (mailbox_id);


-- Trigger: fts_trigger on email
CREATE TRIGGER fts_trigger
  BEFORE INSERT OR UPDATE
  ON email
  FOR EACH ROW
  EXECUTE PROCEDURE tsearch2('fts', 'send_to', 'reply_from', 'cc', 
'content', 'subject');


Responses

pgsql-performance by date

Next:From: Alan HodgsonDate: 2008-06-16 18:24:57
Subject: Re: Tsearch2 Initial Search Speed
Previous:From: luke.78@libero.itDate: 2008-06-16 09:06:44
Subject: function difference(geometry,geometry) is SLOW!

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group