full text search to_tsquery performance with ispell dictionary

From: Stanislav Raskin <raskin(at)livn(dot)de>
To: <pgsql-general(at)postgresql(dot)org>
Subject: full text search to_tsquery performance with ispell dictionary
Date: 2011-05-11 11:19:38
Message-ID: C9F03D6A.202C0%raskin@livn.de
Lists: pgsql-general

Hello everybody,

I have been experimenting with the FTS feature on PostgreSQL 8.3.4 lately and
ran into a weird performance issue when using a custom FTS configuration.

I use this German ispell dictionary, re-encoded to UTF-8:

http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz

With the following configuration:

CREATE TEXT SEARCH CONFIGURATION public.german_de (COPY = pg_catalog.german);

CREATE TEXT SEARCH DICTIONARY german_de_ispell (
    TEMPLATE = ispell,
    DictFile = german_de_utf8,
    AffFile = german_de_utf8,
    StopWords = german_de_utf8
);

ALTER TEXT SEARCH CONFIGURATION german_de
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH german_de_ispell, german_stem;

So far so good. Indexing and creation of tsvectors works like a charm.

The problem is that if I open a new connection to the database and run
something like this

SELECT to_tsquery('german_de', 'abcd');

the query takes A LOT of time to complete the first time: about 1 to 1.5 s.
If I submit the same query a second, third, or fourth time on the same
connection, it takes only some 10-20 ms, which is what I would expect.
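
For reference, the measurement can be reproduced roughly as follows. This is only a sketch, assuming psql is the client; \timing is a psql meta-command that toggles the display of elapsed time, and the figures in the comments are the ones observed above, not guaranteed results.

```sql
-- Run in a freshly opened psql session against the database
-- that contains the german_de configuration.

-- Toggle display of elapsed time per statement (off by default).
\timing

-- First call on this connection: the slow one (about 1 to 1.5 s here).
SELECT to_tsquery('german_de', 'abcd');

-- Same statement again on the same connection: 10-20 ms here.
SELECT to_tsquery('german_de', 'abcd');
```

Reconnecting and repeating the two statements shows the same pattern again.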

It almost seems as if the dictionary is somehow analyzed or indexed and the
results cached for each connection, which seems counter-intuitive to me.
After all, the dictionaries should not change that often.
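
One way to test this hypothesis might be to call the dictionaries directly with ts_lexize on a fresh connection, bypassing the parser and the rest of the configuration. This is a sketch under the assumption that the ispell dictionary is the slow part; if so, the first ts_lexize call against german_de_ispell should show the delay and the call against the built-in german_stem should not.

```sql
-- Run in a freshly opened psql session, before any other FTS call.

-- Toggle display of elapsed time per statement (off by default).
\timing

-- Exercise only the ispell dictionary defined above.
SELECT ts_lexize('german_de_ispell', 'abcd');

-- Exercise only the built-in Snowball stemmer, for comparison.
SELECT ts_lexize('german_stem', 'abcd');

-- Repeat the ispell call to see whether the delay occurs only once.
SELECT ts_lexize('german_de_ispell', 'abcd');
```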

Did I miss something or did I do something wrong?

I'd be thankful for any advice.

Kind Regards

--

Stanislav Raskin

livn GmbH
Campus Freudenberg
Rainer-Gruenter-Str. 21
42119 Wuppertal

+49(0)202-8 50 66 921
raskin(at)livn(dot)de
http://www.livn.de

livn
local individual video news GmbH
Registergericht Wuppertal HRB 20086

Geschäftsführer:
Dr. Stefan Brües
Alexander Jacob