Quick Links

Need to improve performance

From:	Vassili A Akimov <vassili(at)erols(dot)com>
To:	pgsql-sql(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Subject:	Need to improve performance
Date:	2000-06-23 14:27:18
Message-ID:	395373C5.B1C2A566@erols.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general pgsql-sql

Hallo,
My name is Vassili Akimov. I work with postgres, we maintain the main
database for our sponsors on it. One of thier requirements,- perform the

search through the long text field. Sometimes this field even exceeds
32k so we cut off the rest. But search trough this long field is too
slow. (it takes approximately 4 minutes of proccessor time on Intel
pentium II 266MHz). So we were advised to use "fulltextindex".- chunk
this fields on single words and make new table with words and oids in
it. After we made this table its size was 2940360 records. And I tried
to measure the time:

Create table app_fti(string varchar(32),id oid);

fill this table with words and oid.(as it explained in
fulltextindex(update main table))

create index app_fti_idx on app_fti(string,id);
cluster app_fti_idx on app_fti;
vacuum;
vacuum analyze;
select f1.id from app_fti f1, app_fti f2 where f1.string~*'visual' and
f2.string~*'basic' and f1.id=f2.id;

this select takes about the same time that the select we used for
searching through whole original text field. So we tried slightly
different approach:

Create table app_fti(string varchar(32),id oid);
fill this table with words and oid. (as it explained in fulltextindex)
create index app_fti_idx on app_fti(string);
create index app_fti_id_idx on app_fti(id);
cluster app_fti_idx on app_fti;
vacuum;
vacuum analyze;
select f1.id from app_fti f1, app_fti f2 where f1.string~*'visual' and
f2.string~*'basic' and f1.id=f2.id;

this select takes slightly less time but not significant. And if we
would add one more word in search criteria, it would add 1 more minute
to the search time. So we can't use this for our database.
the question is can we do something to make this type of search faster?
we need to make performace of our search at least twice faster than it
is now.
original table and search we do now are:

Table "applicant"
Attribute | Type | Modifier

-----------------------------+-----------+------------------------------------

it also has index on oid;

the average select is:

SELECT CASE
WHEN "other skills"~*'[^a-zA-Z]visual[^a-zA-Z]'
then 1 else 0 end +
CASE WHEN "other skills"~*'[^a-zA-Z]basic[^a-zA-Z]'
then 1 else 0 end +
CASE WHEN "other skills"~*'[^a-zA-Z]web[^a-zA-Z]'
then 1 else 0
end as "count",
"first name", "last name",
"reference number",
"date of applying", "city", "state",
"e-mail", "home phone area code",
"home phone number",
"home phone extension" from "applicant"
WHERE "level">=2 AND
("other skills"~*'(^|[^a-zA-Z])unix($|[^a-zA-Z])'
and "other skills"~*'(^|[^a-zA-Z])oracle($|[^a-zA-Z])'
and "other skills"~*'(^|[^a-zA-Z])database($|[^a-zA-Z])')
order by "count" DESC
LIMIT 100;

this is the samle of our original query. We want to make it faster.
Number of criteria could be more than in this one. Our long field is
"other skills".

So can anybody advise us how we can improve performance of the search?

Thank you
vassili.

Responses

Re: Need to improve performance at 2000-06-23 14:53:41 from Mitch Vincent
RE: Need to improve performance at 2000-06-23 16:17:05 from Andrew Snow

Browse pgsql-general by date

	From	Date	Subject
Next Message	Marc Britten	2000-06-23 14:39:04	RE: boolean isn't boolean?
Previous Message	kaiq	2000-06-23 14:20:41	app-index in www.pgsql.com

Browse pgsql-sql by date

	From	Date	Subject
Next Message	Mitch Vincent	2000-06-23 14:29:59	Orderby two different columns
Previous Message	D'Arcy J.M. Cain	2000-06-23 11:40:25	Re: Merging two columns into one