Re: FTI is really really slow; what am I doing wrong?

From: newsreader(at)mediaone(dot)net
To: "Paul C(dot)" <ulive1x(at)hotmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: FTI is really really slow; what am I doing wrong?
Date: 2001-08-22 15:03:01
Message-ID: 20010822110300.A21186@dragon.universe
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


Did you vacuum after
populating the tables?
If not you should do it

On Wed, Aug 22, 2001 at 11:08:55AM -0400, Paul C. wrote:
> Greetings,
> I am trying to test out the performance of the contrib/fulltextindex
> package and I am getting horrid performance results.
> The Setup:
> I created a simple table, ST (id SERIAL, body varchar(1024), which is to be
> searched. I created the ST_FTI table, trigger and indices as per
> instructions in the FTI readme and C file. To populate the table, I took a
> flat text version of 'War and Peace' I found on the net, broke it up into
> sentences and inserted each sentence into ST as a row. So I have about
> 38,000 sentences and my ST_FTI table is about 2 million rows.
> The Test:
> There is exactly one sentence (row) that has the strings 'Newton' and
> 'Kepler' in it. That is my target. For a straight select on ST:
> select * from st where body ~* 'newton' and body ~* 'kepler';
> the cost is 1100.41
> BUT for an query using the FTI indices:
> select s.* from st s, st_fti f1, st_fti f2 where f1.string
> ~ '^kepler' and f2.string ~ '^newton' and s.oid = f1.id
> and s.oid = f2.id;
> the cost becomes a staggering 80628.92!!! The plans are pasted at the end
> of this message.
> Now, I have all the indices created (on id of st_fti, on string of st_fti
> and on oid of st). I cannot figure out why this is so much worse than the
> straight query. Indeed, the cost to look up a single string in the st_fti
> table is way high:
> select * from st_fti where string ~ '^kepler';
> costs 36703.40, AND its doing a Seq Scan on st_fti, even though an index
> exists.
> What am I doing wrong? Is it the sheer size of the st_fti table that is
> causing problems? Any help would be greatly appreciated.
> Thanks,
> Paul C.
>
> FTI search
> NOTICE: QUERY PLAN:
> Merge Join (cost=80046.91..80628.92 rows=110 width=28)
> -> Sort (cost=41827.54..41827.54 rows=19400 width=24)
> -> Hash Join (cost=1992.80..40216.39 rows=19400 width=24)
> -> Seq Scan on st_fti f2 (cost=0.00..36703.40 rows=19400
> width=4)
> -> Hash (cost=929.94..929.94 rows=34094 width=20)
> -> Seq Scan on st s (cost=0.00..929.94 rows=34094
> width=20)
> -> Sort (cost=38219.37..38219.37 rows=19400 width=4)
> -> Seq Scan on st_fti f1 (cost=0.00..36703.40 rows=19400 width=4)
> EXPLAIN
>
> Plain search:
> NOTICE: QUERY PLAN:
> Seq Scan on st (cost=0.00..1100.41 rows=1 width=16)
> EXPLAIN
>
>
> _________________________________________________________________
> Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)

Browse pgsql-general by date

  From Date Subject
Next Message Bruno Wolff III 2001-08-22 15:08:39 Join questions
Previous Message Tom Lane 2001-08-22 14:34:52 Re: problems transfering databases