
Re: Full text indexing performance! (long)

From: "Mitch Vincent" <mitch(at)huntsvilleal(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Full text indexing performance! (long)
Date: 2000-05-29 19:57:20
Message-ID: 005e01bfc9a8$1a3f2da0$0300000a@doot.org
Lists: pgsql-hackers
Hi Tom, thanks for your reply..

I increased BLCKSZ to 32k and re-compiled, then I imported all the resumes
(some of which I couldn't load before) and this problem completely
disappeared. The query is very fast now (0.039792 seconds, to be exact)..

One thing I did run into was this...
In my paging system I only need 10 records at a time, so I LIMIT the query.
The problem comes when I need a total of all the records that matched the
query (as a good search engine should, I must tell people how many records
were found).. I can't count() and LIMIT in the same query, so I'm forced to
run two queries: one with count() and one without.
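The two-query pattern described above can be sketched generically. This is a minimal illustration using Python's built-in sqlite3 as a stand-in for PostgreSQL; the table, columns, and data here are made up for the example, not the real applicants schema:

```python
import sqlite3

# In-memory stand-in for the real database (illustrative schema only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applicants (app_id INTEGER, lastname TEXT)")
conn.executemany("INSERT INTO applicants VALUES (?, ?)",
                 [(i, "name%d" % i) for i in range(25)])

page_size, offset = 10, 0

# Query 1: fetch only the page of rows the user will actually see.
page = conn.execute(
    "SELECT app_id, lastname FROM applicants "
    "ORDER BY app_id LIMIT ? OFFSET ?", (page_size, offset)).fetchall()

# Query 2: the same WHERE clause, but count() over the full result set.
total = conn.execute("SELECT count(app_id) FROM applicants").fetchone()[0]

print(len(page), total)  # 10 rows on this page, 25 matches overall
```

The page query stays cheap because LIMIT stops it early; only the count() query has to walk the whole matching set.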

An example :

select a.appcode, a.firstname, a.middlename, a.lastname, a.state,
       a.degree1, a.d1date, a.degree2, a.d2date, a.salary,
       a.skill1, a.skill2, a.skill3, a.objective, a.employer,
       a.sic1, a.sic2, a.sic3, a.prefs1, a.prefs2, a.sells
from applicants as a, applicants_states as s,
     applicants_resumes as ar, resumes_fti as rf
where a.status = 'A'
  and lower(a.firstname) ~ lower('^a')
  and s.rstate = 'AL'
  and rf.string ~ '^engineer'
  and a.app_id = s.app_id
  and ar.app_id = a.app_id
  and rf.id = ar.oid
limit 10 offset 0

Vs.

select count(a.app_id) as total
from applicants as a, applicants_states as s,
     applicants_resumes as ar, resumes_fti as rf
where a.status = 'A'
  and lower(a.firstname) ~ lower('^a')
  and s.rstate = 'AL'
  and rf.string ~ '^engineer'
  and a.app_id = s.app_id
  and ar.app_id = a.app_id
  and rf.id = ar.oid

The count() query has to go through the entire record set (which makes
sense), but it takes about 4 or 5 seconds.

The plan for the count() query:

NOTICE:  QUERY PLAN:

Aggregate  (cost=56.61..56.61 rows=1 width=20)
  ->  Nested Loop  (cost=0.00..56.61 rows=1 width=20)
        ->  Nested Loop  (cost=0.00..10.74 rows=1 width=16)
              ->  Nested Loop  (cost=0.00..8.59 rows=1 width=12)
                    ->  Index Scan using resumes_fti_index on resumes_fti rf  (cost=0.00..4.97 rows=1 width=4)
                    ->  Index Scan using applicants_resumes_index on applicants_resumes ar  (cost=0.00..3.61 rows=1 width=8)
              ->  Index Scan using applicants_app_id on applicants a  (cost=0.00..2.14 rows=1 width=4)
        ->  Index Scan using applicants_states_app_id on applicants_states s  (cost=0.00..45.86 rows=1 width=4)

And the stats:

ProcessQuery
! system usage stats:
!       5.088647 elapsed 4.954981 user 0.125561 system sec
!       [4.976752 user 0.132817 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       0/4607 [0/4846] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [3/3] messages rcvd/sent
!       0/52 [3/57] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:          0 read,          0 written, buffer hit rate = 100.00%
!       Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!       Direct blocks:          0 read,          0 written
CommitTransactionCommand
proc_exit(0)


The "0/4607 [0/4846] page faults/reclaims" area is greatly increased in this
query that from the other. Is that to be expected? Is there anything else I
can do to get the total number of records matched by the query and still use
LIMIT (I doubt it)?

If there isn't anything I can do, which looks to be the case here, I still
appreciate all the help you've given me..

I look forward to your response. Thanks!

-Mitch

