Re: tsearch2 poor performance

From: Kris Kiger <kris(at)musicrebellion(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: tsearch2 poor performance
Date: 2004-09-24 21:59:10
Message-ID: 415498AE.40000@musicrebellion.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Here is the explain analyze output, funny thing, after I ran josh's
query, mine ran a lot faster....maybe it forced a caching?;

search_test=# explain analyze select count(*) from product where vector
@@ to_tsquery('oil');
Aggregate (cost=6113.09..6113.09 rows=1 width=0) (actual
time=19643.372..19643.376 rows=1 loops=1)
-> Index Scan using vector_idx on product (cost=0.00..6105.58
rows=3001 width=0) (actual time=0.381..18145.917 rows=226357 loops=1)
Index Cond: (vector @@ '\'oil\''::tsquery)
Filter: (vector @@ '\'oil\''::tsquery)
Total runtime: 19643.597 ms

search_test=# explain analyze select count(*) from product where vector
@@ to_tsquery('hydrogen');
Aggregate (cost=6113.09..6113.09 rows=1 width=0) (actual
time=19629.766..19629.769 rows=1 loops=1)
-> Index Scan using vector_idx on product (cost=0.00..6105.58
rows=3001 width=0) (actual time=0.378..18127.573 rows=226868 loops=1)
Index Cond: (vector @@ '\'hydrogen\''::tsquery)
Filter: (vector @@ '\'hydrogen\''::tsquery)
Total runtime: 19629.992 ms

Here is Josh's;

search_test=# explain analyze SELECT count(q) FROM product,
to_tsquery('oil') AS q WHERE vector @@ q;
Aggregate (cost=6150597.03..6150597.03 rows=1 width=32) (actual
time=21769.526..21769.530 rows=1 loops=1)
-> Nested Loop (cost=0.00..6143097.02 rows=3000001 width=32)
(actual time=0.424..20450.208 rows=226357 loops=1)
-> Function Scan on q (cost=0.00..12.50 rows=1000 width=32)
(actual time=0.023..0.031 rows=1 loops=1)
-> Index Scan using vector_idx on product (cost=0.00..6105.58
rows=3000 width=32) (actual time=0.376..18165.415 rows=226357 loops=1)
Index Cond: (product.vector @@ "outer".q)
Filter: (product.vector @@ "outer".q)
Total runtime: 21769.786 ms

Disabling Index usage slowed it down:

search_test=# explain analyze select count(*) from product where vector
@@ to_tsquery('hydrogen');
Aggregate (cost=347259.51..347259.51 rows=1 width=0) (actual
time=24675.933..24675.936 rows=1 loops=1)
-> Seq Scan on product (cost=0.00..347252.00 rows=3001 width=0)
(actual time=0.320..23164.492 rows=226868 loops=1)
Filter: (vector @@ '\'hydrogen\''::tsquery)
Total runtime: 24676.091 ms

Time: 24678.842 ms

search_test=# explain analyze SELECT count(q) FROM product,
to_tsquery('oil') AS q WHERE vector @@ q;
Aggregate (cost=67847264.50..67847264.50 rows=1 width=32) (actual
time=83631.201..83631.204 rows=1 loops=1)
-> Nested Loop (cost=12.50..67839764.50 rows=3000001 width=32)
(actual time=0.214..82294.710 rows=226357 loops=1)
Join Filter: ("outer".vector @@ "inner".q)
-> Seq Scan on product (cost=0.00..339752.00 rows=3000000
width=32) (actual time=0.107..27563.952 rows=3000000 loops=1)
-> Materialize (cost=12.50..22.50 rows=1000 width=32) (actual
time=0.003..0.006 rows=1 loops=3000000)
-> Function Scan on q (cost=0.00..12.50 rows=1000
width=32) (actual time=0.019..0.023 rows=1 loops=1)
Total runtime: 83631.385 ms

Here are the results of stat:

search_test=# select * from stat('select vector from product') order by
ndoc desc, nentry;
word | ndoc | nentry
--------------+---------+---------
anoth | 1187386 | 1477442
bear | 696668 | 780963
take | 675319 | 736410
relat | 491469 | 528259
toward | 490653 | 528369
defin | 490572 | 527099
live | 490538 | 527401
beyond | 490124 | 527957
behind | 490087 | 527735
insid | 489530 | 527074
near | 489504 | 527721
around | 489244 | 526870
mean | 478201 | 512699
complex | 440339 | 468669
light | 438685 | 468140
ball | 438567 | 468168
pit | 438293 | 467807
dress | 438128 | 467260
player | 437633 | 466753
secret | 433279 | 457246
love | 423777 | 442694
give | 423691 | 441305
need | 423336 | 434409
peopl | 423336 | 434409
believ | 423336 | 434409
rememb | 423336 | 434409
howev | 421762 | 434194
real | 419906 | 435074
furthermor | 416672 | 434413
indic | 416508 | 434919
exampl | 416508 | 434919
alway | 415543 | 432861
sometim | 415543 | 432861
see | 410706 | 434586
inde | 408379 | 434283
fruit | 363203 | 381862
cook | 362674 | 381112
graduat | 362444 | 381284
chees | 362358 | 381040
hesit | 307431 | 317550
self | 301001 | 312312
hard | 300138 | 310167
spirit | 299310 | 312092
know | 298246 | 309010
laugh | 294136 | 302392
make | 287633 | 295003
find | 287550 | 294770
goe | 279336 | 287025
team | 228000 | 234703
footbal | 228000 | 234703
void | 227914 | 234681
formless | 227914 | 234681
board | 227907 | 234797
chess | 227907 | 234797
submarin | 227869 | 234727
inferior | 227858 | 234357
viper | 227855 | 234865
cylind | 227847 | 234505
suit | 227822 | 234376
class | 227822 | 234376
action | 227822 | 234376
diskett | 227802 | 234786
roller | 227792 | 234524
coaster | 227792 | 234524
mate | 227785 | 234431
ritual | 227785 | 234431
engin | 227784 | 234575
steam | 227784 | 234575
industri | 227780 | 234312
fire | 227775 | 234532
hydrant | 227775 | 234532
briar | 227769 | 234524
patch | 227769 | 234524
mastadon | 227677 | 234665
defend | 227617 | 234410
blade | 227603 | 234356
razor | 227603 | 234356
cab | 227578 | 234554
driver | 227578 | 234554
cough | 227570 | 234324
syrup | 227570 | 234324
cowboy | 227566 | 234663
chop | 227564 | 234437
pork | 227564 | 234437
ceo | 227557 | 234760
rattlesnak | 227554 | 234323
hell | 227540 | 234313
flavor | 227540 | 234313
maelstrom | 227537 | 234404
mulch | 227531 | 234311
cyprus | 227531 | 234311
tack | 227525 | 234462
carpet | 227525 | 234462
movi | 227505 | 234207
theater | 227505 | 234207
spider | 227466 | 234524
cone | 227463 | 234198
pine | 227463 | 234198
garbag | 227459 | 234207
beer | 227443 | 234077
bottl | 227443 | 234077
polygon | 227438 | 234267
judg | 227425 | 234565
blith | 227409 | 233979
traffic | 227403 | 234051
paper | 227397 | 234028
napkin | 227397 | 234028
apart | 227393 | 233911
build | 227393 | 233911
cocker | 227368 | 233926
spaniel | 227368 | 233926
bay | 227358 | 234261
cargo | 227358 | 234261
order | 227357 | 233885
short | 227357 | 233885
polar | 227326 | 234118
demon | 227324 | 234442
minivan | 227317 | 234292
bulb | 227314 | 234089
fundrais | 227308 | 234235
eggplant | 227306 | 234202
cake | 227299 | 234075
bowl | 227299 | 234110
paycheck | 227295 | 234224
sheriff | 227292 | 234313
turkey | 227271 | 234267
turn | 227265 | 234210
signal | 227265 | 234210
chestnut | 227250 | 234104
hole | 227239 | 233975
puncher | 227239 | 233975
tabloid | 227238 | 234341
microscop | 227236 | 234067
reclin | 227234 | 233946
dolphin | 227231 | 234080
pen | 227222 | 234269
pig | 227222 | 234269
wed | 227221 | 233860
bullfrog | 227211 | 234144
truck | 227208 | 233980
pickup | 227208 | 233980
agent | 227201 | 233840
insur | 227201 | 233840
girl | 227201 | 233934
scout | 227201 | 233934
drill | 227200 | 233986
power | 227200 | 233986
ocean | 227187 | 234211
case | 227173 | 233983
crank | 227173 | 233983
squid | 227169 | 234056
senat | 227167 | 234147
fraction | 227161 | 234065
custom | 227152 | 234128
burglar | 227148 | 234014
grizzli | 227133 | 233955
wheel | 227122 | 233813
asteroid | 227108 | 233928
anomali | 227106 | 234156
acceler | 227103 | 233428
particl | 227103 | 233428
saw | 227082 | 233934
chain | 227082 | 233934
reactor | 227035 | 234061
wedg | 227033 | 234143
photon | 227029 | 234017
deficit | 227029 | 234102
vacuum | 227021 | 233760
cleaner | 227021 | 233760
cashier | 227010 | 233858
scyth | 227001 | 233928
cloud | 226981 | 233569
format | 226981 | 233569
tornado | 226968 | 234058
grand | 226936 | 233730
piano | 226936 | 233730
tripod | 226930 | 233755
tomato | 226928 | 233915
sandwich | 226923 | 233786
earring | 226912 | 233665
train | 226912 | 233712
freight | 226912 | 233712
skyscrap | 226901 | 233755
abstract | 226890 | 233658
mortician | 226883 | 233781
warranti | 226876 | 233935
atom | 226868 | 233467
hydrogen | 226868 | 233467
satellit | 226866 | 233680
corpor | 226858 | 233818
globul | 226853 | 233980
cow | 226832 | 233808
jersey | 226832 | 233808
salad | 226830 | 233400
buzzard | 226804 | 233825
lot | 226794 | 233643
park | 226794 | 233643
prime | 226793 | 233325
minist | 226793 | 233325
clot | 226780 | 233380
blood | 226780 | 233380
tuba | 226765 | 233575
tape | 226749 | 233388
record | 226749 | 233388
line | 226747 | 233574
dancer | 226747 | 233574
nation | 226736 | 233796
bartend | 226653 | 233422
hockey | 226645 | 233178
canyon | 226617 | 233699
ski | 226610 | 233451
lodg | 226610 | 233451
stovepip | 226608 | 233489
crane | 226590 | 233578
sand | 226572 | 233270
grain | 226572 | 233270
dust | 226570 | 233391
bunni | 226570 | 233391
lover | 226564 | 233628
fairi | 226554 | 233743
plaintiff | 226537 | 233563
wheelbarrow | 226520 | 233206
food | 226445 | 233228
stamp | 226445 | 233228
umbrella | 226380 | 233273
avocado | 226375 | 232942
oil | 226357 | 233266
filter | 226357 | 233266
financi | 220105 | 225116
complet | 162829 | 164065
ridicul | 162346 | 163592
handl | 162200 | 163390
singl | 162200 | 163390
single-handl | 162200 | 163390
greedili | 162123 | 163379
careless | 162009 | 163193
somewhat | 161979 | 163205
accur | 161975 | 163228
overwhelm | 161946 | 163107
usual | 161930 | 163158
ostens | 161826 | 163020
lazili | 161809 | 163133
slyli | 161803 | 163149
underhand | 161751 | 162955
non | 161585 | 162823
chalant | 161585 | 162823
non-chal | 161585 | 162823
seldom | 161525 | 162739
accident | 161511 | 162676
almost | 161508 | 162782
often | 161488 | 162733
bare | 161401 | 162659
eager | 161278 | 162513
wise | 161073 | 162341
inexor | 161042 | 162265
feverish | 160805 | 162020
thorough | 160611 | 161823
home | 154672 | 155766
return | 154672 | 155766
lost | 154655 | 155567
glori | 154655 | 155567
start | 154655 | 155567
reminisc | 154655 | 155567
rumin | 154577 | 155776
read | 154529 | 155642
magazin | 154529 | 155642
pray | 154478 | 155748
floor | 154396 | 155477
sweep | 154396 | 155477
nag | 154271 | 155259
feel | 154271 | 155259
remors | 154271 | 155259
procrastin | 154256 | 155371
wake | 154220 | 155397
sleep | 154217 | 155353
panic | 154189 | 155346
get | 154168 | 155253
drunk | 154168 | 155253
stink | 154168 | 155253
hibern | 154158 | 155358
die | 153973 | 155223
fli | 153943 | 155056
rage | 153943 | 155056
flagel | 153916 | 155067
self-flagel | 153916 | 155067
daydream | 153864 | 155043
medit | 153816 | 154935
ceas | 153735 | 154815
exist | 153735 | 154815
joy | 153672 | 154754
beam | 153672 | 154754
trembl | 153656 | 154799
loud | 153635 | 154665
hide | 153592 | 154797
break | 153559 | 154599
coffe | 153559 | 154599
earn | 153538 | 154540
mile | 153538 | 154540
flier | 153538 | 154540
frequent | 153538 | 154540
leav | 153535 | 154730
rejoic | 153226 | 154412
sell | 147231 | 148103
plan | 147046 | 147809
escap | 147046 | 147809
throw | 146973 | 147764
negoti | 146905 | 147704
prenupti | 146905 | 147704
agreement | 146905 | 147704
card | 146892 | 147731
trade | 146892 | 147731
basebal | 146892 | 147731
oper | 146888 | 147787
small | 146888 | 147787
stand | 146888 | 147787
drink | 146881 | 147727
night | 146881 | 147727
steal | 146835 | 147847
pencil | 146835 | 147847
seek | 146816 | 148029
figur | 146801 | 147908
write | 146736 | 147720
letter | 146736 | 147720
recogn | 146723 | 147823
truce | 146684 | 147630
eat | 146670 | 147874
compet | 146647 | 147760
buy | 146642 | 147522
gift | 146642 | 147522
expens | 146642 | 147522
big | 146626 | 147717
fan | 146626 | 147717
fall | 146597 | 147601
assist | 146587 | 147589
requir | 146587 | 147589
chang | 146542 | 147479
heart | 146542 | 147479
conquer | 146542 | 147695
money | 146481 | 147450
borrow | 146481 | 147450
ignor | 146475 | 147643
share | 146415 | 147404
shower | 146415 | 147404
fault | 146413 | 147361
subtl | 146413 | 147361
kind | 146402 | 147492
great | 146397 | 147367
upon | 146396 | 147366
honor | 146396 | 147366
bestow | 146396 | 147366
pee | 146394 | 147477
avoid | 146392 | 147388
contact | 146392 | 147388
pink | 146372 | 147347
slip | 146372 | 147347
aid | 146367 | 147225
teach | 146366 | 147516
sanit | 146361 | 147477
lice | 146360 | 147409
buri | 146360 | 147483
cold | 146357 | 147220
pour | 146357 | 147220
freez | 146357 | 147220
water | 146357 | 147220
sea | 146347 | 147217
deep | 146347 | 147217
fish | 146347 | 147217
organ | 146321 | 147476
grit | 146289 | 147227
satiat | 146251 | 147349
assimil | 146251 | 147377
tri | 146188 | 147200
seduc | 146188 | 147200
reach | 146132 | 147008
understand | 146132 | 147008
brainwash | 146068 | 147158
admir | 146050 | 147021
caricatur | 145989 | 147107
deriv | 145941 | 146790
pervers | 145941 | 146790
satisfact | 145941 | 146790
moral | 145854 | 146733
lectur | 145854 | 146733
befriend | 145799 | 146963
learn | 145758 | 146666
lesson | 145758 | 146666
play | 145738 | 146706
pinochl | 145738 | 146706
peek | 145698 | 146737
danc | 145555 | 146637
fashion | 78762 | 79203
muddi | 78750 | 79236
hypnot | 78747 | 79204
childlik | 78579 | 79002
loyal | 78575 | 79056
mysteri | 78554 | 79047
annoy | 78532 | 79032
slow | 78517 | 78996
twist | 78515 | 79016
unstabl | 78510 | 78945
feder | 78501 | 78967
rever | 78501 | 79008
wrinkl | 78495 | 78965
rude | 78495 | 78975
boil | 78493 | 78972
high | 78481 | 78940
paid | 78481 | 78940
geosynchron | 78478 | 78931
greasi | 78476 | 78961
cosmopolitan | 78459 | 78903
fat | 78438 | 78935
inciner | 78429 | 78896
dot | 78426 | 78864
polka | 78426 | 78864
polka-dot | 78426 | 78864
outer | 78415 | 78910
phoni | 78411 | 78895
pathet | 78405 | 78869
purpl | 78405 | 78895
frozen | 78403 | 78886
nearest | 78396 | 78879
statesmanlik | 78386 | 78830
dirt | 78376 | 78828
encrust | 78376 | 78828
dirt-encrust | 78376 | 78828
sur | 78371 | 78895
obsequi | 78369 | 78805
salti | 78360 | 78834
imagin | 78356 | 78808
south | 78325 | 78787
american | 78325 | 78787
load | 78318 | 78832
righteous | 78282 | 78760
fractur | 78281 | 78737
educ | 78278 | 78682
colleg | 78278 | 78682
college-educ | 78278 | 78682
mitochondri | 78269 | 78745
treacher | 78265 | 78697
spartan | 78252 | 78707
felin | 78244 | 78713
ravish | 78242 | 78765
patern | 78241 | 78701
psychot | 78238 | 78693
shabbi | 78228 | 78685
dreamlik | 78224 | 78642
loath | 78221 | 78653
self-loath | 78221 | 78653
world | 78203 | 78658
call | 78183 | 78610
so-cal | 78183 | 78610
radioact | 78182 | 78623
alleg | 78178 | 78664
cantanker | 78159 | 78620
makeshift | 78159 | 78648
gentl | 78156 | 78609
fri | 78143 | 78648
linguist | 78141 | 78586
overrip | 78134 | 78572
varig | 78132 | 78609
vapor | 78105 | 78548
impromptu | 78104 | 78569
actual | 78104 | 78592
self-actu | 78104 | 78592
frighten | 78100 | 78544
molten | 78100 | 78567
gratifi | 78098 | 78528
bur | 78094 | 78563
hairi | 78092 | 78563
foreign | 78083 | 78569
tatter | 78050 | 78518
frustrat | 78044 | 78474
stoic | 78036 | 78503
eurasian | 78033 | 78513
proverbi | 78031 | 78519
green | 78024 | 78450
skinni | 78023 | 78524
familiar | 78016 | 78477
optim | 78006 | 78483
bohemian | 78002 | 78500
overpr | 77983 | 78411
pompous | 77955 | 78460
difficult | 77938 | 78375
raspi | 77924 | 78461
soggi | 77912 | 78381
resplend | 77910 | 78351
blotch | 77910 | 78380
fals | 77908 | 78409
infect | 77907 | 78399
magnific | 77898 | 78350
snooti | 77897 | 78422
moron | 77886 | 78362
moldi | 77865 | 78370
precis | 77860 | 78331
crispi | 77856 | 78324
smelli | 77813 | 78279
tempor | 77810 | 78244
alaskan | 77808 | 78258
elus | 77775 | 78245
miser | 77772 | 78232
flatul | 77761 | 78201
orbit | 77723 | 78157
mean-spirit | 77660 | 78113
flabbi | 77649 | 78110
nuclear | 77609 | 78069
go | 15532 | 15545
made | 1 | 1
america | 1 | 1

If you need anything else, let me know!

Kris

Oleg Bartunov wrote:

>Kris,
>
>could you post 'explain analyze' output ?
>Also, could you disable index usage (set enable_indexscan=off)
>and rerun search using tsearch2 ?
>
>also, could you run 'stat' function to see frequency distribution
>of words. See http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes
>for details.
>
>Oleg
>
>
>>Hi all. I am doing some work with tsearch2 and am not sure what to
>>expect out of it, performance wise. Here is my setup:
>>
>> Table "public.product"
>> Column | Type | Modifiers
>>-------------+----------+-------------------------------------------------
>> description | text |
>> product_id | integer | default nextval('product_product_id_seq'::text)
>> vector | tsvector |
>>Indexes:
>> "vector_idx" gist (vector)
>>Triggers:
>> tsvectorupdate BEFORE INSERT OR UPDATE ON product FOR EACH ROW EXECUTE PROCEDURE tsearch2('vector', 'description')
>>
>>This table has 3,000,000 rows in it. Each description field has roughly 50 characters. There are fewer than ten thousand distinct words in my 3,000,000 rows. The vector was filled using the description fields values. I ran a vacuum full analyze before executing any of my queries.
>>
>>Here are a couple of tests I performed using the tsearch index and like;
>>
>>search_test=# select count(*) from product where vector @@ to_tsquery('oil');
>> count
>>--------
>> 226357
>>(1 row)
>>
>>Time: 191056.230 ms
>>
>>search_test=# select count(*) from product where vector @@ to_tsquery('hydrogen');
>> count
>>--------
>> 226868
>>(1 row)
>>
>>Time: 306411.957 ms
>>
>>search_test=# select count(*) from product where description like '% oil %';
>> count
>>--------
>> 226357
>>(1 row)
>>
>>Time: 38426.851 ms
>>
>>search_test=# select count(*) from product where description like '% hydrogen %';
>> count
>>--------
>> 226868
>>(1 row)
>>
>>Time: 38265.421 ms
>>
>>
>>Both of the likes are using a sequential scan and both of the tsearch queries use the gist index. Did I miss a configuration parameter, are these queries incorrectly using tsearch2,or is this tsearch2's average performance? Thanks in advance for the input!
>>
>>Kris
>>
>>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Gregory S. Williamson 2004-09-25 00:31:52 Re: tsearch2 poor performance
Previous Message Joshua D. Drake 2004-09-24 21:35:01 Re: View Different Schemas