Re: Wich hardware suits best for large full-text indexed

From: Ericson Smith <eric(at)did-it(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Bill Moran <wmoran(at)potentialtech(dot)com>,Diogo Biazus <diogo(at)ikono(dot)com(dot)br>, pgsql-general(at)postgresql(dot)org
Subject: Re: Wich hardware suits best for large full-text indexed
Date: 2004-03-31 16:35:17
Message-ID: 406AF345.1070206@did-it.com
Lists: pgsql-general
Oleg Bartunov wrote:

>That's a very different story! There are hundreds of *standalone* search
>engines based on inverted indices, but they don't give you *native* access
>to metadata stored in the database, so your search collection isn't consistent.
>tsearch2 was developed specifically for online updates and consistency
>(think about access control to documents). If you don't care about that,
>you don't need tsearch2. By the way, tsearch2 scales much better with long
>queries.
>
>  
>
Actually, swish-e has excellent support for metadata. This lets you
partition your indices cleanly, or search only user-defined parts
based on as much custom metadata as you care to define. Granted,
tsearch2 gives you *live* updates to the index, but we usually
reindex nightly, and that tends to be good enough for most cases.
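
To make the trade-off concrete, here is a rough Python sketch of a batch-rebuilt inverted index with metadata filtering. It is only a toy under my own assumptions: the InvertedIndex class, the rebuild() helper, and the "section" metadata field are made up for illustration and are not the swish-e or tsearch2 API.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: word -> set of document ids, plus per-document metadata."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> ids of documents containing it
        self.meta = {}                    # doc id -> metadata dict

    def add(self, doc_id, text, **meta):
        # Index every whitespace-separated word, lowercased.
        for word in text.lower().split():
            self.postings[word].add(doc_id)
        self.meta[doc_id] = meta

    def search(self, word, **wanted):
        # Return ids of documents containing the word whose metadata
        # matches every requested key/value pair.
        hits = self.postings.get(word.lower(), set())
        return sorted(
            d for d in hits
            if all(self.meta[d].get(k) == v for k, v in wanted.items())
        )


def rebuild(documents):
    """Batch ("nightly") reindex: build a fresh index from the full collection."""
    index = InvertedIndex()
    for doc_id, (text, meta) in documents.items():
        index.add(doc_id, text, **meta)
    return index


# Hypothetical document collection, keyed by id.
documents = {
    1: ("PostgreSQL full text search", {"section": "db"}),
    2: ("Hardware for search servers", {"section": "hw"}),
}

nightly = rebuild(documents)

# A document added after the rebuild is invisible until the next run.
documents[3] = ("Search tuning on FreeBSD", {"section": "db"})
stale_hits = nightly.search("search", section="db")

# After the next batch rebuild the new document shows up.
nightly = rebuild(documents)
fresh_hits = nightly.search("search", section="db")
```

The gap between stale_hits and fresh_hits is the consistency window Oleg is talking about; whether it matters depends on how quickly new or restricted documents need to be reflected in search results.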

- Ericson Smith

>
>
>  
>
>>- Ericson
>>
>>Bill Moran wrote:
>>
>>    
>>
>>>Diogo Biazus wrote:
>>>
>>>      
>>>
>>>>Hi folks,
>>>>
>>>>I have a database using tsearch2 to index 300 000 documents.
>>>>I have already optimized the queries, and the database is vacuumed
>>>>on a daily basis.
>>>>The stat function tells me that my index has approx. 460 000 unique
>>>>words (I'm using a stemmer and a good stopword list).
>>>>The problem is performance: some queries take more than 10 seconds to
>>>>execute, and I'm not sure whether my bottleneck is memory or I/O.
>>>>The server is an Athlon XP 2000 with an ATA133 hard disk and 1.5 GB RAM,
>>>>running PostgreSQL 7.4.3 on FreeBSD 5.0 with lots of shared buffers and
>>>>sort_mem...
>>>>
>>>>Does anyone have an idea for a more cost-efficient solution?
>>>>How can I get better performance without having to invest an
>>>>astronomically high amount of money?
>>>>        
>>>>
>>>This isn't hardware related, but FreeBSD 5 is not a particularly impressive
>>>performer, especially 5.0. 5.2.1 would be better, but if you're shooting
>>>for performance, 4.9 will probably outperform both of them at this stage
>>>of the game.
>>>
>>>Something to consider if the query tuning that others are helping with
>>>doesn't solve the problem. Follow through with that _first_, though.
>>>
>>>However, if you insist on running 5, make sure your kernel is compiled
>>>without WITNESS; it speeds things up noticeably.
>>>
>>>      
>>>
>
>	Regards,
>		Oleg
>_____________________________________________________________
>Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
>Sternberg Astronomical Institute, Moscow University (Russia)
>Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
>phone: +007(095)939-16-83, +007(095)939-23-83
>
>  
>
