Re: Postgresql.org search engine.

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Dave Page <dpage(at)vale-housing(dot)co(dot)uk>
Cc: pgsql-www(at)postgresql(dot)org
Subject: Re: Postgresql.org search engine.
Date: 2004-01-30 17:13:30
Message-ID: Pine.GSO.4.58.0401302007180.19778@ra.sai.msu.su
Lists: pgsql-www

On Fri, 30 Jan 2004, Dave Page wrote:

> Hi Oleg,
>
> > -----Original Message-----
> > From: Oleg Bartunov [mailto:oleg(at)sai(dot)msu(dot)su]
> > Sent: 30 January 2004 16:03
> > To: Dave Page
> > Cc: pgsql-www(at)postgresql(dot)org
> > Subject: Re: [pgsql-www] Postgresql.org search engine.
> >
> >
> > I'd recommend using ispell dictionaries, so that 'databases' and
> > 'database' will produce the same results.
>
> Thanks, installed.
>
> BTW, searching for 'database' really makes it think! Other queries that
> generate fewer hits (e.g. MVCC or psqlodbc) seem to be far quicker.

It would think much longer if you searched for 'pgsql database' :(
I just tried it and it took ~100 sec.

This is a feature of search engines based on inverted indices.
tsearch2 works the other way around: the more words in the query, the
faster the search.

I suggest adding 'postgresql', 'pgsql' and 'postgres' to the stop-word
list :( BTW, you could look at the word statistics and treat the top N
words as stop words.
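
To illustrate the top-N idea, here is a rough sketch, assuming the indexed
pages are available as plain strings. The function name top_n_stop_words and
the sample documents are hypothetical; this is not how aspseek or tsearch2
collects its statistics, only the counting step done by hand.

```python
import re
from collections import Counter


def top_n_stop_words(documents, n):
    """Return the n most frequent lowercased words across all documents."""
    counts = Counter()
    for text in documents:
        # Split on anything that is not a letter or digit (a crude tokenizer).
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    return [word for word, _ in counts.most_common(n)]


# Hypothetical sample corpus, for illustration only.
docs = [
    "PostgreSQL is a database",
    "pgsql database tuning",
    "postgres database and pgsql tools",
]

print(top_n_stop_words(docs, 2))  # ['database', 'pgsql']
```

The resulting list would still need a look by hand before it goes into the
stop-word file, since a frequent word is not always a useless one.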

>
> I have also added some weighting to the indexed sites to try to give
> preference to those that are more 'authoritative' and of global interest
> than others. Any comments or suggestions for changes welcome as always!

Hmm, I thought aspseek has some sort of page rank, so let it do its work.

>
> # Primary sites
> SiteWeight http://www.postgresql.org/ 100
> SiteWeight http://advocacy.postgresql.org/ 100
> SiteWeight http://jdbc.postgresql.org/ 100
> SiteWeight http://developer.postgresql.org/ 100
>
> # Authoritative project sites
> SiteWeight http://gborg.postgresql.org/ 75
> SiteWeight http://pgadmin.postgresql.org/ 75
> SiteWeight http://phppgadmin.sourceforge.net/ 75
>
> # User contributed stuff
> SiteWeight http://techdocs.postgresql.org/ 50
> SiteWeight http://archives.postgresql.org/ 50
>
> # Outside but reliable
> SiteWeight http://www.varlena.com/ 25
>
> # And the rest...
> SiteWeight http://www.postgresql.cl/ 0
> SiteWeight http://postgresql.ok.cz/ 0
> SiteWeight http://www.postgresql.jp/ 0
> SiteWeight http://pgsql-fr.tuxfamily.org/ 0
> SiteWeight http://www.linuxshare.ru/ 0
> SiteWeight http://www.postgres.de/ 0
> SiteWeight http://www.pgsqldb.org/ 0
> SiteWeight http://www.postgresql.org.br/ 0
>
> Regards, Dave.
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
