Re: Postgresql.org search engine.

From: "Dave Page" <dpage(at)vale-housing(dot)co(dot)uk>
To: <oleg(at)sai(dot)msu(dot)su>
Cc: <pgsql-www(at)postgresql(dot)org>
Subject: Re: Postgresql.org search engine.
Date: 2004-01-30 18:07:31
Message-ID: 50076.80.177.99.193.1075486051.squirrel@ssl.vale-housing.co.uk
Lists: pgsql-www

It's rumoured that Oleg Bartunov once said:
> On Fri, 30 Jan 2004, Dave Page wrote:
>
>> BTW, searching for 'database' really makes it think! Other queries
>> that generate less hits (eg. Mvcc or psqlodbc) seem to be far quicker.
>
> It would think much longer if you search 'pgsql database' :(
> Just tried and got ~100 sec.
>
Meep!

>
> I suggest to include 'postgresql', 'pgsql', 'postgres' into stop words
> list :( btw, you may look at word statistics and let top N words
> as stop words.

OK, I'll look at that after dinner - thanks.
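
A rough sketch of the "top N words as stop words" idea, for illustration only. It assumes the indexed text is available as plain strings and ranks words by how many documents contain them; the function name and sample documents are hypothetical, and this is not how ASPSeek itself gathers word statistics.

```python
from collections import Counter
import re


def top_n_stop_words(documents, n):
    """Return the n words that appear in the most documents."""
    doc_freq = Counter()
    for text in documents:
        # Count each word once per document, ignoring case.
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        doc_freq.update(words)
    # Most frequent first; ties broken alphabetically for stable output.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


# Hypothetical sample documents.
docs = [
    "PostgreSQL database MVCC overview",
    "psqlODBC driver for the PostgreSQL database",
    "PostgreSQL release notes",
]
candidates = top_n_stop_words(docs, 2)
```

On a PostgreSQL-specific index, words such as 'postgresql' and 'database' would be expected to top a list like this, which is why they add search cost without narrowing the results.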

>> I have also added some weighting to the indexed sites to try to give
>> preference to those that are more 'authoritative' and of global
>> interest than others. Any comments or suggestions for changes welcome
>> as always!
>
> Hmm, I thought aspseek has sort of page rank, so let him works.

It does, but I'm trying to give a little preference to results on sites
with the widest appeal (i.e. those in English), and the most authoritative
(i.e. those that are published docs rather than list archives or user
docs).

Also, bear in mind that by default results are grouped by site on the main
search page, so generally you will see results from *all* indexed sites on
a single page (sorted with the site weighting factored in), but you can
then drill down into a specific site, where the results are unaffected by
the site weighting.

Regards, Dave.
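
A minimal sketch of the grouping behaviour described above, under stated assumptions: each hit carries a raw relevance score, each site has a multiplier, sites are ordered by their best hit's weighted score, and results within a site are ordered by raw score alone. The site names, weights, and function name are hypothetical, and this is not ASPSeek's actual ranking code.

```python
from collections import defaultdict


def group_by_site(hits, site_weights):
    """Group (site, url, score) hits by site.

    Sites are ordered by best raw score times site weight;
    results inside a site are ordered by raw score only.
    """
    groups = defaultdict(list)
    for site, url, score in hits:
        groups[site].append((url, score))

    # Within a site the weighting plays no part.
    for results in groups.values():
        results.sort(key=lambda r: -r[1])

    def weighted_best(site):
        best_raw = groups[site][0][1]
        # Sites with no configured weight default to 1.0.
        return best_raw * site_weights.get(site, 1.0)

    ordered = sorted(groups, key=weighted_best, reverse=True)
    return [(site, groups[site]) for site in ordered]


# Hypothetical sites and weights.
weights = {"docs.example": 1.5, "archives.example": 1.0}
hits = [
    ("archives.example", "/msg1", 0.9),
    ("docs.example", "/mvcc", 0.8),
    ("docs.example", "/intro", 0.5),
]
grouped = group_by_site(hits, weights)
```

Here the docs site is listed first even though the archive hit has the higher raw score, while the order inside each site is the same as it would be with no weighting at all.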
