A counter productive conversation about search.

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject: A counter productive conversation about search.
Date: 2006-08-29 03:12:28
Message-ID: 44F3B09C.3010104@commandprompt.com
Lists: pgsql-www

Hello,

Now that I have effectively slapped myself silly by being rude to Tom
about search, let me bring up some points about search and see if there
is a way to resolve them.

The problem:

Search really isn't that good. Tom has good results with it, but I am
guessing that is because he is looking for specific things, likely just
in the archives, as I doubt he often searches the documentation ;).

A quick search on Google:

site:archives.postgresql.org index bloat

archives.postgresql.org/pgsql-performance/2005-04/msg00617.php
archives.postgresql.org/pgsql-performance/2005-04/msg00594.php
archives.postgresql.org/pgsql-performance/2005-04/msg00608.php

archives.postgresql.org:

http://archives.postgresql.org/pgsql-performance/2005-04/msg00575.php
http://archives.postgresql.org/pgsql-general/2004-12/msg00288.php
http://archives.postgresql.org/pgsql-general/2005-07/msg00186.php

site:www.postgresql.org create index
www.postgresql.org/docs/7.4/static/sql-createindex.html
www.postgresql.org/docs/8.1/static/sql-createindex.html
www.postgresql.org/files/documentation/books/aw_pgsql/node216.html

search.postgresql.org:
http://www.postgresql.org/files/documentation/books/aw_pgsql/node216.html
http://www.postgresql.org/files/documentation/books/pghandbuch/html/sql-createindex.html
http://developer.postgresql.org/~petere/past-events/lsm2003-slides/foil20.html

The first search is "reasonable" in both cases, although it does not
appear to correctly follow the thread path.

The second search, to me, is completely wrong. CREATE INDEX should always
return the current documentation first. I can forgive Google for showing
7.4 first because it has been around longer and is still widely in use.
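
One way to read "current documentation first" is as a re-ordering step
applied to whatever the search engine returns. The sketch below is only an
illustration: the function names are hypothetical, the version is pulled
from the /docs/<version>/ part of the URL as seen in the results above, and
treating 8.1 as the current release is an assumption.

```python
import re

# Matches the version component of URLs such as
# www.postgresql.org/docs/8.1/static/sql-createindex.html
DOCS_VERSION = re.compile(r"/docs/(\d+(?:\.\d+)*)/")


def docs_version(url):
    """Return the docs version in a URL as a tuple of ints, or None."""
    match = DOCS_VERSION.search(url)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def current_docs_first(urls, current=(8, 1)):
    """Re-order results: current docs, then other docs versions, then the rest.

    `current` is an assumption (8.1 here). sorted() is stable, so the
    engine's own ordering is kept inside each of the three groups.
    """
    def group(url):
        version = docs_version(url)
        if version == current:
            return 0
        if version is not None:
            return 1
        return 2

    return sorted(urls, key=group)


google_results = [
    "www.postgresql.org/docs/7.4/static/sql-createindex.html",
    "www.postgresql.org/docs/8.1/static/sql-createindex.html",
    "www.postgresql.org/files/documentation/books/aw_pgsql/node216.html",
]
reordered = current_docs_first(google_results)
```

With the Google results listed above as input, the 8.1 page moves ahead of
the 7.4 page and the book chapter stays last.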

I have on multiple occasions brought up the idea of another search
engine. I wrote to the pgsql.ru guys and asked if they would share their
code. To their credit they said they would be willing but didn't have
the time to install it for us. I told them I would be happy to muscle
through it if they would just answer some emails. I never heard back.

Other options include Lucene and rolling our own.

Rolling our own really wouldn't be that hard "if" we can create a
reasonably smart web page grabber. We have all the tools (tsearch2 and
pg_trgm) to easily do the searches.

So is anyone up for helping develop a page grabber?
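
As a starting point for discussion, here is a minimal sketch of the parsing
half of such a page grabber, using only the Python standard library. It is
not an existing tool: the class and function names are hypothetical, it
does no fetching, politeness, or thread following, and it assumes the
title, visible text, and outgoing links are what would be loaded into a
table for tsearch2 to index.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class PageGrabber(HTMLParser):
    """Collects the title, visible text, and outgoing links of one page."""

    SKIP = {"script", "style"}  # content that should never be indexed

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.title_parts = []
        self.text_parts = []
        self.links = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links and drop #fragments so the same
                # page is not queued twice.
                absolute, _fragment = urldefrag(urljoin(self.base_url, href))
                if absolute not in self.links:
                    self.links.append(absolute)

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._skip_depth:
            return
        words = data.split()  # also collapses runs of whitespace
        if self._in_title:
            self.title_parts.extend(words)
        else:
            self.text_parts.extend(words)


def grab(html, base_url):
    """Parse one page of HTML into a dict ready to be stored and indexed."""
    parser = PageGrabber(base_url)
    parser.feed(html)
    parser.close()
    return {
        "url": base_url,
        "title": " ".join(parser.title_parts),
        "body": " ".join(parser.text_parts),
        "links": parser.links,
    }
```

A crawler built on this would fetch a page, call grab() on it, store the
title and body, and push the returned links onto its queue. Keeping the
title separate from the body leaves room to weight title matches higher
when ranking.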

Sincerely,

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
