Re: A counter productive conversation about search.

From:	Tino Wildenhain <tino(at)wildenhain(dot)de>
To:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc:	PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject:	Re: A counter productive conversation about search.
Date:	2006-08-29 05:04:38
Message-ID:	44F3CAE6.4040204@wildenhain.de
Lists:	pgsql-www

Joshua D. Drake wrote:
...
> Rolling our own really wouldn't be that hard "if" we can create a
> reasonably smart web page grabber. We have all the tools (tsearch2 and
> pg_trgm) to easily do the searches.
>
> So is anyone up for helping develop a page grabber?

That's not the hardest part, but why do we need to grab anything if the
contents of the pages could be in the database? Admittedly, I don't know
of any good CMS w/ a PostgreSQL backend. Either way, grabbing the sources
of the pages as they are published (like the DocBook sources
for the documentation) makes a lot more sense imho. Ditto for the
archives. It's much easier to get an idea of the structure and nature
of the data when you don't have to deal with the final result (e.g. HTML).
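
To illustrate the point about working from sources, here is a minimal
sketch of pulling indexable text out of a DocBook section. It assumes the
source is well-formed XML; the sample fragment, the element names chosen,
and the extract_section helper are all hypothetical, not taken from the
actual documentation build.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook fragment; real documentation sources are larger.
DOCBOOK = """<sect1 id="textsearch-intro">
  <title>Introduction</title>
  <para>Full text searching finds documents that match a query.</para>
  <para>Results can be sorted by <emphasis>relevance</emphasis>.</para>
</sect1>"""


def extract_section(xml_text):
    """Return (id, title, body text) for one DocBook section."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title", default="")
    # itertext() flattens inline markup; split/join normalizes whitespace.
    paras = [" ".join("".join(p.itertext()).split()) for p in root.iter("para")]
    return root.get("id"), title, " ".join(paras)


section = extract_section(DOCBOOK)
```

The section id and title come straight from the markup, which is the kind
of structure that is hard to recover reliably from rendered HTML.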

So a couple of scripts that fire when mail comes in, when documentation
is compiled, and when some other publishing takes place could
really help to keep the index in sync w/o having to crawl all sites
over and over again.
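
A rough sketch of what such scripts could look like, assuming a simple
upsert-by-id index. The SearchIndex class is an in-memory stand-in for
whatever would really hold the index (for example a tsearch2-backed
table), and the hook names on_mail_received and on_docs_built, as well as
the sample ids, are hypothetical.

```python
import re
from collections import defaultdict


class SearchIndex:
    """Tiny in-memory inverted index, updated one document at a time."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.terms_by_doc = {}            # document id -> terms it was indexed under

    def upsert(self, doc_id, text):
        # Drop the old entry first so a re-published document replaces itself.
        self.remove(doc_id)
        terms = set(re.findall(r"[a-z0-9]+", text.lower()))
        self.terms_by_doc[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def remove(self, doc_id):
        for term in self.terms_by_doc.pop(doc_id, set()):
            self.postings[term].discard(doc_id)

    def search(self, query):
        # Return ids of documents containing every query term.
        terms = re.findall(r"[a-z0-9]+", query.lower())
        if not terms:
            return set()
        return set.intersection(*(self.postings.get(t, set()) for t in terms))


index = SearchIndex()


def on_mail_received(message_id, subject, body):
    """Hypothetical hook: called once per incoming list mail."""
    index.upsert("mail:" + message_id, subject + " " + body)


def on_docs_built(sections):
    """Hypothetical hook: called after a documentation build."""
    for section_id, text in sections.items():
        index.upsert("doc:" + section_id, text)


on_mail_received("example-1", "search", "index the archives as mail arrives")
on_docs_built({"tsearch2": "full text search module"})
on_docs_built({"tsearch2": "full text index module"})  # rebuild replaces the entry
```

Each publishing event touches only the documents it changed, so nothing
has to be re-crawled to keep the index current.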

Regards
Tino Wildenhain

In response to

A counter productive conversation about search. at 2006-08-29 03:12:28 from Joshua D. Drake

Responses

Re: A counter productive conversation about search. at 2006-08-29 05:37:04 from Oleg Bartunov
