Re: A counter productive conversation about search.

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Tino Wildenhain <tino(at)wildenhain(dot)de>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject: Re: A counter productive conversation about search.
Date: 2006-08-29 05:37:04
Message-ID: Pine.GSO.4.63.0608290935180.16344@ra.sai.msu.su
Lists: pgsql-www

On Tue, 29 Aug 2006, Tino Wildenhain wrote:

> Joshua D. Drake wrote:
> ...
>> Rolling our own really wouldn't be that hard "if" we can create a
>> reasonably smart web page grabber. We have all the tools (tsearch2 and
>> pg_trgm) to easily do the searches.
>>
>> So is anyone up for helping develop a page grabber?
>
> That's not the hardest part, but why do we need to grab if the contents
> of the pages could be in the database? But admittedly, I don't know
> any good CMS w/ postgresql backend. But anyway, grabbing the sources
> of the pages while they are published (like the docbook stuff
> for the documentation) makes a lot more sense imho. Ditto for the
> archives. It's much easier to get an idea of the structure and nature
> of the data when you don't have to deal with the final result (e.g. HTML).
>
> So a couple of scripts that fire when mail comes in, documentation
> is compiled and when some other publishing takes place could
> really help to keep the index in sync w/o having to crawl all sites
> over and over again.

This is exactly what we have on pgsql.ru/db/mw. We use procmail to fire
our backend to process each incoming message. That part is not a problem;
the most complex thing is the backend.
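
To illustrate the idea, here is a minimal sketch of the kind of script
procmail could pipe each incoming message to. It is not the pgsql.ru
backend; the function name extract_document and the field names are
hypothetical, and the hand-off to the indexer is left out. It only shows
how one raw message on stdin becomes a record ready for indexing.

```python
import sys
from email import message_from_bytes, policy


def extract_document(raw: bytes) -> dict:
    """Parse one raw mail message into fields an indexer could store.

    The field names are illustrative assumptions, not an existing schema.
    """
    msg = message_from_bytes(raw, policy=policy.default)

    # Prefer the plain-text part; fall back to an empty body if there is none.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""

    return {
        "message_id": str(msg.get("Message-ID", "")).strip(),
        "subject": str(msg.get("Subject", "")).strip(),
        "sender": str(msg.get("From", "")).strip(),
        "date": str(msg.get("Date", "")).strip(),
        "body": body.strip(),
    }


if __name__ == "__main__" and not sys.stdin.isatty():
    # When used as a pipe target, the whole message arrives on stdin.
    doc = extract_document(sys.stdin.buffer.read())
    # A real backend would store the record in the database here;
    # this sketch only prints a one-line summary.
    print(doc["message_id"], doc["subject"], len(doc["body"]))
```

Because the script runs once per delivered message, the index stays in
sync without crawling the archives again.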

>
> Regards
> Tino Wildenhain
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
