From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Tino Wildenhain <tino(at)wildenhain(dot)de> |
Cc: | "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, PostgreSQL WWW <pgsql-www(at)postgresql(dot)org> |
Subject: | Re: A counter productive conversation about search. |
Date: | 2006-08-29 05:37:04 |
Message-ID: | Pine.GSO.4.63.0608290935180.16344@ra.sai.msu.su |
Lists: | pgsql-www |
On Tue, 29 Aug 2006, Tino Wildenhain wrote:
> Joshua D. Drake wrote:
> ...
>> Rolling our own really wouldn't be that hard "if" we can create a
>> reasonably smart web page grabber. We have all the tools (tsearch2 and
>> pg_trgm) to easily do the searches.
>>
>> So is anyone up for helping develop a page grabber?
>
> That's not the hardest part, but why do we need to grab if the contents
> of the pages could be in the database? Admittedly, I don't know
> any good CMS w/ a PostgreSQL backend. But anyway, grabbing the sources
> of the pages while they are published (like the docbook stuff
> for the documentation) makes a lot more sense imho. Ditto for the
> archives. It's much easier to get an idea of the structure and nature
> of the data when you don't have to deal with the final result (e.g. HTML).
>
> So a couple of scripts that fire when mail comes in, when documentation
> is compiled, and when some other publishing takes place could
> really help to keep the index in sync w/o having to crawl all sites
> over and over again.

This is exactly what we have on pgsql.ru/db/mw. We use procmail to fire
our backend for each incoming message. That part is not a problem; the
most complex part is the backend itself.
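
To make the "fire a script when mail comes in" idea concrete, here is a minimal sketch of the receiving end. It is not the pgsql.ru backend: the function name `parse_message`, the chosen fields, and the tab-separated output are all assumptions made for illustration. It only shows how a script that procmail pipes a message into could pull out the parts a full-text index would likely want, using the Python standard library.

```python
import sys
from email import message_from_bytes, policy


def parse_message(raw: bytes) -> dict:
    """Extract the fields a search index would likely need from one raw mail.

    Hypothetical helper: the field names are illustrative, not a real schema.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    # Prefer the plain-text part; use an empty body if there is none.
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""
    return {
        "message_id": str(msg["Message-ID"] or "").strip(),
        "subject": str(msg["Subject"] or ""),
        "sender": str(msg["From"] or ""),
        "body": body.strip(),
    }


def main() -> None:
    # The delivery agent pipes the complete message to standard input.
    record = parse_message(sys.stdin.buffer.read())
    # Placeholder hand-off: a real backend would store the record and
    # update its index here instead of printing a summary line.
    print(record["message_id"], record["subject"], sep="\t")


# Only consume stdin when a message is actually piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

With something like this in place, a procmail recipe that pipes a copy of each list message to the script would keep the index current without any crawling. As noted above, the parsing is the easy part; storing, ranking and searching the records is where the real work is.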
>
> Regards
> Tino Wildenhain
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83