Re: Postgresql.org search engine.

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc: Dave Page <dpage(at)vale-housing(dot)co(dot)uk>, josh(at)agliodbs(dot)com, pgsql-www(at)postgresql(dot)org
Subject: Re: Postgresql.org search engine.
Date: 2004-01-31 12:37:26
Message-ID: Pine.GSO.4.58.0401311518220.28603@ra.sai.msu.su
Lists: pgsql-www

On Sat, 31 Jan 2004, Marc G. Fournier wrote:

> On Sat, 31 Jan 2004, Oleg Bartunov wrote:
>
> > Marc and Dave,
> >
> > at the same time, could you look into generating the right HTTP headers
> > (Last-Modified), so that search engines can cache documents and not waste
> > server resources? What I still don't understand is whether
> > http://archives.postgresql.org/pgsql-hackers/2004-01/msg00745t.php
> > is a static page or a dynamic one :-? If dynamic, I don't see any problem
> > generating the headers; if static, you could always use the 'touch' hack
> > to set the file's last modification date correctly.
>
> Huh? The t.php one above was just to show Dave what the search engines
> are seeing (i.e. minus the search/banner/links, just the message) ... it's
> not part of the system, just a copy of an existing message ...
>
> re last-modified time ... what is wrong with it? According to my browser,
> it is being displayed correctly, or are you still hung up on the fact that
> it doesn't equal the posting date of the message itself? If that is all
> it is, I'm planning on trying something this weekend to get that in place,
> but the last time I tried it didn't work ... again, if you have better

Yes, correct HTTP headers are what I'd like to see; many crawlers/spiders
take them into account. That saves bandwidth and keeps the server from being
overloaded. I count two attempts: one failed because of an incorrect date
format, and the second because all pages had the same modification date,
the moment the page was created rather than the date of posting. Apache
can generate the Last-Modified header for static pages from the file
modification time, so it's possible to use the 'touch' command to make the
file modification date equal to the date of posting. Dynamic pages are
another story: the HTTP header should be generated by the software
responsible for displaying the page.
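
To make both cases concrete, here is a minimal sketch in Python, not what the
archives actually run. It assumes the posting date is available as the
message's RFC 2822 Date header; the function names and the sample date value
used with them are hypothetical. The first function is the 'touch' hack for
static files, the second formats the value a dynamic page would send.

```python
import os
from email.utils import formatdate, parsedate_to_datetime


def set_mtime_from_date_header(path: str, date_header: str) -> float:
    """The 'touch' hack: set the file's mtime to the message's posting date.

    date_header is assumed to be an RFC 2822 date string taken from the
    archived message, e.g. "Sat, 31 Jan 2004 12:37:26 +0000".
    """
    posted = parsedate_to_datetime(date_header).timestamp()
    os.utime(path, (posted, posted))  # sets (atime, mtime)
    return posted


def last_modified_value(timestamp: float) -> str:
    """Format a Unix timestamp as an HTTP date for a Last-Modified header."""
    return formatdate(timestamp, usegmt=True)
```

For a static page, calling set_mtime_from_date_header() once per generated
file should be enough, since the web server takes the header from the file
time. A dynamic page would instead send "Last-Modified: " followed by
last_modified_value(posted) itself.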

> software you can recommend than what we are using now to generate the
> archives (mhonarc), please speak up before I go through the trouble of
> regenerating everything all over again ...

Don't know :( We have our own mailware, which you've seen on fts.postgresql.org
and which will soon appear on www.pgsql.ru, but it's not an end-user application.

I've seen mailman (http://www.list.org/), which connects somehow with
MHonArc. A wide list of MLMs is available from
http://www.sympa.org/robots.html

Also, Sympa has support for MHonArc archives - http://www.sympa.org/

>
>
> >
> > Oleg
> >
> > On Fri, 30 Jan 2004, Dave Page wrote:
> >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Marc G. Fournier [mailto:scrappy(at)postgresql(dot)org]
> > > > Sent: 30 January 2004 21:02
> > > > To: Dave Page
> > > > Cc: Marc G. Fournier; Oleg Bartunov; josh(at)agliodbs(dot)com;
> > > > pgsql-www(at)postgresql(dot)org
> > > > Subject: RE: [pgsql-www] Postgresql.org search engine.
> > > >
> > > >
> > > > D'oh ... I was going to say that I didn't think that was
> > > > possible, but, it just might be ... seems I have a section
> > > > declared twice (note that someone else wrote this originally,
> > > > I've only just begun to understand it to modify it), so the
> > > > second section is overriding the first, but I was only ever
> > > > seeing the first ...
> > >
> > > Huh? You've lost me there...
> > >
> > > > Let me play with this over the weekend, I'll do a 'small
> > > > sample set' that you can look at the messages in, and we can
> > > > go from there ...
> > >
> > > Ok. If you can do it in a directory away from the archives themselves
> > > then I can play if need be without breaking anything by accident...
> > >
> > > /D
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 9: the planner will ignore your desire to choose an index scan if your
> > > joining column's datatypes do not match
> > >
> >
> > Regards,
> > Oleg
> > _____________________________________________________________
> > Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> > Sternberg Astronomical Institute, Moscow University (Russia)
> > Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> > phone: +007(095)939-16-83, +007(095)939-23-83
> >
>
> ----
> Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
