Re: Archives too slow

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-www(at)postgresql(dot)org
Subject: Re: Archives too slow
Date: 2004-08-30 08:43:34
Message-ID: Pine.GSO.4.58.0408300947400.5201@ra.sai.msu.su
Lists: pgsql-www

On Sun, 29 Aug 2004, Marc G. Fournier wrote:

>
> The reason why they are dynamic was so that we could 'strip' the garbage
> out for the search engines, so that they didn't search all of the extra
> verbiage, only the message itself ... it also meant that if we changed the
> format (ie. added a new list to the left menu), we didn't have to
> regenerate all the pages ...

You could always use SSI (server-side includes) for that.
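
To illustrate the idea only: with SSI the archived pages stay static on disk
and the shared parts (the left menu, for example) live in one fragment that
the server splices in when it serves the page. Below is a toy sketch in
Python of that substitution step. It is not how Apache's mod_include is
implemented, and the fragment path, menu contents, and page text are made up
for the example.

```python
import re

# Matches SSI directives such as <!--#include virtual="/menu.html" -->
INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def expand_includes(page, fragments):
    """Replace each include directive with the text of the named fragment.

    Unknown fragments expand to an empty string in this sketch.
    """
    return INCLUDE.sub(lambda m: fragments.get(m.group(1), ""), page)


# Hypothetical shared fragment and archived page, for illustration only.
fragments = {"/menu.html": "<ul><li>pgsql-www</li></ul>"}
page = '<body><!--#include virtual="/menu.html" --><pre>message text</pre></body>'

print(expand_includes(page, fragments))
```

Adding a new list to the menu then means editing the one fragment; the
message pages themselves are not regenerated.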

>
> I'm open to moving them back to static myself, if nobody objects *shrug*
>
> In fact, if we get rid of the php code altogether, we could go with what
> one other suggested, and use a 'light weight' web server instead of apache
> ... you mention thttpd below, someone else mentioned one called Boa(?) ...
> never having used either, I'm flexible either way ...
>

thttpd is nice for simple binary data, but we are currently evaluating lighttpd
(http://jan.kneschke.de/projects/lighttpd/), which looks really powerful.
It's faster than thttpd and has many more features.

> Does anyone care if I get rid of the .php code? Before I do that,
> assuming no ... does anyone know a way of 'hiding' sections of HTML code
> from search engines? Right now, we're doing that with the PHP, and I know
> there is/was a <!-- --> way of doing it, but someone (Oleg?) mentioned
> that it isn't very consistent in being honored by search engines ... ?
>

<!-- --> doesn't help, because comment tags also hide the content from the browser :)
In principle, smart search engines should recognize fixed elements
like the navigation bar and reduce their weight. We do that, at least.
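
As a rough illustration of what I mean, and not a description of any
particular engine, one simple heuristic is to count how many pages each text
block appears on and weight it down accordingly. The function name, the
per-page block lists, and the linear weighting below are all assumptions made
for the sketch.

```python
from collections import Counter


def block_weights(pages):
    """Weight each text block by 1 minus the fraction of pages containing it.

    `pages` is a list of pages, each given as a list of text blocks.
    A block present on every page (a navigation bar, say) gets weight 0.0.
    """
    counts = Counter()
    for blocks in pages:
        counts.update(set(blocks))  # count each block once per page
    total = len(pages)
    return {block: 1.0 - count / total for block, count in counts.items()}


# Hypothetical pages: the same menu block plus one distinct message each.
pages = [
    ["MENU: pgsql-www | pgsql-general", "message one"],
    ["MENU: pgsql-www | pgsql-general", "message two"],
    ["MENU: pgsql-www | pgsql-general", "message three"],
]

weights = block_weights(pages)
print(weights["MENU: pgsql-www | pgsql-general"])
```

A real indexer would use something more robust than exact block matching,
but the principle is the same: content repeated across the whole site says
little about any single page.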

>
>
>
> On Sun, 29 Aug 2004, Oleg Bartunov wrote:
>
> > I never understood why we don't have static pages for the mailing list
> > archive if they are already generated.
> >
> > Another thing to consider is using a
> > lightweight frontend with the ability to cache pages generated by a heavy backend.
> > www.pgsql.ru uses a three-server setup: a frontend (apache+mod_accel), a
> > backend (apache + modperl), and thttpd (very light and fast) for serving
> > binary data (images, for example). Only the frontend interacts with the user, so slow
> > clients (bad connectivity) don't tie up the heavy backend and, consequently,
> > the db server.
> >
> >
> > Oleg
> > On Sun, 29 Aug 2004, Greg Sabino Mullane wrote:
> >
> >>
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA1
> >>
> >>
> >>> I'm going to look at mirroring it onto the same server that runs
> >>> ftp.postgresql.org ... archives is the worst site that we run, since its
> >>> all a bunch of little flat files, so when it gets indexed by the various
> >>> search engines, disk I/O goes through the roof ... we had googlebot index
> >>> it once where we had to literally shut down the server for a few minutes
> >>> while we waited for load to drop ...
> >>
> >> On googlebot's page[1], they claim they never go more than once every few
> >> seconds. Surely this should not be a problem as long as these are static
> >> pages. They also have an email address on that page where you can request
> >> that Google go a little gentler on your site.
> >>
> >> Also, if the pages are static (or static plus simple cgis), have you considered
> >> using boa? [2] I use it for a large site that has a lot of static pages
> >> and it does great - it's a small, clean, minimal web server written in C.
> >>
> >> A final option is an accelerator cache [3]. Not sure if PG is using one
> >> yet, but it probably should be.
> >>
> >> [1] http://www.google.com/bot.html
> >>
> >> [2] http://www.boa.org/
> >>
> >> [3] http://www.squid-cache.org/Doc/FAQ/FAQ-20.html#what-is-httpd-accelerator
> >>
> >> - --
> >> Greg Sabino Mullane greg(at)turnstep(dot)com
> >> PGP Key: 0x14964AC8 200408290732
> >> -----BEGIN PGP SIGNATURE-----
> >>
> >> iD8DBQFBMcEGvJuQZxSWSsgRAhK9AKCLQ4CIW2JQQDg+BFI12DyhaFiFVgCg1DW5
> >> VkM2ayTI9OK6M1kIscAwgxs=
> >> =1uGd
> >> -----END PGP SIGNATURE-----
> >>
> >>
> >>
> >> ---------------------------(end of broadcast)---------------------------
> >> TIP 6: Have you searched our list archives?
> >>
> >> http://archives.postgresql.org
> >>
> >
> > Regards,
> > Oleg
> > _____________________________________________________________
> > Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> > Sternberg Astronomical Institute, Moscow University (Russia)
> > Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> > phone: +007(095)939-16-83, +007(095)939-23-83
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 9: the planner will ignore your desire to choose an index scan if your
> > joining column's datatypes do not match
> >
>
> ----
> Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
