Re: Archives too slow

From: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-www(at)postgresql(dot)org
Subject: Re: Archives too slow
Date: 2004-08-29 22:29:02
Message-ID: 20040829192319.A763@ganymede.hub.org
Lists: pgsql-www


The reason they are dynamic was so that we could 'strip' the garbage
out for the search engines, so that they didn't index all of the extra
verbiage, only the message itself ... it also meant that if we changed the
format (i.e. added a new list to the left menu), we didn't have to
regenerate all the pages ...
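
For illustration only (the actual PHP is not shown in this thread), the stripping described above might look roughly like the Python sketch below. The User-Agent check, the crawler names in the pattern, and the function names are all assumptions made for the example, not the archive's real code.

```python
import re

# Hypothetical crawler signatures; a real deployment would need a
# longer, maintained list. These names are assumptions for the sketch.
CRAWLER_PATTERN = re.compile(r"googlebot|slurp|msnbot", re.IGNORECASE)


def is_crawler(user_agent):
    """Return True if the User-Agent string looks like a search engine bot."""
    return bool(CRAWLER_PATTERN.search(user_agent or ""))


def render_page(message_html, menu_html, user_agent):
    """Wrap a pre-generated message body in the page layout.

    Crawlers get only the message itself; everyone else also gets the
    left menu. Changing menu_html never requires regenerating the
    stored message bodies.
    """
    if is_crawler(user_agent):
        return "<html><body>" + message_html + "</body></html>"
    return "<html><body>" + menu_html + message_html + "</body></html>"
```

The cost of this design is that every request runs code, which is the load problem being discussed; fully static pages avoid that but lose the per-request stripping.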

I'm open to moving them back to static myself, if nobody objects *shrug*

In fact, if we get rid of the PHP code altogether, we could go with what
someone else suggested, and use a 'lightweight' web server instead of Apache
... you mention thttpd below, someone else mentioned one called Boa(?) ...
never having used either, I'm flexible either way ...

Does anyone care if I get rid of the .php code? Before I do that,
assuming no ... does anyone know a way of 'hiding' sections of HTML code
from search engines? Right now, we're doing that with the PHP, and I know
there is/was a <!-- --> way of doing it, but someone (Oleg?) mentioned
that it isn't very consistent in being honored by search engines ... ?

On Sun, 29 Aug 2004, Oleg Bartunov wrote:

> I never understood why we don't have static pages for the mailing list
> archive if they are already generated.
>
> Another thing to consider is using
> lightweight frontend with ability to cache pages generated by heavy backend.
> www.pgsql.ru uses 3-servers setup - frontend (apache+mod_accel),
> backend ( apache + modperl ) and thttpd (very light and fast) for serving
> binary data (images, for example). Only frontend interacts with user, so slow
> clients (bad connectivity) don't bother heavy backend and, consequently,
> db server.
>
>
> Oleg
> On Sun, 29 Aug 2004, Greg Sabino Mullane wrote:
>
>>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>>
>>> I'm going to look at mirroring it onto the same server that runs
>>> ftp.postgresql.org ... archives is the worst site that we run, since its
>>> all a bunch of little flat files, so when it gets indexed by the various
>>> search engines, disk I/O goes through the roof ... we had googlebot index
>>> it once where we had to literally shut down the server for a few minutes
>>> while we waited for load to drop ...
>>
>> On googlebot's page[1], they claim they never go more than once every few
>> seconds. Surely this should not be a problem as long as these are static
>> pages. They also have an email address on that page where you can request
>> that Google go a little gentler on your site.
>>
>> Also, if the pages are static (or static plus simple cgis), have you considered
>> using boa? [2] I use it for a large site that has a lot of static pages
>> and it does great - it's a small, clean, minimal web server written in C.
>>
>> A final option is an accelerator cache [3]. Not sure if PG is using one
>> yet, but it probably should be.
>>
>> [1] http://www.google.com/bot.html
>>
>> [2] http://www.boa.org/
>>
>> [3] http://www.squid-cache.org/Doc/FAQ/FAQ-20.html#what-is-httpd-accelerator
>>
>> - --
>> Greg Sabino Mullane greg(at)turnstep(dot)com
>> PGP Key: 0x14964AC8 200408290732
>> -----BEGIN PGP SIGNATURE-----
>>
>> iD8DBQFBMcEGvJuQZxSWSsgRAhK9AKCLQ4CIW2JQQDg+BFI12DyhaFiFVgCg1DW5
>> VkM2ayTI9OK6M1kIscAwgxs=
>> =1uGd
>> -----END PGP SIGNATURE-----
>>
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 6: Have you searched our list archives?
>>
>> http://archives.postgresql.org
>>
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> Sternberg Astronomical Institute, Moscow University (Russia)
> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> phone: +007(095)939-16-83, +007(095)939-23-83
>

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664
