Re: robots.txt on git.postgresql.org

From: Greg Stark <stark(at)mit(dot)edu>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: robots.txt on git.postgresql.org
Date: 2013-07-11 13:43:21
Message-ID: CAM-w4HPdUbND-qA8ho1EB-wvj+tXcX=0H_6JtQNbkd_UZsDmHw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 10, 2013 at 9:36 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> We already run this, that's what we did to make it survive at all. The
> problem is there are so many thousands of different URLs you can get
> to on that site, and google indexes them all by default.

There's also https://support.google.com/webmasters/answer/48620?hl=en,
which lets us control how fast the Google crawler crawls. I think it's
adaptive, though, so if the pages are slow it should be crawling slowly.
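As a rough sketch (the paths here are assumptions about gitweb's URL
layout, not what git.postgresql.org actually serves), a robots.txt along
these lines could keep crawlers off the most expensive dynamic views.
Note that Googlebot ignores Crawl-delay, so the Webmaster Tools setting
above is still what throttles Google; Bing and Yandex do honour it:

    User-agent: *
    # Block the expensive per-request gitweb views (illustrative patterns;
    # gitweb encodes the action in the "a=" query parameter, and the "*"
    # wildcard is understood by the major crawlers)
    Disallow: /*a=blobdiff
    Disallow: /*a=blame
    Disallow: /*a=snapshot
    Disallow: /*a=search
    # Honoured by Bing/Yandex, ignored by Googlebot
    Crawl-delay: 10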

--
greg
