From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: robots.txt on git.postgresql.org
Date: 2013-07-09 15:50:52
Message-ID: 51DC315C.4080806@dunslane.net
Lists: pgsql-hackers
On 07/09/2013 11:24 AM, Greg Stark wrote:
> I note that git.postgresql.org's robots.txt refuses permission to crawl
> the git repository:
>
> http://git.postgresql.org/robots.txt
>
> User-agent: *
> Disallow: /
>
>
> I'm curious what motivates this. It's certainly useful to be able to
> search for commits. I frequently type git commit hashes into Google to
> find the commit in other projects. I think I've even done it in
> Postgres before and not had a problem. Maybe Google brought up github
> or something else.
>
> Fwiw the reason I noticed this is because I searched for "postgresql
> git log" and the first hit was for "see the commit that fixed the
> issue, with all the gory details" which linked to
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6e0cd7b76c04acc8c8f868a3bcd0f9ff13e16c8
>
> This was indexed despite the robots.txt because it was linked to from
> elsewhere (hence the interesting link title). There are ways to ask
> Google not to index pages if that's really what we're after, but I
> don't see why we would be.
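The effect of the quoted rules can be checked mechanically with Python's standard-library robots.txt parser; a minimal sketch using the commitdiff URL from Greg's message:

```python
from urllib.robotparser import RobotFileParser

# The rules currently served at http://git.postgresql.org/robots.txt,
# exactly as quoted above.
rules = [
    "User-agent: *",
    "Disallow: /",
]

rp = RobotFileParser()
rp.parse(rules)

# Under "Disallow: /", every URL on the host is off-limits to compliant
# crawlers, including the commitdiff page Google nevertheless surfaced.
url = ("http://git.postgresql.org/gitweb/"
       "?p=postgresql.git;a=commitdiff;h=a6e0cd7b76c04acc8c8f868a3bcd0f9ff13e16c8")
print(rp.can_fetch("Googlebot", url))  # -> False
```

Note that robots.txt only forbids fetching; as Greg observes, Google can still index a blocked URL (title and link text only) when other pages link to it.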
It's certainly not universal. For example, the only reason I found
buildfarm client commit d533edea5441115d40ffcd02bd97e64c4d5814d9, for
which the repo is housed at GitHub, is that Google has indexed the
buildfarm commits mailing list on pgfoundry. Do we have a robots.txt on
the postgres mailing list archives site?
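For what it's worth, if the intent behind the blanket Disallow is to spare the server from expensive gitweb operations rather than to hide commits, a narrower policy along these lines might serve. This is only a sketch: the paths and wildcards are illustrative assumptions about gitweb's URL scheme, wildcard matching in Disallow is a crawler extension (honored by Googlebot and Bingbot, not part of the original robots.txt convention), and Crawl-delay is likewise nonstandard and ignored by Googlebot:

```text
# Illustrative alternative, not the site's actual configuration.
User-agent: *
# Block the expensive dynamic gitweb actions...
Disallow: /gitweb/*a=search*
Disallow: /gitweb/*a=snapshot*
Disallow: /gitweb/*a=blame*
# ...while leaving commit and commitdiff pages crawlable by default.
Crawl-delay: 10
```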
cheers
andrew
Next message: Dimitri Fontaine, 2013-07-09 15:56:27, Re: robots.txt on git.postgresql.org
Previous message: Fabien COELHO, 2013-07-09 15:42:24, Re: Patch to add regression tests for SCHEMA