Re: Fixing Google Search on the docs (redux)

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject: Re: Fixing Google Search on the docs (redux)
Date: 2020-11-18 16:44:01
Message-ID: 9388577f-5fda-97b4-d38e-02851547d5f5@postgresql.org
Lists: pgsql-www

On 11/18/20 11:20 AM, Dave Page wrote:
> I was looking at our analytic data, and saw that the vast majority of
> inbound traffic to the docs, hits the 9.1 version. We've known this has
> been an issue for years and have tried various remedies, clearly none of
> which are working.
>
> Should we try an experiment for a couple of months, in which we simply
> block anything that matches \/docs\/((\d+)|(\d.\d))\/ in robots.txt?
> It's a much more drastic option, but at least it might force Google into
> indexing the latest doc version with the highest priority.

If we're going down this road, I would suggest borrowing a concept from
the Django Project documentation, which has a similar issue to ours. In
their codebase, they use a <link> tag with rel="canonical" on each page
to point to the latest version of the docs[1].

So for example, given 3.1 is their latest release, you will find
something similar to this:

<link rel="canonical"
href="https://docs.djangoproject.com/en/3.1/ref/templates/builtins/">

From a quick test of searching various Django concepts, it seems that
the 3.1 pages tend to turn up first.

Our equivalent would be to point at the "current" version of each docs
page.
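
As a rough sketch of what that could look like on our side (not what
pgweb actually does), the snippet below maps a versioned docs path to a
canonical tag pointing at "current". The function name
canonical_link_tag, the BASE_URL value, and the assumption that pages
live at /docs/<version>/<page> are all hypothetical and only there for
illustration.

```python
import re
from html import escape

# Assumed site root; the thread does not state it.
BASE_URL = "https://www.postgresql.org"

# Matches a versioned docs prefix such as /docs/9.1/ or /docs/13/.
VERSIONED_DOCS = re.compile(r"^/docs/(\d+\.\d+|\d+)/")


def canonical_link_tag(path):
    """Return a canonical <link> tag for a versioned docs path.

    Returns None when the path is not a versioned docs page, so the
    caller can simply skip emitting a tag.
    """
    if not VERSIONED_DOCS.match(path):
        return None
    # Swap the version segment for "current", keeping the page name.
    current_path = VERSIONED_DOCS.sub("/docs/current/", path, count=1)
    href = escape(BASE_URL + current_path, quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    print(canonical_link_tag("/docs/9.1/sql-select.html"))
```

A real implementation would also need to check that the page still
exists in "current", since pages are occasionally renamed or removed
between versions.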

Jonathan

[1] https://developers.google.com/search/docs/advanced/crawling/consolidate-duplicate-urls
