Re: no mailing list hits in google

From: Andres Freund <andres(at)anarazel(dot)de>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL WWW <pgsql-www(at)lists(dot)postgresql(dot)org>
Subject: Re: no mailing list hits in google
Date: 2019-08-29 14:50:13
Message-ID: 20190829145013.j77ni3ubn5ys2mdb@alap3.anarazel.de
Lists: pgsql-hackers pgsql-www

Hi,

On 2019-08-29 13:12:00 +0200, Magnus Hagander wrote:
> On Wed, Aug 28, 2019 at 7:45 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > Hi,
> >
> > On 2019-08-28 19:09:40 +0200, Magnus Hagander wrote:
> > > It blocks /list/ which has the subjects only.
> >
> > Yea. But there's no way to actually get to all the individual messages
> > without /list/? Sure, some will be linked to from somewhere else, but
> > without the content below /list/, most won't be reached?
> >
>
> That is indeed a good point. But it has been that way for many years, so
> something must've changed. We last modified this in 2013....

Hm. I guess it's possible that most pages were found via the
next/prev links in individual messages, once one of them is linked from
somewhere externally. Any chance there are enough logs around to see
which pages the indexers currently move between?
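
To make the effect of the exclusion concrete, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text and both example paths are assumptions for illustration, not copied from the live site; the only thing taken from this thread is that /list/ is disallowed.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: the thread only states that /list/ is blocked.
ROBOTS_TXT = """\
User-agent: *
Disallow: /list/
"""


def allowed_paths(robots_txt, paths, agent="*"):
    """Return {path: bool} saying whether `agent` may fetch each path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}


if __name__ == "__main__":
    # Hypothetical paths: one listing page, one individual message page.
    sample = ["/list/pgsql-hackers/2019-08/", "/message-id/example"]
    for path, ok in allowed_paths(ROBOTS_TXT, sample, agent="Googlebot").items():
        print(path, "allowed" if ok else "blocked")
```

Under these assumptions the message page itself stays crawlable, but a crawler can only reach it by following a link from some page it is allowed to fetch, which is the concern raised above.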

> I wonder if we can inject these into Google using a sitemap. I think that
> should work -- will need some investigation on exactly how to do it, as
> sitemaps also have individual restrictions on the number of urls per file,
> and we do have quite a few messages.

Hm. You mean in addition to allowing /list/ or solely?
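
A rough sketch of how the per-file restriction could be handled, assuming the sitemaps.org limit of 50,000 URLs per sitemap file and a sitemap index that points at the individual files. The function names, file names, and base URL are hypothetical and do not describe how the archives are actually implemented.

```python
from xml.sax.saxutils import escape

# Per-file URL limit from the sitemaps.org protocol.
MAX_URLS_PER_SITEMAP = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of `urls`, each with at most `size` entries."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def render_sitemap(urls):
    """Render a single <urlset> document listing the given URLs."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             f'<urlset xmlns="{SITEMAP_NS}">']
    lines += [f"  <url><loc>{escape(url)}</loc></url>" for url in urls]
    lines.append("</urlset>")
    return "\n".join(lines)


def render_index(sitemap_urls):
    """Render a <sitemapindex> document pointing at each sitemap file."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             f'<sitemapindex xmlns="{SITEMAP_NS}">']
    lines += [f"  <sitemap><loc>{escape(url)}</loc></sitemap>"
              for url in sitemap_urls]
    lines.append("</sitemapindex>")
    return "\n".join(lines)


def build_sitemaps(message_urls, base_url):
    """Split message URLs across sitemap files and build the index for them.

    Returns (index_xml, {file_name: sitemap_xml}). File names are hypothetical.
    """
    files = {}
    for number, part in enumerate(chunk(message_urls), start=1):
        files[f"sitemap-messages-{number}.xml"] = render_sitemap(part)
    index = render_index([f"{base_url}/{name}" for name in files])
    return index, files
```

Only the index would need to be submitted or referenced from robots.txt; the message URLs themselves would come from whatever the archive database already knows about.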

> > Why is that /list/ exclusion there in the first place?

> Because there are basically infinite number of pages in that space, due to
> the fact that you can pick an arbitrary point in time to view from.

You mean because of the per-day links that aren't really per-day? I
think the number of links due to that would still be manageable
traffic-wise? Or are they that expensive to compute? Perhaps we could
make the "jump to day" links smarter in some way, e.g. by not including
content for the following days in the per-day pages?

Greetings,

Andres Freund

