New archives for testing

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject: New archives for testing
Date: 2012-12-28 14:32:21
Message-ID: CABUevEwmR5+9fc1i1cFVpc1S8XShHwvKopcaKKfq-Txa+_mq_g@mail.gmail.com
Lists: pgsql-www

After way too much time, mainly due to long delays between doing
different parts, I think we're finally ready to do some public testing
of the new mailing list archive site/code. The biggest technical change
is that with this code, the data is now stored in (I know, this is
shocking for a project like ours!) a database, which means we can do a
lot more with the presentation than we could before. The main
differences from before are:

* No more breaking threads at month boundaries. A thread can go on for
any amount of time, containing any number of messages. Our current
record is 344 messages in a thread. Threading is not perfect, of
course, so there are bound to be some threads - particularly old ones
- that are broken into two. This becomes more visible now that we
don't arbitrarily break *all* threads, but it's fundamentally a
problem with the underlying data not containing all the information we
need.

* A thread is no longer bound to a list. A list is just a "tag" on a
thread, meaning a single thread can be in multiple lists. As soon as a
message is CCed to multiple lists, the whole thread will show up on
them, meaning it's now possible to follow a discussion when it's
moved.

* It's now possible to navigate the whole thread from inside a message
view, using a dropdown listing the whole thing. It's still possible to
navigate to the next and previous message in the thread at the bottom
of the message, just like before.

* There's also a "flat view" that shows an entire message thread on
one page, for those who prefer that kind of view. (It works pretty
well in most cases, but becomes a huge page for example on the 344
message thread)

* There is no longer a per-month, per-list sequence number for each
message, which caused havoc if accidentally reset (as has happened a
couple of times in the mhonarc history, when the mbox files have been
edited - intentionally or not). Instead, the message-id URL is the
primary URL for all pages, and it's permanent.

* Mainly for SEO reasons, the new archives live under the main website
URL space; the starting point is http://www.postgresql.org/list/

* Raw messages and mbox files are now protected with HTTP basic
authentication and a fixed password. In the old archives, the software
worked hard to obfuscate email addresses in the headers and such of
the visible email, but they were all fully visible if you clicked
"raw" or downloaded the mbox, making the antispam measures kind of
pointless. The username and password for the protection are listed in
the password prompt, so it shouldn't be a problem for any manual
navigation, but hopefully enough to keep bots out.

* Contents are no longer updated by a cronjob running every 15
minutes. Instead, email is injected directly into the archives as soon
as it leaves majordomo on the list server, and becomes available
within second(s). It still needs to pass antispam, enter majordomo and
possibly be moderated, so it doesn't mean a second after someone hits
"send" in their MUA, but it should be significantly faster than
before.
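
To make the thread and list model described above concrete, here is a
minimal sketch in Python. It is not the actual archive code: the sample
messages, the build_threads and threads_for_list helpers, and the use of
a single In-Reply-To parent per message are all assumptions made for
illustration. It shows a thread that is not tied to any month, a list
acting as a "tag" on the whole thread, and a thread breaking in two when
the parent message is missing from the data.

```python
from collections import defaultdict

# Hypothetical sample data: (message_id, in_reply_to, lists the message was sent to)
MESSAGES = [
    ("a@example", None, {"pgsql-hackers"}),
    ("b@example", "a@example", {"pgsql-hackers"}),
    # CCed to a second list: the whole thread should now show up there too.
    ("c@example", "b@example", {"pgsql-hackers", "pgsql-www"}),
    # Parent is not in the data, so this message starts its own (broken) thread.
    ("e@example", "lost@example", {"pgsql-general"}),
]


def build_threads(messages):
    """Group messages into threads by following In-Reply-To links to a root."""
    parent = {mid: reply_to for mid, reply_to, _ in messages}

    def root_of(mid):
        seen = set()  # guard against reply loops in malformed data
        # Walk upwards while the parent is a message we actually have.
        while parent.get(mid) in parent and mid not in seen:
            seen.add(mid)
            mid = parent[mid]
        return mid

    threads = defaultdict(lambda: {"messages": [], "lists": set()})
    for mid, _, lists in messages:
        thread = threads[root_of(mid)]
        thread["messages"].append(mid)
        thread["lists"] |= lists  # a list is only a tag on the thread
    return dict(threads)


def threads_for_list(threads, list_name):
    """Return the root message-ids of all threads tagged with list_name."""
    return sorted(root for root, t in threads.items() if list_name in t["lists"])


threads = build_threads(MESSAGES)
print(threads_for_list(threads, "pgsql-www"))  # ['a@example']
```

Because the tags live on the thread rather than on individual messages,
the two earlier messages that were only sent to pgsql-hackers are still
reachable from pgsql-www once the third message is CCed there.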

Now, having said that, I'd like to see some more people testing it
than the few people I've forced to do it so far. The way to test it is
to go to http://www.postgresql.org/list/ and pick your list. You can
also go to http://www.postgresql.org/message-id/<message-id> to view a
message in a thread - that should also work fine if you just replace
"archives" with "www" in the URL of an existing message in the
archives *if* you were using the message-id based URL.
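
As a rough illustration of the two ways of reaching a message described
above, here is a small Python sketch. The helper names message_id_url
and old_to_new are hypothetical, and two details are assumptions rather
than documented behaviour: that the message-id should be percent-encoded
in the URL, and that message-id based URLs on the old site use the same
/message-id/ path on archives.postgresql.org.

```python
from urllib.parse import quote, urlsplit

NEW_BASE = "http://www.postgresql.org/message-id/"


def message_id_url(message_id):
    """Build a new-archives URL from a Message-ID header value."""
    # Header values are usually wrapped in angle brackets; drop them.
    mid = message_id.strip().strip("<>")
    # Assumption: percent-encode everything except '@' (so '+' becomes %2B).
    return NEW_BASE + quote(mid, safe="@")


def old_to_new(url):
    """Swap "archives" for "www" in a message-id based URL, else return None."""
    parts = urlsplit(url)
    if parts.netloc != "archives.postgresql.org":
        return None
    if not parts.path.startswith("/message-id/"):
        return None  # only message-id based URLs can be rewritten this way
    return parts._replace(netloc="www.postgresql.org").geturl()


print(message_id_url("<CABUevEwmR5+9fc1i1cFVpc1S8XShHwvKopcaKKfq-Txa+_mq_g@mail.gmail.com>"))
```

URLs on the old site that are not message-id based cannot be converted
by a simple host swap, which is why old_to_new returns None for them.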

A few things that are known not to work at this point:
* Searching by message-id will not work. This is because the new
search code hasn't been deployed, since it would replace the old one
and can't be launched side by side.

* This also means that searching in general will return search hits on
the old archives site, not the new one, even when searching from the
new one.

* There is no redirect from the old archives. There is code available
so that once we consider it production, all old
archives.postgresql.org URLs will redirect to the new site - directly
to the proper message. But this is not deployed yet, as this is a
partial deploy.

So. Please go test it. And give some feedback.

While there is a lot that can be done to improve the experience, and
we can discuss that endlessly, I would like to ask people for now to
focus on things that *don't work*, or things that are *worse than
before*. Once we've tried to deal with those as well as we can, we can
switch over. Then we can improve things further. But let's not get
bogged down trying to add every new feature now - let's just get it
far enough to make it better than before as a first step.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
