Re: mailing list archiver chewing patches

From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: Matteo Beccati <php(at)beccati(dot)com>, Abhijit Menon-Sen <ams(at)toroid(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Tim Bunce <Tim(dot)Bunce(at)pobox(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: mailing list archiver chewing patches
Date: 2010-01-12 09:05:56
Message-ID: 87bpgzbsln.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-www

Dave Page <dpage(at)pgadmin(dot)org> writes:
> 2010/1/12 Matteo Beccati <php(at)beccati(dot)com>:
>> So, I've decided to spend a bit more time on this and here is a proof of concept web app that displays mailing list archives reading from the AOX database:
>>
>> http://archives.beccati.org/
>
> Seems to work.

Hehe, nice a beginning!

> So just to put this into perspective and give anyone paying attention
> an idea of the pain that lies ahead should they decide to work on
> this:
>
> - We need to import the old archives (of which there are hundreds of
> thousands of messages, the first few years of which have, umm, minimal
> headers.

Anyone having a local copy of this in his mailboxes? At some point there
were some NNTP gateway, so maybe there's a copy this way.

> - We need to generate thread indexes

We have CTEs :)

> - We need to re-generate the original URLs for backwards compatibility

I guess the message-id one ain't the tricky one... and it should be
possible to fill a relation table like
monharc_compat(message_id, list, year, month, message_number);

Then we'd need some help from the webserver (rewrite rules I guess) so
that the current URL is transformed to call a catch-all script:
http://archives.postgresql.org/pgsql-xxx/YYYY-MM/msg01234.php
-> http://archives.postgresql.org/compat.php?l=xxx&y=YYYY&m=MM&n=01234

In that compat.php script you then issue the following query or the like
to get the message_id, then use the newer infrastructure to get to
display it:

SELECT message_id
FROM monharc_compat
WHERE list = ? and year = ? and month = ? and message_number = ?;

Regards,
--
dim

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Takahiro Itagaki 2010-01-12 09:06:22 Pretty printed trigger in psql
Previous Message Simon Riggs 2010-01-12 08:40:03 Re: Hot Standy introduced problem with query cancel behavior

Browse pgsql-www by date

  From Date Subject
Next Message Magnus Hagander 2010-01-12 09:30:41 Re: mailing list archiver chewing patches
Previous Message Dave Page 2010-01-12 04:54:03 Re: mailing list archiver chewing patches