Re: mailing list archiver chewing patches

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Matteo Beccati <php(at)beccati(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, David Fetter <david(at)fetter(dot)org>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Dave Page <dpage(at)pgadmin(dot)org>, Abhijit Menon-Sen <ams(at)toroid(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Tim Bunce <Tim(dot)Bunce(at)pobox(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: mailing list archiver chewing patches
Date: 2010-02-01 14:03:52
Message-ID: 9837222c1002010603u5b51a22dqfb67799dc64241e4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-www

2010/2/1 Matteo Beccati <php(at)beccati(dot)com>:
> On 01/02/2010 10:26, Magnus Hagander wrote:
>>
>> Does the MBOX importer support incremental loading? Because majordomo
>> spits out MBOX files for us already.
>
> Unfortunately the aoximport shell command doesn't support incremental loading.
>
>> One option could be to use SMTP with a subscription as the primary way
>> (and we could set up a dedicated relaying from the mailserver for this
>> of course, so it's not subject to graylisting or anything like that),
>> and then daily or so load the MBOX files to cover anything that was
>> lost?
>
> I guess we could write a script that parses the mbox and adds whatever is missing, as long as we keep it as a last resort if we can't make the primary delivery a fail proof.
>
> My main concern is that we'd need to overcomplicate the thread detection algorithm so that it better deals with delayed messages: as it currently works, the replies to a missing message get linked to the "grand-parent". Injecting the missing message afterwards will put it at the same level as its replies. If it happens only once in a while I guess we can live with it, but definitely not if it happens tens of times a day.

That can potentially be a problem.

Consider the case where message A it sent. Mesasge B is a response to
A, and message C is a response to B. Now assume B is held for
moderation (because the poser is not on the list, or because it trips
some other thing), then message C will definitely arrive before
message B. Is that going to cause problems with this method?

Another case where the same thing will happen is if message delivery
of B gets for example graylisted, or is just slow from sender B, but
gets quickly delivered to the author of message A (because of a direct
CC). In this case, the author of message A may respond to it (making
message D),and this will again arrive before message B because author
A is not graylisted.

So the system definitely needs to deal with out-of-order delivery.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matteo Beccati 2010-02-01 14:10:14 Re: mailing list archiver chewing patches
Previous Message Thom Brown 2010-02-01 13:51:42 Re: Review: listagg aggregate

Browse pgsql-www by date

  From Date Subject
Next Message Matteo Beccati 2010-02-01 14:10:14 Re: mailing list archiver chewing patches
Previous Message Matteo Beccati 2010-02-01 13:36:21 Re: mailing list archiver chewing patches