
Re: mailing list archiver chewing patches

From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Matteo Beccati <php(at)beccati(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>,Dimitri Fontaine <dfontaine(at)hi-media(dot)com>,Dave Page <dpage(at)pgadmin(dot)org>, Abhijit Menon-Sen <ams(at)toroid(dot)org>,Alvaro Herrera <alvherre(at)commandprompt(dot)com>,Andrew Dunstan <andrew(at)dunslane(dot)net>,Tim Bunce <Tim(dot)Bunce(at)pobox(dot)com>,pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: mailing list archiver chewing patches
Date: 2010-01-12 20:16:47
Message-ID: 20100112201647.GC18076@oak.highrise.ca
Lists: pgsql-hackers, pgsql-www

I'll note that the whole idea of an "email archive" interface might be a
very good "advocacy" project as well.  AOX might not be a perfect fit,
but it could be a good learning experience... Really, all the PG mail
archives need is:

1) A nice normalized DB schema representing mail messages and their
   relations to other messages and "recipients" (or "folders")

2) An "injector" that can parse an email message, decompose it into
   the various parts/tables of the DB schema, and insert it

3) A nice set of SQL queries to return messages, parts, threads, and
   folders based on $criteria (search, id, folder, etc)

4) A web interface to view the messages/threads/parts that #3 returns
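
As a rough illustration of the parsing half of item 2, here is a minimal
sketch in Python using only the standard library `email` package. The
function name `decompose` and the row layouts are hypothetical (they are
not AOX's schema or anyone's existing code), and the actual INSERTs into
the database are left out.

```python
from email import message_from_string
from email.utils import getaddresses


def decompose(raw):
    """Split one raw message into rows for three hypothetical tables:
    messages, recipients, and parts."""
    msg = message_from_string(raw)

    # One row for the message itself; In-Reply-To is what links it
    # to its parent for threading.
    message_row = {
        "message_id": msg.get("Message-ID"),
        "in_reply_to": msg.get("In-Reply-To"),
        "subject": msg.get("Subject"),
    }

    # One row per address found in To and Cc.
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = [addr for _name, addr in getaddresses(headers)]

    # One row per leaf MIME part; multipart containers are skipped.
    # decode=True undoes base64/quoted-printable and returns bytes,
    # which is where a bytea column would come in.
    leaves = [p for p in msg.walk() if not p.is_multipart()]
    parts = [
        {
            "position": position,
            "content_type": part.get_content_type(),
            "payload": part.get_payload(decode=True),
        }
        for position, part in enumerate(leaves)
    ]
    return message_row, recipients, parts
```

Keeping the decoded payload as bytes and the content type beside it
leaves the text-vs-binary decision to the schema rather than the parser.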

The largest part of this is #1, but a good schema would be a very good
candidate to show off some of PG's more powerful features in a way that
"others" could see (like the movie store sample somewhere), such as:
  1) full text search
  2) text vs bytea handling (thinking of all the mime parts, and encoding,
     etc)
  3) CTEs, ltree, recursion, etc, for threading/searching
  4) Triggers for "materialized views" (for quick threading/folder queries)
  5) expression indexes
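
To make the threading point concrete, here is a minimal sketch of a
recursive CTE that walks a thread from its root message. Loud
assumptions: the `messages` table and its columns are hypothetical, and
the sketch runs against an in-memory SQLite database through Python's
`sqlite3` module purely so that it is self-contained. It is a stand-in,
not PG; the WITH RECURSIVE query has the same shape there, but the
parameter placeholder depends on the driver.

```python
import sqlite3


def thread_of(conn, root_id):
    """Return (message_id, depth) for every message in the thread
    rooted at root_id, following in_reply_to links downward."""
    return conn.execute(
        """
        WITH RECURSIVE thread(message_id, depth) AS (
            -- anchor: the root message of the thread
            SELECT message_id, 0 FROM messages WHERE message_id = ?
            UNION ALL
            -- recursive step: direct replies to anything found so far
            SELECT m.message_id, t.depth + 1
            FROM messages m
            JOIN thread t ON m.in_reply_to = t.message_id
        )
        SELECT message_id, depth FROM thread ORDER BY depth, message_id
        """,
        (root_id,),
    ).fetchall()


# Hypothetical sample data: one four-message thread and one unrelated message.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " message_id TEXT PRIMARY KEY,"
    " in_reply_to TEXT REFERENCES messages(message_id),"
    " subject TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        ("a", None, "archiver chewing patches"),
        ("b", "a", "Re: archiver chewing patches"),
        ("c", "b", "Re: archiver chewing patches"),
        ("d", "a", "Re: archiver chewing patches"),
        ("x", None, "unrelated"),
    ],
)

thread = thread_of(conn, "a")
```

An ltree column or a trigger-maintained thread table would trade this
query-time recursion for extra work at insert time.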

a.

* Matteo Beccati <php(at)beccati(dot)com> [100112 14:56]:

> Having played with it, here's my feedback about AOX:
>
> pros:
> - seemed to be working reliably;
> - does most of the dirty job of parsing emails, splitting parts, etc
> - highly normalized schema
> - thread support (partial?)
>
> cons:
> - directly publishing the live email feed might not be desirable
> - queries might end up being a bit complicated for simple tasks
> - might not be easy to add additional processing in the workflow

> If there isn't a fully usable thread hierarchy I was more thinking of
> ltree, mainly because I've successfully used it in the past and I haven't
> had enough time yet to look at CTEs. But if performance is comparable I
> don't see a reason why we shouldn't use them.

> With all that said, I can't promise anything as it all depends on how  
> much spare time I have, but I can proceed with the evaluation if you  
> think it's useful. I have a feeling that AOX is not truly the right tool  
> for the job, but we might be able to customise it to suit our needs. Are  
> there any other requirements that weren't specified?

-- 
Aidan Van Dyk                                             Create like a god,
aidan(at)highrise(dot)ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
