Re: mailing list archiver chewing patches

From: David Fetter <david(at)fetter(dot)org>
To: Matteo Beccati <php(at)beccati(dot)com>
Cc: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Magnus Hagander <magnus(at)hagander(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, Abhijit Menon-Sen <ams(at)toroid(dot)org>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Tim Bunce <Tim(dot)Bunce(at)pobox(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: mailing list archiver chewing patches
Date: 2010-01-14 18:36:09
Message-ID: 20100114183609.GB6859@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-www

On Thu, Jan 14, 2010 at 03:08:22PM +0100, Matteo Beccati wrote:
> Il 14/01/2010 14:39, Dimitri Fontaine ha scritto:
> >Matteo Beccati<php(at)beccati(dot)com> writes:
> >>I've extended AOX with a trigger that takes care of filling a separate table
> >>that's used to display the index pages. The new table also stores threading
> >>information (standard headers + Exchange headers support) and whether or not
> >>the email has attachments.
> >>
> >>Please check the updated PoC: http://archives.beccati.org/
> >
> >Looks pretty good, even if some thread are still separated (this one for
> >example), and the ordering looks strange.
>
> This one is separated as the first one is not in the archive yet,
> thus to the system there are multiple parent messages. It shouldn't
> happen with full archives. About sorting, here's the query I've used
> (my first try with CTEs incidentally):
>
> WITH RECURSIVE t (mailbox, uid, date, subject, sender,
> has_attachments, parent_uid, idx, depth) AS (
> SELECT mailbox, uid, date, subject, sender, has_attachments,
> parent_uid, uid::text, 1
> FROM arc_messages
> WHERE parent_uid IS NULL AND mailbox = 15
> UNION ALL
> SELECT a.mailbox, a.uid, a.date, a.subject, a.sender,
> a.has_attachments, a.parent_uid, t.idx || '.' || a.uid::text,
> t.depth + 1
> FROM t JOIN arc_messages a USING (mailbox)
> WHERE t.uid = a.parent_uid
> ) SELECT * FROM t ORDER BY idx

> Any improvements to sorting are welcome :)

This is probably better written as:

WITH RECURSIVE t (
mailbox,
uid,
date,
subject,
sender,
has_attachments,
"path"
)
AS (
SELECT
mailbox,
uid,
date,
subject,
sender,
has_attachments,
ARRAY[uid]
FROM
arc_messages
WHERE
parent_uid IS NULL AND
mailbox = 15
UNION ALL
SELECT
a.mailbox,
a.uid,
a.date,
a.subject,
a.sender,
a.has_attachments,
t."path" || a.uid,
FROM
t JOIN arc_messages a
ON (
a.mailbox = t.mailbox AND
t.uid = a.parent_uid
)
)
SELECT *
FROM t
ORDER BY "path";

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2010-01-14 18:40:32 Re: EXPLAIN, utility statement parameters, and recent plpgsql changes
Previous Message Hitoshi Harada 2010-01-14 18:32:32 Re: review: More frame options in window functions

Browse pgsql-www by date

  From Date Subject
Next Message Matteo Beccati 2010-01-15 19:38:50 Re: mailing list archiver chewing patches
Previous Message Magnus Hagander 2010-01-14 15:09:43 Re: mailing list archiver chewing patches