Re: A challenge for the SQL gurus out there...

From: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>
To: pgsql-general(at)postgresql(dot)org
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: A challenge for the SQL gurus out there...
Date: 2008-09-07 19:31:49
Message-ID: 200809071231.49379.uwe@oss4u.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sunday 07 September 2008, Gregory Stark wrote:
> "Uwe C. Schroeder" <uwe(at)oss4u(dot)com> writes:
> > I want to get a list looking like
> >
> > forum id thread_id post_id
> > 1 6 443
> > 2 9 123
> > 3 3 557
>
> ...
>
> > It all boils down to me not being able to come up with a query that gives
> > me the latest post per forum_id.
>
> In a situation like this I would probably denormalize the tables slightly
> by adding a form_id key to the individual posts. That would make it hard to
> ever move a thread from one forum to another, though not impossible, but
> would help in this case as well as any other time you want to do an
> operation on all posts in a forum regardless of thread.
>
> If you add that column then you could index <form_id,date> and get the
> result you're looking for instantly with a DISTINCT ON query (which is a
> Postgres SQL extension).
>
> SELECT DISTINCT ON (form_id)
> forum_id, thread_id, post_id
> FROM thread
> ORDER BY forum_id, date DESC
>
> (actually you would have to make the index on <form_id, date DESC> or make
> both columns DESC in the query and then re-order them in an outer query)
>
> Alternatively you could have a trigger on posts which updates a
> last_updated field on every thread (and possibly a recent_post_id) then you
> could have a query on forums which pulls the most recently updated thread
> directly without having to join on form_post at all. That would slow down
> inserts but speed up views -- possibly a good trade-off for a forum system.

Thanks Gregory.
Just to put my final solution on the list: I ended up with a combined approach
of what you suggested:
I added the forum_id to the posts table and created 2 triggers: one that sets
the forum_id in the posts table to the forum_id in the threads table on
insert (therefor no change in the application was necessary).
The second trigger is to overcome the downside of adding the forum_id to the
posts table. On an update to forum_thread.forum_id the trigger updates all
posts in that thread to reflect the change in forum_id. That way one can just
move the whole thread by changing the forum_id and the posts are moved along
by the trigger.

Very nice! The query time is now 198ms instead of up to 48seconds !!!

Thanks for the idea

Uwe

In response to

Browse pgsql-general by date

  From Date Subject
Next Message c k 2008-09-07 20:32:05 secure connections
Previous Message Andrew Dunstan 2008-09-07 17:39:01 Re: [HACKERS] New shapshot RPMs (Sep 7 2008) are ready for testing