A challenge for the SQL gurus out there...

From: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: A challenge for the SQL gurus out there...
Date: 2008-09-07 09:53:15
Message-ID: 200809070253.15422.uwe@oss4u.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


or maybe not and I'm just not getting it.
So here's the scenario:

I have 3 tables

forum: with primary key "id"
forum_thread: again primary key "id" and a foreign key "forum_id" referencing
th primary key of the forum table
forum_post: again primary key "id" with a forign key "thread_id" referencing
the primary key of the forum_thread table

The forum_post table also has a field "date_posted" (timestamp) with an index
on it.

What I need is an efficient way to create overviews (just think about a forum)
I.e. the forum table has 3 records, one for each forum category

I want to get a list looking like

forum id thread_id post_id
1 6 443
2 9 123
3 3 557

The trick is, that I need the latest post (by the date posted column) for each
category (speak forum_id). Due to the keys the forum_thread table has to be
involved.

I've been thinking about this for hours now, but I just can't come up with a
query that will give me 3 records, one for each category showing the latest
post.
I do have a rather expensive query that involves a lot of sorting, but the
forum I'm running has around 40000 posts now and the query takes around 4
seconds - which is unacceptable. So there has to be a better way to query
this.

Currently I'm using a view to assemble a list with the latest post for each
forum thread and then I join that view with the forum categories, sort it and
limit it. The thing is that the sorting takes waaay to long, simply because I
sort a ton of records just to limit them. So my idea was to limit the
resultset before sorting takes place, which would probably cut the query
execution time to milliseconds instead of seconds and it would deliver
predictable results that are not as dependent on number of posts as they are
now.

The number of posts per thread is usually fairly equal. Even the longest
threads won't make it past 1000 posts, so my intention is to sort a maximum
of 1000 records instead of 40000 (due to the join).

It all boils down to me not being able to come up with a query that gives me
the latest post per forum_id.

So any input would be very much appreciated.

Uwe

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Gregory Stark 2008-09-07 10:09:35 Re: A challenge for the SQL gurus out there...
Previous Message Gregory Stark 2008-09-07 09:50:17 Re: recover in single-user backend fails