Re: Simple query: how to optimize

From: "PostgreSQL" <martin(at)portant(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Simple query: how to optimize
Date: 2005-10-29 00:37:11
Message-ID: djug7r$lth$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Postgres is somewhat speed-challenged on aggregate functions.
The most-repeated work-around would be something like:

SELECT u.user_id,
(SELECT activity_date
FROM user_activity
WHERE user_activity.user_id = pp_users.user_id
AND user_activity_type_id = 7
ORDER BY activity_date DESC
LIMIT 1)
FROM pp_users u
WHERE u.userstatus_id <> 4
AND age(u.joined_date) < interval '30 days'

(code above is untested) I've read that aggregate functions are
improved in the 8.1 code. I'm running 8.1beta3 on one machine
but haven't experimented to verify the claimed improvements.

Martin Nickel

"Collin Peters" <cadiolis(at)gmail(dot)com> wrote in message
news:df01c91b0510281453v5c7ed502rfb3757e886046607(at)mail(dot)gmail(dot)com(dot)(dot)(dot)
>I have two tables, one is called 'users' the other is 'user_activity'.
> The 'users' table simply contains the users in the system there is
> about 30,000 rows. The 'user_activity' table stores the activities
> the user has taken. This table has about 430,000 rows and also
> (notably) has a column which tracks the type of activity. 90% of the
> table is type 7 which indicates the user logged into the system.
>
> I am trying to write a simple query that returns the last time each
> user logged into the system. This is how the query looks at the
> moment:
>
> SELECT u.user_id, MAX(ua.activity_date)
> FROM pp_users u
> LEFT OUTER JOIN user_activity ua ON (u.user_id = ua.user_id AND
> ua.user_activity_type_id = 7)
> WHERE u.userstatus_id <> 4
> AND age(u.joined_date) < interval '30 days'
> GROUP BY u.user_id
>
> The above query takes about 5 seconds but I'm wondering how it can be
> optimized. When the query is formatted as above it does use an index
> on the user_id column of the user_activity table... but the cost is
> huge (cost=0.00..1396700.80).
>
> I have tried formatting it another way with a sub-query but it takes
> about the same amount to completed:
>
> SELECT u.user_id, ua.last
> FROM pp_users u
> LEFT OUTER JOIN (SELECT max(activity_date) as last, user_id FROM
> user_activity WHERE user_activity_type_id = 7 GROUP BY user_id) as ua
> ON (u.user_id = ua.user_id)
> WHERE u.userstatus_id <> 4
> AND age(u.joined_date) < interval '30 days'
>
> Can anybody offer any pointers on this scenario?
>
> Regards,
> Collin
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Martin Lesser 2005-10-29 11:10:31 Effects of cascading references in foreign keys
Previous Message Steinar H. Gunderson 2005-10-29 00:12:24 Re: Simple query: how to optimize