Simple query: how to optimize

From: Collin Peters <cadiolis(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Simple query: how to optimize
Date: 2005-10-28 21:53:32
Message-ID: df01c91b0510281453v5c7ed502rfb3757e886046607@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I have two tables, one is called 'users' the other is 'user_activity'.
The 'users' table simply contains the users in the system there is
about 30,000 rows. The 'user_activity' table stores the activities
the user has taken. This table has about 430,000 rows and also
(notably) has a column which tracks the type of activity. 90% of the
table is type 7 which indicates the user logged into the system.

I am trying to write a simple query that returns the last time each
user logged into the system. This is how the query looks at the
moment:

SELECT u.user_id, MAX(ua.activity_date)
FROM pp_users u
LEFT OUTER JOIN user_activity ua ON (u.user_id = ua.user_id AND
ua.user_activity_type_id = 7)
WHERE u.userstatus_id <> 4
AND age(u.joined_date) < interval '30 days'
GROUP BY u.user_id

The above query takes about 5 seconds but I'm wondering how it can be
optimized. When the query is formatted as above it does use an index
on the user_id column of the user_activity table... but the cost is
huge (cost=0.00..1396700.80).

I have tried formatting it another way with a sub-query but it takes
about the same amount to completed:

SELECT u.user_id, ua.last
FROM pp_users u
LEFT OUTER JOIN (SELECT max(activity_date) as last, user_id FROM
user_activity WHERE user_activity_type_id = 7 GROUP BY user_id) as ua
ON (u.user_id = ua.user_id)
WHERE u.userstatus_id <> 4
AND age(u.joined_date) < interval '30 days'

Can anybody offer any pointers on this scenario?

Regards,
Collin

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Havasvölgyi Ottó 2005-10-28 22:13:18 Re: Best way to check for new data.
Previous Message Rodrigo Madera 2005-10-28 21:39:10 Best way to check for new data.