Re: Slow-ish Query Needs Some Love

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Matt White <mattw922(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow-ish Query Needs Some Love
Date: 2010-02-02 20:11:34
Message-ID: 4B6886F6.4080402@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 2/2/2010 1:03 PM, Matt White wrote:
> On Feb 2, 6:06 am, Edgardo Portal<egportal2(dot)(dot)(dot)(at)yahoo(dot)com> wrote:
>> On 2010-02-02, Matt White<mattw(dot)(dot)(dot)(at)gmail(dot)com> wrote:
>>
>>> I have a relatively straightforward query that by itself isn't that
>>> slow, but we have to run it up to 40 times on one webpage load, so it
>>> needs to run much faster than it does. Here it is:
>>
>>> SELECT COUNT(*) FROM users, user_groups
>>> WHERE users.user_group_id = user_groups.id AND NOT users.deleted AND
>>> user_groups.partner_id IN
>>> (partner_id_1, partner_id_2);
>>
>>> The structure is partners have user groups which have users. In the
>>> test data there are over 200,000 user groups and users but only ~3000
>>> partners. Anyone have any bright ideas on how to speed this query up?
>>
>> Can you avoid running it 40 times, maybe by restructuring the
>> query (or making a view) along the lines of the following and
>> adding some logic to your page?
>>
>> SELECT p.partner_id, ug.user_group_id, u.id, count(*)
>> FROM partners p
>> LEFT JOIN user_groups ug
>> ON ug.partner_id=p.partner_id
>> LEFT JOIN users u
>> ON u.user_group_id=ug.id
>> WHERE NOT u.deleted
>> GROUP BY 1,2,3
>> ;
>
> Thanks for the suggestion. The view didn't seem to speed things up.
> Perhaps we can reduce the number of times it's called, we'll see. Any
> additional ideas would be helpful. Thanks.

I agree with Edgardo, I think the biggest time saver will be reducing
trips to the database.

But... do you have an index on users.user_group_id?

Does rewriting it change the plan any?

SELECT COUNT(*) FROM users
inner join user_groups on (users.user_group_id = user_groups.id)
where NOT users.deleted
AND user_groups.partner_id IN (partner_id_1, partner_id_2);

And... it looks like the row guestimate is off a litte:

Index Scan using user_groups_partner_id_idx
on user_groups
(cost=0.00..133.86 rows=3346 width=8)
(actual time=0.049..96.992 rows=100001 loops=2)

It guessed 3,346 rows, but actually got 100,001. Have you run an
analyze on it? If so, maybe bumping up the stats might help?

-Andy

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2010-02-02 20:25:28 Re: System overload / context switching / oom, 8.3
Previous Message Ing. Marcos Orti­z Valmaseda 2010-02-02 19:59:07 Re: Queries within a function