Re: Slow-ish Query Needs Some Love

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Matt White <mattw922(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow-ish Query Needs Some Love
Date: 2010-02-03 18:50:56
Message-ID: 4B69C590.2070709@squeakycode.net
Lists: pgsql-performance

On 2/3/2010 11:17 AM, Matt White wrote:
> On Feb 2, 1:11 pm, a(dot)(dot)(dot)(at)squeakycode(dot)net (Andy Colson) wrote:
>> On 2/2/2010 1:03 PM, Matt White wrote:
>>
>>> On Feb 2, 6:06 am, Edgardo Portal<egportal2(dot)(dot)(dot)(at)yahoo(dot)com> wrote:
>>>> On 2010-02-02, Matt White<mattw(dot)(dot)(dot)(at)gmail(dot)com> wrote:
>>
>>>>> I have a relatively straightforward query that by itself isn't that
>>>>> slow, but we have to run it up to 40 times on one webpage load, so it
>>>>> needs to run much faster than it does. Here it is:
>>
>>>>> SELECT COUNT(*) FROM users, user_groups
>>>>> WHERE users.user_group_id = user_groups.id AND NOT users.deleted AND
>>>>> user_groups.partner_id IN
>>>>> (partner_id_1, partner_id_2);
>>
>>>>> The structure is partners have user groups which have users. In the
>>>>> test data there are over 200,000 user groups and users but only ~3000
>>>>> partners. Anyone have any bright ideas on how to speed this query up?
>>
>>>> Can you avoid running it 40 times, maybe by restructuring the
>>>> query (or making a view) along the lines of the following and
>>>> adding some logic to your page?
>>
>>>> SELECT p.partner_id, ug.user_group_id, u.id, count(*)
>>>> FROM partners p
>>>> LEFT JOIN user_groups ug
>>>> ON ug.partner_id=p.partner_id
>>>> LEFT JOIN users u
>>>> ON u.user_group_id=ug.id
>>>> WHERE NOT u.deleted
>>>> GROUP BY 1,2,3
>>>> ;
>>
>>> Thanks for the suggestion. The view didn't seem to speed things up.
>>> Perhaps we can reduce the number of times it's called, we'll see. Any
>>> additional ideas would be helpful. Thanks.
>>
>> I agree with Edgardo, I think the biggest time saver will be reducing
>> trips to the database.
>>
>> But... do you have an index on users.user_group_id?
>>
>> Does rewriting it change the plan any?
>>
>> SELECT COUNT(*) FROM users
>> inner join user_groups on (users.user_group_id = user_groups.id)
>> where NOT users.deleted
>> AND user_groups.partner_id IN (partner_id_1, partner_id_2);
>>
>> And... it looks like the row guesstimate is off a little:
>>
>> Index Scan using user_groups_partner_id_idx
>> on user_groups
>> (cost=0.00..133.86 rows=3346 width=8)
>> (actual time=0.049..96.992 rows=100001 loops=2)
>>
>> It guessed 3,346 rows, but actually got 100,001. Have you run an
>> analyze on it? If so, maybe bumping up the stats might help?
>>
>> -Andy
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performa(dot)(dot)(dot)(at)postgresql(dot)org)
>> To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
>
> Andy,
>
> I have run analyze, see my query plan in my original post. You'll have
> to forgive me for being a bit of a Postgres noob but what do you mean
> by "bumping up the stats"?

That's not what I meant. "explain analyze select..." is what you ran, and
that part is correct. What I meant was running "analyze user_groups".

See:
http://www.postgresql.org/docs/8.4/interactive/sql-analyze.html

An analyze makes PG look at a table and calculate statistics on it, so
the planner can make better guesses. By default, analyze samples only a
small percentage of the rows and estimates the entire table from that
sample. If it guesses wrong, sometimes you need to tell it to sample
more rows (i.e., a bigger percentage of the table).

By "bumping the stats" I was referring to this:

http://wiki.postgresql.org/wiki/Planner_Statistics

I have never had to do it, so I don't know much about it. It may or may
not help; just thought it was something you could try.
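If you want to try it, I believe it would look something like this (untested sketch; 500 is just an example value, the 8.4 default target is 100):

```sql
-- Tell the planner to sample partner_id more heavily,
-- then re-analyze so the new target takes effect
ALTER TABLE user_groups ALTER COLUMN partner_id SET STATISTICS 500;
ANALYZE user_groups;
```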

-Andy
