Re: Why is this query running slowly?

From: Yang Zhang <yanghatespam(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Why is this query running slowly?
Date: 2011-09-15 09:07:49
Message-ID: CAKxBDU9yOpjxQytG8dXS8TWyoFjUeURroXuakxovJi=XCSeYxg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Sep 15, 2011 at 1:22 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> On 15 Září 2011, 9:53, Yang Zhang wrote:
>> I have a simple query that's been running for a while, which is fine,
>> but it seems to be running very slowly, which is a problem:
>>
>> mydb=# explain select user_id from den where user_id not in (select
>> duid from user_mappings) and timestamp between '2009-04-01' and
>> '2010-04-01';
>>
>>            QUERY PLAN
>> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Seq Scan on den  (cost=711.58..66062724212.74 rows=22634720 width=4)
>>    Filter: (("timestamp" >= '2009-04-01 00:00:00'::timestamp without
>> time zone) AND ("timestamp" <= '2010-04-01 00:00:00'::timestamp
>> without time zone) AND (NOT (SubPlan 1)))
>>    SubPlan 1
>>      ->  Materialize  (cost=711.58..1223.38 rows=36780 width=4)
>>            ->  Seq Scan on user_mappings  (cost=0.00..530.80 rows=36780
>> width=4)
>>
>> user_mappings is fairly small:
>
> The problem is that for each of the 22634720 rows in "den" a separate
> uncorrelated subquery (a seq scan on user_mappings) has to be executed.
> Althogh the subquery is not very expensive, multiplied by the number of
> rows in "den" the total cost is extreme.
>
> The only solution is to get rid of the "not in" subquery - try to turn it
> to a join like this:
>
> SELECT user_id FROM den LEFT JOIN user_mappings ON (user_id = duid)
> WHERE (timestamp BETWEEN '2009-04-01' AND '2010-04-01')
>  AND (duid IS NULL)
>
> That should give the same result I guess.

This worked great, thank you. Too bad the planner isn't smart enough
to do this yet!

>
> Tomas
>
>

--
Yang Zhang
http://yz.mit.edu/

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tomas Vondra 2011-09-15 09:29:30 Re: Why is this query running slowly?
Previous Message Toby Corkindale 2011-09-15 08:26:25 Re: Why is this query running slowly?