From: | Benoit Delbosc <bdelbosc(at)nuxeo(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Bad query plan inside EXISTS clause |
Date: | 2010-03-10 16:59:38 |
Message-ID: | 4B97CFFA.6030000@nuxeo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Tom Lane a écrit :
> Benoit Delbosc <bdelbosc(at)nuxeo(dot)com> writes:
>> I am trying to understand why inside an EXISTS clause the query planner
>> does not use the index:
>
> I'm not sure this plan is as bad as all that. The key point is that the
> planner is expecting 52517 rows that match that users_md5 value (and the
> true number is evidently 51446, so that estimate isn't far off). That's
> about 1/48th of the table. It knows that the EXISTS case can stop as
> soon as it finds one match, so it's betting that a plain seqscan will
> hit a match faster than an index lookup would be able to, ie,
> seqscanning about 48 tuples is faster than one index lookup. This might
> be a bad bet if the users_md5 values are correlated with physical order,
> ie the matches are not randomly scattered but are all towards the end of
> the table.
exact, the data is not randomly scattered but ordered this explains why
in my case seq scan is a bad bet
Barring that, though, it could be a good bet if the table
> isn't swapped in. Which is what the default cost parameters are set
> up to assume.
there are lots of shared buffers and effective memory on this instance,
the query is executed many times I can assume that the table isn't
swapped in right ?
> I suspect your real complaint is that you expect the table to be swapped
> in, in which case what you ought to be doing is adjusting the planner's
> cost parameters. Some playing around here with a similar case suggests
> that even a small reduction in random_page_cost would make it prefer an
> indexscan for this type of situation.
excellent !
Changing the random_page_cost from 4 to 2 do the trick
SET random_page_cost = 2;
EXPLAIN ANALYZE SELECT 1 WHERE EXISTS (SELECT 1 FROM read_acls_cache
WHERE users_md5 = '9bc9012eb29c0bb2ae3cc7b5e78c2acf');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=1.06..1.07 rows=1 width=0) (actual time=0.048..0.048
rows=1 loops=1)
One-Time Filter: $0
InitPlan 1 (returns $0)
-> Index Scan using read_acls_cache_users_md5_idx on
read_acls_cache (cost=0.00..55664.21 rows=52517 width=0) (actual
time=0.045..0.045 rows=1 loops=1)
Index Cond: ((users_md5)::text =
'9bc9012eb29c0bb2ae3cc7b5e78c2acf'::text)
Total runtime: 0.087 ms
(6 rows)
466/0.087 = 5360 thanks !
kind regards
ben
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2010-03-10 21:55:32 | Re: Strange workaround for slow query |
Previous Message | Harald Fuchs | 2010-03-10 16:29:25 | Re: Strange workaround for slow query |