Re: Optimization idea

From: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimization idea
Date: 2010-04-26 02:52:23
Message-ID: 4BD4FFE7.9090305@dc.baikal.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


> On Thu, Apr 22, 2010 at 10:37 PM, Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> wrote:
>
>> I don't think this is just an issue with statistics, because the same
>> problem arises when I try executing a query like this:
>>
>
> I'm not sure how you think this proves that it isn't a problem with
> statistics, but I think what you should be focusing on here, looking
> back to your original email, is that the plans that are actually much
> faster have almost as much estimated cost as the slower one. Since
> all your data is probably fully cached, at a first cut, I might try
> setting random_page_cost and seq_page_cost to 0.005 or so, and
> adjusting effective_cache_size to something appropriate.
>
> ...Robert
>
>

Ok. I thougth it's quite obvious because of these two queries. I can't
understand why the estimated rows count is 40040 in the first plan.

test=# explain analyze select * from t2 join t1 on t1.t = t2.t where
t1.t in (2,3,4);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1.09..2319.87 rows=40040 width=32) (actual
time=0.050..356.269 rows=400 loops=1)
Hash Cond: (t2.t = t1.t)
-> Seq Scan on t2 (cost=0.00..1543.00 rows=100100 width=16) (actual
time=0.013..176.087 rows=100100 loops=1)
-> Hash (cost=1.07..1.07 rows=2 width=16) (actual time=0.023..0.023
rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on t1 (cost=0.00..1.07 rows=2 width=16) (actual
time=0.006..0.014 rows=3 loops=1)
Filter: (t = ANY ('{2,3,4}'::bigint[]))
Total runtime: 356.971 ms
(8 rows)

test=# explain analyze select * from t2 join t1 on t1.t = t2.t where
t1.t = 2 union all select * from t2 join t1 on t1.t = t2.t where t1.t =
3 union all select * from t2 join t1 on t1.t = t2.t where t1.t = 4;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..112.42 rows=407 width=32) (actual time=0.048..3.487
rows=400 loops=1)
-> Nested Loop (cost=0.00..47.51 rows=197 width=32) (actual
time=0.045..1.061 rows=200 loops=1)
-> Seq Scan on t1 (cost=0.00..1.06 rows=1 width=16) (actual
time=0.011..0.014 rows=1 loops=1)
Filter: (t = 2)
-> Index Scan using t_idx on t2 (cost=0.00..44.48 rows=197
width=16) (actual time=0.026..0.382 rows=200 loops=1)
Index Cond: (pg_temp_2.t2.t = 2)
-> Nested Loop (cost=0.00..32.67 rows=117 width=32) (actual
time=0.019..0.599 rows=100 loops=1)
-> Seq Scan on t1 (cost=0.00..1.06 rows=1 width=16) (actual
time=0.003..0.006 rows=1 loops=1)
Filter: (t = 3)
-> Index Scan using t_idx on t2 (cost=0.00..30.43 rows=117
width=16) (actual time=0.010..0.211 rows=100 loops=1)
Index Cond: (pg_temp_2.t2.t = 3)
-> Nested Loop (cost=0.00..28.17 rows=93 width=32) (actual
time=0.017..0.534 rows=100 loops=1)
-> Seq Scan on t1 (cost=0.00..1.06 rows=1 width=16) (actual
time=0.005..0.008 rows=1 loops=1)
Filter: (t = 4)
-> Index Scan using t_idx on t2 (cost=0.00..26.18 rows=93
width=16) (actual time=0.007..0.187 rows=100 loops=1)
Index Cond: (pg_temp_2.t2.t = 4)
Total runtime: 4.190 ms
(17 rows)

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Cédric Villemain 2010-04-26 09:33:29 Re: Optimization idea
Previous Message Коротков Александр 2010-04-25 18:22:29 Planner issue on sorting joining of two tables with limit