Skip site navigation (1) Skip section navigation (2)

Optimizer's issue

From: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>
To: pgsql-performance(at)postgresql(dot)org
Subject: Optimizer's issue
Date: 2009-04-17 04:50:15
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
I have a problem with a part of big query because of incorrect 
estimation. It's easy to emulate the case:

create table a (id bigint, id2 bigint);
create table b (id bigint, id2 bigint);

insert into a (id, id2)
select random() * 100000, random() * 100
from generate_series(1, 100000);

insert into b (id, id2)
select id, case when random() < 0.1 then random() * 100 else id2 end
from a;

alter table a alter column id set statistics 1000;
alter table a alter column id2 set statistics 1000;
alter table b alter column id set statistics 1000;
alter table b alter column id2 set statistics 1000;

analyze a;
analyze b;

explain analyze
select *
from a
 join b on = and b.id2 = a.id2;

"Hash Join  (cost=1161.00..3936.15 rows=1661 width=32) (actual 
time=424.865..1128.194 rows=91268 loops=1)"
"  Hash Cond: (( = AND (a.id2 = b.id2))"
"  ->  Seq Scan on a  (cost=0.00..791.00 rows=100000 width=16) (actual 
time=0.013..197.908 rows=100000 loops=1)"
"  ->  Hash  (cost=791.00..791.00 rows=100000 width=16) (actual 
time=424.777..424.777 rows=100000 loops=1)"
"        ->  Seq Scan on b  (cost=0.00..791.00 rows=100000 width=16) 
(actual time=0.010..197.536 rows=100000 loops=1)"
"Total runtime: 1305.121 ms"

pgsql-performance by date

Next:From: Tom LaneDate: 2009-04-17 15:02:34
Subject: Re: No hash join across partitioned tables?
Previous:From: Craig RingerDate: 2009-04-17 01:22:08
Subject: Re: GiST index performance

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group