Optimizer's issue

From: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>
To: pgsql-performance(at)postgresql(dot)org
Subject: Optimizer's issue
Date: 2008-04-24 01:14:54
Message-ID: 480FDF0E.3000909@dc.baikal.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I found strange issue in very simple query. Statistics for all columns
is on the level 1000 but I also tried other levels.

create table g (
id bigint primary key,
isgroup boolean not null);

create table a (
groupid bigint references g(id),
id bigint,
unique(id, groupid));

analyze g;
analyze a;

select count(*) from a
294

select count(*) from g
320

explain analyze
select *
from g
join a on a.groupid = g.id
where g.isgroup

Hash Join (cost=5.35..11.50 rows=11 width=25) (actual time=0.261..1.755
rows=294 loops=1)
Hash Cond: (a.groupid = g.id)
-> Seq Scan on a (cost=0.00..4.94 rows=294 width=16) (actual
time=0.047..0.482 rows=294 loops=1)
-> Hash (cost=5.20..5.20 rows=12 width=9) (actual time=0.164..0.164
rows=12 loops=1)
-> Seq Scan on g (cost=0.00..5.20 rows=12 width=9) (actual
time=0.042..0.136 rows=12 loops=1)
Filter: isgroup
Total runtime: 2.225 ms

And this is more interesting:
explain analyze
select *
from g
join a on a.groupid = g.id
where not g.isgroup

Hash Join (cost=9.05..17.92 rows=283 width=25) (actual
time=2.038..2.038 rows=0 loops=1)
Hash Cond: (a.groupid = g.id)
-> Seq Scan on a (cost=0.00..4.94 rows=294 width=16) (actual
time=0.046..0.478 rows=294 loops=1)
-> Hash (cost=5.20..5.20 rows=308 width=9) (actual time=1.090..1.090
rows=308 loops=1)
-> Seq Scan on g (cost=0.00..5.20 rows=308 width=9) (actual
time=0.038..0.557 rows=308 loops=1)
Filter: (NOT isgroup)
Total runtime: 2.126 ms

PostgreSQL 8.3
These queries are part of big query and optimizer put them on the leaf
of query tree, so rows miscount causes a real problem.

Statistics for table a:
id
--
histogram_bounds: {1,40,73,111,143,174,204,484,683,715,753}
correlation: 0.796828

groupid
-------
n_distinct: 12
most_common_vals: {96,98,21,82,114,131,48,44,173,682,752}
most_common_freqs:
{0.265306,0.166667,0.163265,0.136054,0.0884354,0.0782313,0.0714286,0.00680272,0.00680272,0.00680272,0.00680272}
correlation: 0.366704

for table g:
id
--
histogram_bounds: {1,32,64,101,134,166,199,451,677,714,753}
correlation: 1

isgroup
-------
n_distinct: 2
most_common_freqs: {0.9625,0.0375}
correlation: 0.904198

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Albe Laurenz 2008-04-24 08:47:32 Re: Optimizer's issue
Previous Message Joshua D. Drake 2008-04-23 21:17:28 Re: Sun Talks about MySQL