Re: Inaccurate row count estimation

From: "Vyacheslav Kalinin" <vka(at)mgcp(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PGSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: Inaccurate row count estimation
Date: 2008-07-15 14:41:30
Message-ID: 9b1af80e0807150741u41968008rd8ccc66a475ac03@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks for the reply, Tom.

After tracing through this I see that the problem is that we don't have
> statistics for inheritance trees, and so you're getting a default
> estimate for the selectivity of the join condition.
>

I might be wrong but I suspect that the inheritance is not the only reason
here. If I change the table definitions to:

create table pinfo_p00 (
pid integer,
constraint pk_pinfo_p00 primary key (pid),
constraint cc_pinfo_p00_pid check(pid > 0 and pid < 100000)
);

create table pinfo_p01 (
pid integer,
constraint pk_pinfo_p01 primary key (pid),
constraint cc_pinfo_p01_pid check(pid >= 100000 and pid < 200000)
);

and create a view pinfo, or just do a query with subselect:

explain analyze
select *
from contacts c
left join (
select * from pinfo_p00
union all
select * from pinfo_p01
) pi on (pi.pid = c.cpid)
where c.pid = 200 ;

the row-count assessment doesn't seem to be different:

QUERY PLAN
Nested Loop Left Join (cost=4.56..514.25 rows=3896 width=16) (actual
time=0.125..3.976 rows=40 loops=1)
Join Filter: (pinfo_p00.pid = c.cpid)
-> Bitmap Heap Scan on contacts c (cost=4.56..100.34 rows=39 width=12)
(actual time=0.069..0.421 rows=40 loops=1)
Recheck Cond: (pid = 200)
-> Bitmap Index Scan on ix_contacts_pid (cost=0.00..4.55 rows=39
width=0) (actual time=0.042..0.042 rows=40 loops=1)
Index Cond: (pid = 200)
-> Append (cost=0.00..10.59 rows=2 width=4) (actual time=0.033..0.061
rows=1 loops=40)
-> Index Scan using pk_pinfo_p00 on pinfo_p00 (cost=0.00..5.29
rows=1 width=4) (actual time=0.011..0.015 rows=0 loops=40)
Index Cond: (pinfo_p00.pid = c.cpid)
-> Index Scan using pk_pinfo_p01 on pinfo_p01 (cost=0.00..5.29
rows=1 width=4) (actual time=0.012..0.015 rows=0 loops=40)
Index Cond: (pinfo_p01.pid = c.cpid)
Total runtime: 4.341 ms

It scares me a bit as it seems that innocent-looking combination of union's
and join's could destroy the subsequent plan completely.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Daniel Johnson 2008-07-15 14:42:07 Re: [pgsql-advocacy] Pg booth staffing at OSCON
Previous Message Tom Lane 2008-07-15 14:36:21 Re: C-procedure crashed in Postgres 8.3.3 when using 'text' variable (WinXP)