Skip site navigation (1) Skip section navigation (2)

Re: 8.1 planner problem ?

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Gianluca Alberici <gianluca(at)abinetworks(dot)biz>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 8.1 planner problem ?
Date: 2007-11-27 10:12:38
Message-ID: 474BED96.2020208@enterprisedb.com (view raw or flat)
Thread:
Lists: pgsql-performance
Gianluca Alberici wrote:
> I have ran into an interesting problem with 8.1 and i would like anybody
> to explain me if there's a problem with the planner or there's a problem
> with myself. In both cases a solution is welcome. The following query:
> 
> SELECT sum(qty) FROM
>     _abi_main_pof_r ampr
>     inner join _abi_main_items ami on ampr.itemid=ami.id
>     inner join _abi_main_pof_t ampt on ampr.poftid=ampt.poftid
>     inner join _abi_main_contacts amc on ampt.contactid=amc.codest
> WHERE
>     ampt.doctypeid in ('BCA','BSC')
>     and amc.descr ilike '%SOCO%'
>     and ampr.type=0;
> 
> is explain analyze'd as follows
> 
> Aggregate  (cost=1220.65..1220.66 rows=1 width=4) (actual 
> time=95937.824..95937.824 rows=1 loops=1)
>   ->  Nested Loop  (cost=0.00..1220.65 rows=1 width=4) (actual 
> time=3695.250..95936.292 rows=1503 loops=1)
>         ->  Nested Loop  (cost=0.00..1214.64 rows=1 width=8) (actual 
> time=3695.229..95924.697 rows=1503 loops=1)
>               Join Filter: ("inner".poftid = "outer".poftid)
>               ->  Nested Loop  (cost=0.00..79.16 rows=1 width=4) (actual 
> time=0.039..22.547 rows=2437 loops=1)
>                     Join Filter: ("inner".contactid = "outer".codest)
>                     ->  Seq Scan on _abi_main_contacts amc 
> (cost=0.00..1.29 rows=1 width=4) (actual time=0.029..0.034 rows=1 loops=1)
>                           Filter: ((descr)::text ~~* '%SOCO%'::text)
>                     ->  Seq Scan on _abi_main_pof_t ampt 
> (cost=0.00..77.53 rows=27 width=8) (actual time=0.006..15.820 rows=2702 
> loops=1)
>                           Filter: (((doctypeid)::text = 'BCA'::text) OR 
> ((doctypeid)::text = 'BSC'::text))
>               ->  Seq Scan on _abi_main_pof_r ampr  (cost=0.00..1132.81 
> rows=214 width=12) (actual time=0.034..35.986 rows=8271 loops=2437)
>                     Filter: ("type" = 0)

The problem is in the estimate of that seq scan of _abi_main_pof_r 
table. The planner estimates that there's 214 rows with type=0, but in 
reality there's 8271. That misestimate throws off the rest of the plan.

> ...The same query, but with a condition change as ampr.type != 1 instead
> of ampr.type=0
> 
> SELECT sum(qty) FROM
>     _abi_main_pof_r ampr
>     inner join _abi_main_items ami on ampr.itemid=ami.id
>     inner join _abi_main_pof_t ampt on ampr.poftid=ampt.poftid
>     inner join _abi_main_contacts amc on ampt.contactid=amc.codest
> WHERE
>     ampt.doctypeid in ('BCA','BSC')
>     and amc.descr ilike '%SOCO%'
>     and ampr.type != 1;
> 
> 
> is explain analyze'd as follows:
> 
> Aggregate  (cost=1446.13..1446.14 rows=1 width=4) (actual 
> time=81.609..81.609 rows=1 loops=1)
>   ->  Nested Loop  (cost=77.60..1446.12 rows=2 width=4) (actual 
> time=22.597..80.944 rows=1503 loops=1)
>         ->  Nested Loop  (cost=77.60..1434.12 rows=2 width=8) (actual 
> time=22.577..72.785 rows=1503 loops=1)
>               Join Filter: ("inner".contactid = "outer".codest)
>               ->  Seq Scan on _abi_main_contacts amc  (cost=0.00..1.29 
> rows=1 width=4) (actual time=0.030..0.036 rows=1 loops=1)
>                     Filter: ((descr)::text ~~* '%SOCO%'::text)
>               ->  Hash Join  (cost=77.60..1427.52 rows=425 width=12) 
> (actual time=22.536..69.034 rows=8271 loops=1)
>                     Hash Cond: ("outer".poftid = "inner".poftid)
>                     ->  Seq Scan on _abi_main_pof_r ampr 
> (cost=0.00..1132.81 rows=42571 width=12) (actual time=0.035..37.045 
> rows=8271 loops=1)
>                           Filter: ("type" <> 1)

Here you can see that the planner is estimating that there's 42571 rows 
with type <> 1, which is also way off the reality in the other 
direction, but this time the plan it chooses happens to be better.

Run ANALYZE. The planner needs to have statistics on how many rows of 
each type there is to come up with the proper plan.

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

In response to

pgsql-performance by date

Next:From: clusterDate: 2007-11-27 16:33:36
Subject: Query only slow on first run
Previous:From: Gianluca AlbericiDate: 2007-11-27 07:24:20
Subject: 8.1 planner problem ?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group