Re: [HACKERS] why is postgres estimating so badly?

From: "Luis Alberto Amigo Navarro" <lamigo(at)atc(dot)unican(dot)es>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-sql(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] why is postgres estimating so badly?
Date: 2002-07-18 08:18:35
Message-ID: 002701c22e33$b68b7dc0$cab990c1@atc.unican.es
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql


> > AND part.name LIKE '%green%'
>
> It's difficult for the planner to produce a decent estimate for the
> selectivity of an unanchored LIKE clause, since there are no statistics
> it can use for the purpose. We recently changed FIXED_CHAR_SEL in
> src/backend/utils/adt/selfuncs.c from 0.04 to 0.20, which would make
> this particular case come out better. (I believe the estimate would
> work out to about 320, if part is 200K rows; that should be enough to
> produce at least some change of plan.) You could try patching your
> local installation likewise.

Here are the results, worse than before:
NOTICE: QUERY PLAN:

Sort (cost=25209.88..25209.88 rows=1 width=93) (actual
time=1836143.78..1836144.48 rows=175 loops=1)
-> Aggregate (cost=25209.85..25209.87 rows=1 width=93) (actual
time=1803559.97..1836136.47 rows=175 loops=1)
-> Group (cost=25209.85..25209.86 rows=2 width=93) (actual
time=1803348.04..1816093.89 rows=325302 loops=1)
-> Sort (cost=25209.85..25209.85 rows=2 width=93) (actual
time=1803347.97..1804795.41 rows=325302 loops=1)
-> Hash Join (cost=25208.43..25209.84 rows=2 width=93)
(actual time=1744714.61..1772790.19 rows=325302 loops=1)
-> Seq Scan on nation (cost=0.00..1.25 rows=25
width=15) (actual time=13.92..14.84 rows=25 loops=1)
-> Hash (cost=25208.42..25208.42 rows=2
width=78) (actual time=1744603.74..1744603.74 rows=0 loops=1)
-> Nested Loop (cost=0.00..25208.42 rows=2
width=78) (actual time=139.21..1740110.04 rows=325302 loops=1)
-> Nested Loop (cost=0.00..25201.19
rows=2 width=70) (actual time=122.37..1687895.49 rows=325302 loops=1)
-> Nested Loop
(cost=0.00..25187.93 rows=4 width=62) (actual time=121.75..856097.27
rows=325302 loops=1)
-> Nested Loop
(cost=0.00..17468.91 rows=1280 width=24) (actual time=78.43..19698.77
rows=43424 loops=1)
-> Seq Scan on part
(cost=0.00..12399.00 rows=320 width=4) (actual time=29.57..4179.70
rows=10856 loops=1)
-> Index Scan using
partsupp_pkey on partsupp (cost=0.00..15.79 rows=4 width=20) (actual
time=1.17..1.33 rows=4 loops=10856)
-> Index Scan using
l_partsupp_index on lineitem (cost=0.00..6.02 rows=1 width=38) (actual
time=2.83..18.97 rows=7 loops=43424)
-> Index Scan using orders_pkey
on orders (cost=0.00..3.23 rows=1 width=8) (actual time=2.47..2.50 rows=1
loops=325302)
-> Index Scan using supplier_pkey on
supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.08..0.09 rows=1
loops=325302)
Total runtime: 1836375.16 msec

It looks even worse, another advice?, or maybe a query change. here is the
query again:
SELECT
nation,
o_year,
CAST((sum(amount))AS NUMERIC(10,2))AS sum_profit
FROM(
SELECT
nation.name AS nation,
EXTRACT(year FROM orders.orderdate) AS o_year,

lineitem.extendedprice*(1-lineitem.discount)-partsupp.supplycost*lineitem.qu
antity AS amount
FROM
part,
supplier,
lineitem,
partsupp,
orders,
nation
WHERE
supplier.suppkey=lineitem.suppkey
AND partsupp.suppkey=lineitem.suppkey
AND partsupp.partkey=lineitem.partkey
AND part.partkey=lineitem.partkey
AND orders.orderkey=lineitem.orderkey
AND supplier.nationkey=nation.nationkey
AND part.name LIKE '%green%'
) AS profit
GROUP BY
nation,
o_year
ORDER BY
nation,
o_year DESC;

Thanks and regards

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Naeslund(f) 2002-07-18 09:41:51 [PATCH] Win32 native fixes after SSL updates (+more)
Previous Message Zeugswetter Andreas SB SD 2002-07-18 08:01:20 Re: error codes

Browse pgsql-sql by date

  From Date Subject
Next Message Steve Brett 2002-07-18 09:41:25 Re: how do i import my sql query result to a file
Previous Message Rajesh Kumar Mallah. 2002-07-18 06:39:48 Re: Cascading deletions does not seem to work inside PL/PGSQL functions.