why is postgres estimating so badly?

From: "Luis Alberto Amigo Navarro" <lamigo(at)atc(dot)unican(dot)es>
To: <pgsql-sql(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: why is postgres estimating so badly?
Date: 2002-07-17 10:17:40
Message-ID: 01b901c22d7b$2e9e9f90$cab990c1@atc.unican.es
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

I have a query and estimations and results don´t look similar, here is explain analyze:

NOTICE: QUERY PLAN:

Sort (cost=12443.90..12443.90 rows=1 width=93) (actual time=505331.94..505332.67 rows=175 loops=1)
-> Aggregate (cost=12443.88..12443.89 rows=1 width=93) (actual time=472520.29..505326.48 rows=175 loops=1)
-> Group (cost=12443.88..12443.89 rows=1 width=93) (actual time=472307.31..485173.92 rows=325302 loops=1)
-> Sort (cost=12443.88..12443.88 rows=1 width=93) (actual time=472307.24..473769.79 rows=325302 loops=1)
-> Nested Loop (cost=12439.25..12443.87 rows=1 width=93) (actual time=103787.68..441614.43 rows=325302 loops=1)
-> Hash Join (cost=12439.25..12440.64 rows=1 width=85) (actual time=103733.76..120916.86 rows=325302 loops=1)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=15) (actual time=7.81..8.72 rows=25 loops=1)
-> Hash (cost=12439.25..12439.25 rows=1 width=70) (actual time=103722.25..103722.25 rows=0 loops=1)
-> Nested Loop (cost=0.00..12439.25 rows=1 width=70) (actual time=95.43..100162.91 rows=325302 loops=1)
-> Nested Loop (cost=0.00..12436.23 rows=1 width=62) (actual time=84.91..47502.93 rows=325302 loops=1)
-> Nested Loop (cost=0.00..12412.93 rows=4 width=24) (actual time=66.86..8806.01 rows=43424 loops=1)
-> Seq Scan on part (cost=0.00..12399.00 rows=1 width=4) (actual time=24.88..4076.81 rows=10856 loops=1)
-> Index Scan using partsupp_pkey on partsupp (cost=0.00..13.89 rows=4 width=20) (actual time=0.20..0.34 rows=4 loops=10856)
-> Index Scan using l_partsupp_index on lineitem (cost=0.00..6.02 rows=1 width=38) (actual time=0.20..0.61 rows=7 loops=43424)
-> Index Scan using supplier_pkey on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.08..0.10 rows=1 loops=325302)
-> Index Scan using orders_pkey on orders (cost=0.00..3.22 rows=1 width=8) (actual time=0.85..0.87 rows=1 loops=325302)
Total runtime: 505563.85 msec

estimated 12000msec

here is the query:
SELECT
nation,
o_year,
CAST((sum(amount))AS NUMERIC(10,2))AS sum_profit
FROM(
SELECT
nation.name AS nation,
EXTRACT(year FROM orders.orderdate) AS o_year,
lineitem.extendedprice*(1-lineitem.discount)-partsupp.supplycost*lineitem.quantity AS amount
FROM
part,
supplier,
lineitem,
partsupp,
orders,
nation
WHERE
supplier.suppkey=lineitem.suppkey
AND partsupp.suppkey=lineitem.suppkey
AND partsupp.partkey=lineitem.partkey
AND part.partkey=lineitem.partkey
AND orders.orderkey=lineitem.orderkey
AND supplier.nationkey=nation.nationkey
AND part.name LIKE '%green%'
) AS profit
GROUP BY
nation,
o_year
ORDER BY
nation,
o_year DESC;

lineitem is about 6M rows
partsupp 800K rows
part 200K rows

any advice?
Thanks and regards

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rod Taylor 2002-07-17 11:41:19 Re: ELOGs doubled up
Previous Message Hannu Krosing 2002-07-17 10:15:03 Re: DROP COLUMN

Browse pgsql-sql by date

  From Date Subject
Next Message Bruno Wolff III 2002-07-17 12:06:46 Re: Indexing UNIONs
Previous Message Achilleus Mantzios 2002-07-17 10:15:07 Re: How to find out if an index is unique?