Re: [HACKERS] please help on query

From: "Luis Alberto Amigo Navarro" <lamigo(at)atc(dot)unican(dot)es>
To: "Manfred Koizar" <mkoi-pg(at)aon(dot)at>
Cc: <pgsql-sql(at)postgresql(dot)org>
Subject: Re: [HACKERS] please help on query
Date: 2002-07-12 11:37:47
Message-ID: 012101c22998$8bbe5670$cab990c1@atc.unican.es
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

> The cost is now only 1141741215.35 compared to 2777810917708.17
> before; this is an improvement factor of more than 2000. So what's
> your problem? ;-)
>
> Servus
> Manfred
>
In fact planner is estimating incredibly badly, it took only 833msecs now
runs perfectly

I'm going to keep on asking about another query:

SELECT
customer.name,
customer.custkey,
orders.orderkey,
orders.orderdate,
orders.totalprice,
sum(lineitem.quantity)
FROM
customer,
orders,
lineitem
WHERE
exists(
SELECT
lineitem.orderkey
FROM
lineitem
WHERE
lineitem.orderkey=orders.orderkey
GROUP BY
lineitem.orderkey HAVING
sum(lineitem.quantity)>300
)
AND customer.custkey=orders.custkey
AND orders.orderkey=lineitem.orderkey
GROUP BY
customer.name,
customer.custkey,
orders.orderkey,
orders.orderdate,
orders.totalprice

ORDER BY
orders.totalprice DESC,
orders.orderdate;

NOTICE: QUERY PLAN:

Sort (cost=26923941.97..26923941.97 rows=300061 width=66)
-> Aggregate (cost=26851634.86..26896644.05 rows=300061 width=66)
-> Group (cost=26851634.86..26889142.52 rows=3000612 width=66)
-> Sort (cost=26851634.86..26851634.86 rows=3000612
width=66)
-> Hash Join (cost=26107574.81..26457309.10
rows=3000612 width=66)
-> Seq Scan on lineitem (cost=0.00..222208.25
rows=6001225 width=8)
-> Hash (cost=26105699.81..26105699.81
rows=750000 width=58)
-> Hash Join (cost=7431.00..26105699.81
rows=750000 width=58)
-> Seq Scan on orders
(cost=0.00..26083268.81 rows=750000 width=25)
SubPlan
-> Aggregate
(cost=0.00..17.35 rows=1 width=8)
-> Group
(cost=0.00..17.34 rows=5 width=8)
-> Index Scan
using lineitem_pkey on lineitem (cost=0.00..17.33 rows=5 width=8)
-> Hash (cost=7056.00..7056.00
rows=150000 width=33)
-> Seq Scan on customer
(cost=0.00..7056.00 rows=150000 width=33)

again:
orders 1500000 tuples
lineitem 6000000 tuples there are 1 to 7 lineitems per orderkey
Customer 150000 tuples

select attname,n_distinct,correlation from pg_stats where
tablename='lineitem';
attname | n_distinct | correlation
---------------+------------+-------------
orderkey | -0.199847 | 1
partkey | 196448 | 0.0223377
suppkey | 9658 | -0.00822751
linenumber | 7 | 0.17274
quantity | 50 | 0.0150153
extendedprice | 25651 | -0.00790245
discount | 11 | 0.103761
tax | 9 | 0.0993771
returnflag | 3 | 0.391434
linestatus | 2 | 0.509791
shipdate | 2440 | 0.0072777
commitdate | 2497 | 0.00698162
receiptdate | 2416 | 0.00726686
shipinstruct | 4 | 0.241511
shipmode | 7 | 0.138432
comment | 275488 | 0.0188006
(16 rows)

select attname,n_distinct,correlation from pg_stats where
tablename='orders';
attname | n_distinct | correlation
---------------+------------+-------------
orderkey | -1 | -0.999925
custkey | 76309 | 0.00590596
orderstatus | 3 | 0.451991
totalprice | -1 | -0.00768806
orderdate | 2431 | -0.0211354
orderpriority | 5 | 0.182489
clerk | 1009 | 0.00546939
shippriority | 1 | 1
comment | -0.750125 | -0.0123887

Customer
attname | n_distinct | correlation
------------+------------+-------------
custkey | -1 | 1
name | -1 | 1
address | -1 | -0.00510274
nationkey | 25 | 0.0170533
phone | -1 | -0.0227816
acctbal | -0.83444 | -0.00220958
mktsegment | 5 | 0.205013
comment | -1 | 0.0327827

This query takes 12 minutes to run and returns about 50 customers.
lineitem.quantity takes values from 1 to 50, so 300 per orderkey is very
restrictive

May someone help on improving performance?
Again thanks in advance
Regards

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jakub Ouhrabka 2002-07-12 11:50:29 Re: [HACKERS] please help on query
Previous Message eutm 2002-07-12 10:31:18 Bug of PL/pgSQL parser

Browse pgsql-sql by date

  From Date Subject
Next Message Jakub Ouhrabka 2002-07-12 11:50:29 Re: [HACKERS] please help on query
Previous Message frederik nietzsche 2002-07-12 10:29:58 Re: how to inherits the references...