Re: [HACKERS] please help on query

From: "Luis Alberto Amigo Navarro" <lamigo(at)atc(dot)unican(dot)es>
To: "Jakub Ouhrabka" <jakub(dot)ouhrabka(at)comgate(dot)cz>
Cc: <pgsql-sql(at)postgresql(dot)org>
Subject: Re: [HACKERS] please help on query
Date: 2002-07-12 15:32:50
Message-ID: 019201c229b9$61acfa50$cab990c1@atc.unican.es
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

Lineitem is being modified on run time, so creating a temp table don't
solves my problem
The time of creating this table is the same of performing the subselect (or
so I think), it could be done creating a new table, and a new trigger, but
there are already triggers to calculate
lineitem.extendedprice=part.retailprice*lineitem.quantity*(1+taxes)*(1-disco
unt) and to calculate orderstatus in order with linestatus and to calculate
orders.totalprice as sum(extendedprice) where
lineitem.orderkey=new.orderkey. A new trigger in order to insert orderkey if
sum(quantity) where orderkey=new.orderkey might be excessive.
Any other idea?
Thanks And Regards

----- Original Message -----
From: "Jakub Ouhrabka" <jakub(dot)ouhrabka(at)comgate(dot)cz>
To: "Luis Alberto Amigo Navarro" <lamigo(at)atc(dot)unican(dot)es>
Cc: "Manfred Koizar" <mkoi-pg(at)aon(dot)at>; <pgsql-sql(at)postgresql(dot)org>
Sent: Friday, July 12, 2002 1:50 PM
Subject: Re: [SQL] [HACKERS] please help on query

> hi,
>
> avoid subselect: create a temp table and use join...
>
> CREATE TEMP TABLE tmp AS
> SELECT
> lineitem.orderkey
> FROM
> lineitem
> WHERE
> lineitem.orderkey=orders.orderkey
> GROUP BY
> lineitem.orderkey HAVING
> sum(lineitem.quantity)>300;
>
> CREATE INDEX tmp_idx ON tmp (orderkey);
>
> SELECT
> customer.name,
> customer.custkey,
> orders.orderkey,
> orders.orderdate,
> orders.totalprice,
> sum(lineitem.quantity)
> FROM
> customer,
> orders,
> lineitem,
> tmp
> WHERE
> orders.orderkey=tmp.orderkey
> AND customer.custkey=orders.custkey
> AND orders.orderkey=lineitem.orderkey
> GROUP BY
> customer.name,
> customer.custkey,
> orders.orderkey,
> orders.orderdate,
> orders.totalprice
> ORDER BY
> orders.totalprice DESC,
> orders.orderdate;
>
>
> may be the index is not necessary...
>
> kuba
>
>
> > I'm going to keep on asking about another query:
> >
> > SELECT
> > customer.name,
> > customer.custkey,
> > orders.orderkey,
> > orders.orderdate,
> > orders.totalprice,
> > sum(lineitem.quantity)
> > FROM
> > customer,
> > orders,
> > lineitem
> > WHERE
> > exists(
> > SELECT
> > lineitem.orderkey
> > FROM
> > lineitem
> > WHERE
> > lineitem.orderkey=orders.orderkey
> > GROUP BY
> > lineitem.orderkey HAVING
> > sum(lineitem.quantity)>300
> > )
> > AND customer.custkey=orders.custkey
> > AND orders.orderkey=lineitem.orderkey
> > GROUP BY
> > customer.name,
> > customer.custkey,
> > orders.orderkey,
> > orders.orderdate,
> > orders.totalprice
> >
> > ORDER BY
> > orders.totalprice DESC,
> > orders.orderdate;
> >
> > NOTICE: QUERY PLAN:
> >
> > Sort (cost=26923941.97..26923941.97 rows=300061 width=66)
> > -> Aggregate (cost=26851634.86..26896644.05 rows=300061 width=66)
> > -> Group (cost=26851634.86..26889142.52 rows=3000612 width=66)
> > -> Sort (cost=26851634.86..26851634.86 rows=3000612
> > width=66)
> > -> Hash Join (cost=26107574.81..26457309.10
> > rows=3000612 width=66)
> > -> Seq Scan on lineitem
(cost=0.00..222208.25
> > rows=6001225 width=8)
> > -> Hash (cost=26105699.81..26105699.81
> > rows=750000 width=58)
> > -> Hash Join
(cost=7431.00..26105699.81
> > rows=750000 width=58)
> > -> Seq Scan on orders
> > (cost=0.00..26083268.81 rows=750000 width=25)
> > SubPlan
> > -> Aggregate
> > (cost=0.00..17.35 rows=1 width=8)
> > -> Group
> > (cost=0.00..17.34 rows=5 width=8)
> > -> Index Scan
> > using lineitem_pkey on lineitem (cost=0.00..17.33 rows=5 width=8)
> > -> Hash (cost=7056.00..7056.00
> > rows=150000 width=33)
> > -> Seq Scan on customer
> > (cost=0.00..7056.00 rows=150000 width=33)
> >
> > again:
> > orders 1500000 tuples
> > lineitem 6000000 tuples there are 1 to 7 lineitems per orderkey
> > Customer 150000 tuples
> >
> > select attname,n_distinct,correlation from pg_stats where
> > tablename='lineitem';
> > attname | n_distinct | correlation
> > ---------------+------------+-------------
> > orderkey | -0.199847 | 1
> > partkey | 196448 | 0.0223377
> > suppkey | 9658 | -0.00822751
> > linenumber | 7 | 0.17274
> > quantity | 50 | 0.0150153
> > extendedprice | 25651 | -0.00790245
> > discount | 11 | 0.103761
> > tax | 9 | 0.0993771
> > returnflag | 3 | 0.391434
> > linestatus | 2 | 0.509791
> > shipdate | 2440 | 0.0072777
> > commitdate | 2497 | 0.00698162
> > receiptdate | 2416 | 0.00726686
> > shipinstruct | 4 | 0.241511
> > shipmode | 7 | 0.138432
> > comment | 275488 | 0.0188006
> > (16 rows)
> >
> > select attname,n_distinct,correlation from pg_stats where
> > tablename='orders';
> > attname | n_distinct | correlation
> > ---------------+------------+-------------
> > orderkey | -1 | -0.999925
> > custkey | 76309 | 0.00590596
> > orderstatus | 3 | 0.451991
> > totalprice | -1 | -0.00768806
> > orderdate | 2431 | -0.0211354
> > orderpriority | 5 | 0.182489
> > clerk | 1009 | 0.00546939
> > shippriority | 1 | 1
> > comment | -0.750125 | -0.0123887
> >
> > Customer
> > attname | n_distinct | correlation
> > ------------+------------+-------------
> > custkey | -1 | 1
> > name | -1 | 1
> > address | -1 | -0.00510274
> > nationkey | 25 | 0.0170533
> > phone | -1 | -0.0227816
> > acctbal | -0.83444 | -0.00220958
> > mktsegment | 5 | 0.205013
> > comment | -1 | 0.0327827
> >
> > This query takes 12 minutes to run and returns about 50 customers.
> > lineitem.quantity takes values from 1 to 50, so 300 per orderkey is very
> > restrictive
> >
> > May someone help on improving performance?
> > Again thanks in advance
> > Regards
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/users-lounge/docs/faq.html
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Lockhart 2002-07-12 15:42:44 CHAR constants
Previous Message Andrew Sullivan 2002-07-12 15:28:20 Re: string cast/compare broken?

Browse pgsql-sql by date

  From Date Subject
Next Message Josh Berkus 2002-07-12 15:56:33 Re: config postgresql.conf??
Previous Message Jie Liang 2002-07-12 15:32:41 config postgresql.conf??