The nested view from hell - Restricting a subquerry

From: Bryce Nesbitt <bryce1(at)obviously(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: The nested view from hell - Restricting a subquerry
Date: 2007-07-22 03:10:55
Message-ID: 46A2CABF.3080108@obviously.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

I've got a legacy app with a hefty performance problem. The basic
problem is stupid design. It takes 10-15 seconds of CPU time to look up
an invoice.
Basically it's trying to mash up extra columns on an otherwise simple
query, and those extra columns are subtotals. Simplified (this looks
best in a fixed width font):

SELECT max(order_view.order_id),max(order_view.invoice_id)
,sum(order_view.mileage)
FROM (SELECT order_id,invoice_id, 0 as miles FROM eg_order
UNION
SELECT order_id,0 , miles FROM eg_order_line)
order_view GROUP BY order_view.order_id;

A select by order_id is fast. The problem is the application uses
"select * from view where invoice_id=x", and the second part of the
UNION returns all possible rows in the database. These get filtered out
later, but at considerable performance hit.

Is there a way to get the "where invoice_id=x" into the subquery?
"select distinct order_id from eg_order where invoice_id=x" would do it.
I can't redesign the view, because it all goes into an object relational
mapper that thinks it's a real table.

-Bryce Nesbitt

stage=# \d eg_invoice_summary_view
View "public.eg_invoice_summary_view"
Column | Type | Modifiers
----------------+------------------------+-----------
invoice_id | integer |
cso_id | integer |
period_id | integer |
account_id | integer |
invoice_number | character varying(192) |
invoice_date | date |
amount | numeric |
tax | bigint |
invoice_style | integer |
plan_name | character varying(128) |
View definition:
SELECT i.invoice_id, i.cso_id, i.period_id, i.account_id,
i.invoice_number, i.invoice_date, i.amount, sum(tax.tax_amount) AS tax,
i.invoice_style, i.plan_name
FROM ( SELECT i.invoice_id, i.cso_id, i.period_id, i.account_id,
i.invoice_number, i.invoice_date, sum(o.amount) AS amount,
i.invoice_style, i.plan_name
FROM eg_invoice i
LEFT JOIN *eg_order_summary_view* o ON i.invoice_id = o.invoice_id
GROUP BY i.invoice_id, i.cso_id, i.period_id, i.account_id,
i.invoice_number, i.invoice_date, i.invoice_style, i.plan_name) i
LEFT JOIN eg_invoice_tax tax ON i.invoice_id = tax.invoice_id
GROUP BY i.invoice_id, i.cso_id, i.period_id, i.account_id,
i.invoice_number, i.invoice_date, i.amount, i.invoice_style, i.plan_name;

stage=# \d eg_order_summary_view
View "public.eg_order_summary_view"
Column | Type | Modifiers
------------+--------------------------+-----------
order_id | integer |
d | "unknown" |
cso_id | integer |
invoice_id | integer |
period_id | integer |
ref_id | integer |
order_type | integer |
desc1 | text |
desc2 | text |
desc3 | text |
desc4 | text |
desc5 | text |
desc6 | text |
desc7 | text |
desc8 | text |
order_from | timestamp with time zone |
order_to | timestamp with time zone |
hours | double precision |
mileage | double precision |
amount | bigint |
View definition:
SELECT *order_view.order_id*, 'D' AS d, max(order_view.cso_id) AS
cso_id, *max(order_view.invoice_id) AS invoice_id*,
max(order_view.period_id) AS period_id, max(order_view.ref_id) AS
ref_id, max(order_view.order_type) AS order_type,
max(order_view.desc1::text) AS desc1, max(order_view.desc2::text) AS
desc2, max(order_view.desc3::text) AS desc3, max(order_view.desc4::text)
AS desc4, max(order_view.desc5::text) AS desc5,
max(order_view.desc6::text) AS desc6, max(order_view.desc7::text) AS
desc7, max(order_view.desc8::text) AS desc8, max(order_view.order_from)
AS order_from, max(order_view.order_to) AS order_to,
sum(order_view.hours) AS hours, sum(order_view.mileage) AS mileage,
sum(order_view.amount) AS amount
FROM ( SELECT eg_order.order_id, eg_order.cso_id,
e*g_order.invoice_id*, eg_order.period_id, eg_order.ref_id,
eg_order.order_type, eg_order.desc1, eg_order.desc2, eg_order.desc3,
eg_order.desc4, eg_order.desc5, eg_order.desc6, eg_order.desc7,
eg_order.desc8, eg_order.order_from, eg_order.order_to, 0 AS hours, 0 AS
mileage, 0 AS amount
FROM eg_order
UNION
( SELECT *eg_order_line.order_id*, 0, *0*, 0, 0, 0,
NULL::"unknown" AS "unknown", NULL::"unknown" AS "unknown",
NULL::"unknown" AS "unknown", NULL::"unknown" AS "unknown",
NULL::"unknown" AS "unknown", NULL::"unknown" AS "unknown",
NULL::"unknown" AS "unknown", NULL::"unknown" AS "unknown",
to_timestamp('1970-01-01'::text, 'YYYY-MM-DD'::text) AS to_timestamp,
to_timestamp('1970-01-01'::text, 'YYYY-MM-DD'::text) AS to_timestamp, 0
AS hours, eg_order_line.quantity AS mileage, eg_order_line.amt_value
FROM eg_order_line
WHERE eg_order_line.order_line_type = 20
UNION
SELECT *eg_order_line.order_id*, 0, *0,* 0, 0, 0,
NULL::"unknown" AS "unknown", NULL::"unknown" AS "unknown",
NULL::"unknown" AS "unknown", NULL::"unknown" AS "unknown",
NULL::"unknown" AS "unknown", NULL::"unknown" AS "unknown",
NULL::"unknown" AS "unknown", NULL::"unknown" AS "unknown",
to_timestamp('1970-01-01'::text, 'YYYY-MM-DD'::text) AS to_timestamp,
to_timestamp('1970-01-01'::text, 'YYYY-MM-DD'::text) AS to_timestamp,
eg_order_line.quantity AS hours, 0 AS mileage, eg_order_line.amt_value
FROM eg_order_line
WHERE eg_order_line.order_line_type <> 20)) order_view
*GROUP BY order_view.order_id*;

stage=# explain select * from eg_invoice_summary_view where invoice_id=5;

QUERY
PLAN


---------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
-------------------
GroupAggregate *(cost=485551.60..485551.80 rows=1 width=736)*
-> Sort (cost=485551.60..485551.61 rows=7 width=736)
Sort Key: i.invoice_id, i.cso_id, i.period_id, i.account_id,
i.invoice_number, i.invoice_date, i.amoun
t, i.invoice_style, i.plan_name
-> Nested Loop Left Join (cost=485541.78..485551.50 rows=7
width=736)
-> HashAggregate (cost=485540.75..485540.77 rows=1
width=56)
-> Nested Loop Left Join
(cost=417895.29..485536.25 rows=200 width=56)
-> Index Scan using eg_invoice_pkey on
eg_invoice i (cost=0.00..3.01 rows=1 width=
48)
Index Cond: (invoice_id = 5)
-> GroupAggregate
(cost=417895.29..485529.24 rows=200 width=316)
Filter: (max("?column3?") = 5)
-> Unique (cost=417895.29..448632.54
rows=614745 width=200)
-> Sort
(cost=417895.29..419432.15 rows=614745 width=200)
Sort Key: order_id, cso_id,
invoice_id, period_id, ref_id, order_t
ype, desc1, desc2, desc3, desc4, desc5, desc6, desc7, desc8, order_from,
order_to, hours, mileage, amount
-> Append
(cost=0.00..106638.70 rows=614745 width=200)
-> Subquery Scan
"*SELECT* 1" (cost=0.00..9621.64 rows=233
432 width=200)
-> Seq Scan on
eg_order (cost=0.00..7287.32 rows=233
432 width=200)
-> Result
(cost=77951.41..97017.06 rows=381313 width=16)
-> Unique
(cost=77951.41..97017.06 rows=381313 width
=16)
-> Sort
(cost=77951.41..78904.69 rows=381313 w
idth=16)

Sort Key: order_id, cso_id, invoice_id, pe
riod_id, ref_id, order_type, desc1, desc2, desc3, desc4, desc5, desc6,
desc7, desc8, order_from, order_to, hour
s, mileage, amount
->
Append (cost=0.00..25112.52 rows=3813
13 width=16)

-> Subquery Scan "*SELECT* 2" (cos
t=0.00..11887.06 rows=146043 width=16)

-> Seq Scan on eg_order_line
(cost=0.00..10426.63 rows=146043 width=16)

Filter: (order_line_type
= 20)

-> Subquery Scan "*SELECT* 3" (cos
t=0.00..13225.46 rows=235270 width=16)

-> Seq Scan on eg_order_line
(cost=0.00..10872.76 rows=235270 width=16)

Filter: (order_line_type
<> 20)
-> Bitmap Heap Scan on eg_invoice_tax tax
(cost=1.03..10.65 rows=7 width=8)
Recheck Cond: (invoice_id = 5)
-> Bitmap Index Scan on ix2f10773c8edf278d
(cost=0.00..1.03 rows=7 width=0)
Index Cond: (invoice_id = 5)
(31 rows)

--
----
Visit http://www.obviously.com/

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Nis Jørgensen 2007-07-22 09:06:45 Re: The nested view from hell - Restricting a subquerry
Previous Message ljb 2007-07-20 01:41:05 Re: UNION and LIMIT issue