Skip site navigation (1) Skip section navigation (2)

Re: The nested view from hell - Restricting a subquerry

From: Nis Jørgensen <nis(at)superlativ(dot)dk>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: The nested view from hell - Restricting a subquerry
Date: 2007-07-22 09:06:45
Message-ID: f7v6na$2gc$1@sea.gmane.org (view raw or flat)
Thread:
Lists: pgsql-sql
Bryce Nesbitt skrev:
> I've got a legacy app with a hefty performance problem.  The basic
> problem is stupid design. It takes 10-15 seconds of CPU time to look up
> an invoice.
> Basically it's trying to mash up extra columns on an otherwise simple
> query, and those extra columns are subtotals.  Simplified (this looks
> best in a fixed width font):
> 
>     SELECT  max(order_view.order_id),max(order_view.invoice_id)
> ,sum(order_view.mileage)
>     FROM    (SELECT order_id,invoice_id, 0 as miles FROM eg_order
>              UNION   
>              SELECT order_id,0         ,      miles FROM eg_order_line)
>              order_view GROUP BY order_view.order_id;
> 
> A select by order_id is fast.  The problem is the application uses
> "select * from view where invoice_id=x", and the second part of the
> UNION returns all possible rows in the database.  These get filtered out
> later, but at considerable performance hit.

Just for the record, I believe your simplified example should look like
this (changed "max(order_id)" to "order_id" in outer select , changed
"miles" to "mileage"):

    SELECT  order_id,
	max(order_view.invoice_id),
	sum(order_view.mileage)
    FROM    (SELECT order_id,invoice_id, 0 as mileage FROM eg_order
             UNION
             SELECT order_id, 0, mileage FROM eg_order_line)
             order_view GROUP BY order_view.order_id;


It is pretty clear that the problem comes from joining on the result of
an aggregate. PG apparently is not smart enough to recognize that the
result of a max must be one of the values of the column (meaning that it
can use an index)

It is not clear whether there is a FK relation between eg_order and
eg_order_line and what the PK of eg_order is. If there is a FK, you can
do something along the lines of

SELECT  order_id,
	invoice_id
	COALESCE(sum(mileage),0) as mileage
    FROM    eg_order LEFT JOIN eg_order_line USING order_id
    GROUP BY order_id, invoice_id


If there can be more than one invoice_id per order_id, you might need to add

HAVING invoice_id = (SELECT max(invoice_id) FROM eg_order eo2 WHERE
eg_order.order_id = eo2.order_id)

or similar.

Hope this helps,

Nis


In response to

Responses

pgsql-sql by date

Next:From: Tom LaneDate: 2007-07-22 18:05:07
Subject: Re: The nested view from hell - Restricting a subquerry
Previous:From: Bryce NesbittDate: 2007-07-22 03:10:55
Subject: The nested view from hell - Restricting a subquerry

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group