Re: The nested view from hell - Restricting a subquerry

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Nis Jørgensen <nis(at)superlativ(dot)dk>
Cc: <pgsql-sql(at)postgresql(dot)org>
Subject: Re: The nested view from hell - Restricting a subquerry
Date: 2007-07-23 11:23:03
Message-ID: 87d4yjmkiw.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Nis Jørgensen <nis(at)superlativ(dot)dk> writes:

>>> Well, the query can be satisfied by looking only at the rows with an
>>> order_id matching the invoice_id given. The condition that this is the
>>> largest invoice in the group then needs to be checked afterwards.
>>>
>>> I certainly did not expect the query planner to be able to deduce this,
>>> though.
>>
>> No, that's not true. If you had two records in eg_order with the same order_id
>> but different invoice_ids then the query would need both records to satisfy
>> the query.
>
> I assume you mean "... then both records are necessary in order to
> calculate the results of the query". This does not contradict what I wrote.

Sorry I meant, "the query as written can not be satisfied by looking only at
the rows with the specified invoice_id".

> SELECT order_id,
> max(order_view.invoice_id),
> sum(order_view.mileage)
> FROM (SELECT order_id,invoice_id, 0 as mileage FROM eg_order
> UNION
> SELECT order_id, 0, mileage FROM eg_order_line)
> order_view GROUP BY order_view.order_id;
>
> This is then restricted on max(invoice_id)
>
> As far as I can tell, these steps produce the correct results (without
> the later information about primary keys provided by Bryce)
>
> INPUT: my_invoice_id
>
> 1. Look up all order_ids for which (order_id,my_invoice_id) appear in
> eg_orders
>
> 2. Find all rows (in both branches of the UNION) with these id_s
>
> 3. Group the rows, and calculate max(invoice_id)
>
> 4. Filter the result rows on max(invoice_id) = my_invoice_id.

So here's a hypothetical data set for which this algorithm fails:

order_id invoice_id mileage
--------------------------------------------
1 1 100
1 2 100

Your algorithm would produce

order_id max(invoice_id) sum(mileage)
--------------------------------------------
1 1 100

Whereas the correct output would be to output no records at all.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Nis Jørgensen 2007-07-23 11:57:11 Re: The nested view from hell - Restricting a subquerry
Previous Message Jyoti Seth 2007-07-23 09:25:21 Re: Database Synchronization