Re: The nested view from hell - Restricting a subquerry

From: Nis Jørgensen <nis(at)superlativ(dot)dk>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: The nested view from hell - Restricting a subquerry
Date: 2007-07-23 05:36:28
Message-ID: f81ep0$emn$1@sea.gmane.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Gregory Stark skrev:
> Nis Jørgensen <nis(at)superlativ(dot)dk> writes:
>
>> Well, the query can be satisfied by looking only at the rows with an
>> order_id matching the invoice_id given. The condition that this is the
>> largest invoice in the group then needs to be checked afterwards.
>>
>> I certainly did not expect the query planner to be able to deduce this,
>> though.
>
> No, that's not true. If you had two records in eg_order with the same order_id
> but different invoice_ids then the query would need both records to satisfy
> the query.

I assume you mean "... then both records are necessary in order to
calculate the results of the query". This does not contradict what I wrote.

If you mean "... then both records need to satisfy <some criteria>" then
I don't understand which criteria you are talking about. The query in
question was:

SELECT order_id,
max(order_view.invoice_id),
sum(order_view.mileage)
FROM (SELECT order_id,invoice_id, 0 as mileage FROM eg_order
UNION
SELECT order_id, 0, mileage FROM eg_order_line)
order_view GROUP BY order_view.order_id;

This is then restricted on max(invoice_id)

As far as I can tell, these steps produce the correct results (without
the later information about primary keys provided by Bryce)

INPUT: my_invoice_id

1. Look up all order_ids for which (order_id,my_invoice_id) appear in
eg_orders

2. Find all rows (in both branches of the UNION) with these id_s

3. Group the rows, and calculate max(invoice_id)

4. Filter the result rows on max(invoice_id) = my_invoice_id.

> The query planner can't deduce that this can't happen because it simply does
> not have that information.

Well, I realize that the existing query planner can't. Since I could
arrive at the plan above by deduction so could a hypothetical different
query planner. Whether it is faster is of course unknown - my guess is
that it would be in this case.

> The more I look at this view the more I think it's just seriously broken.
> Why is it grouping by order_id at all if, I suspect, there will only be one
> record per order_id in eg_orders??

Bryce has confirmed this. The above is only of academic interest.

Yours,
Nis Jorgensen

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Aarni Ruuhimäki 2007-07-23 07:42:36 Re: how to download linux 7.3 image
Previous Message Gregory Stark 2007-07-23 04:21:48 Re: The nested view from hell - Restricting a subquerry