Re: PG 7.2b4 bug?

From: Don Baccus <dhogaza(at)pacifier(dot)com>
To:
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PG 7.2b4 bug?
Date: 2001-12-19 20:59:04
Message-ID: 3C20FF98.8080005@pacifier.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Some more bug-or-not-bug thoughts ...

I thought I'd add a quote from Date that furthers my belief that the
subselect example I posted does indeed expose a bug:

(T1 is the table conceptually created by the various joins, etc)

"[if] the select-item takes the form "scalar-expression [[AS] column]"

...

For each row in T1 the scalar-expression is evaluated .."

(page 151 Date & Darwin)

SQL92 didn't support subselects in the select-item-list. SQL3 extends
the expression to include one-row selects that return a single scalar
value. It does NOT however add any wording that allows the subselect to
be yanked and evaluated once rather than evaluated for each row. The
standard uses the word "applied" not "evaluated". I interpret this to
mean "evaluated" and it appears that Date does, too.

On the other hand the view example is giving the proper result in PG
7.2, though only by luck, as Tom pointed out earler. For (given the
view "create view foo as select nextval('foo_sequence') as nextval;")

select foo.nextval
from multiple_rows;

isn't actually legal SQL. It must be stated as:

select foo.nextval
from foo, multiple_rows;

(all PG does is add "foo" to the from clause for me if I leave it out).

The semantics of this are obvious when you think about it - materialize
"foo" then cross-join the resulting table with multiple_rows. Since
"foo" returns a single row computed by "nextval('foo_sequence')"
obviously the result seen with PG 7.2 is correct. Date is quite clear
on the semantics of this and it makes tons of sense since views are
meant to be treated like tables.

So:

1. If an explicit scalar subselect appears in the target list, it should
be executed for every row in the result set.

2. A view referenced in the target list is actually supposed to be
materialized in the FROM clause (even if implictly added to it for
you) then joined to the other tables in the query, if any. Meaning
it should always be executed once and only once. The standard
doesn't have PG-style rules, of course, but such tables are also
should be in the FROM clause, evaluated and joined afterwards
IMO.

At least that's my reading and I've spent quite a bit of time on this now.

Unfortunately PG currently doesn't use the form of the query to decide
whether or not to execute the subselect or view once or for each row,
but rather does so depending on the estimated cost of each approach.

That's the real bug it seems. The form of the query, not the whim of
the optimizer, is the determinant.

Neither of these cases is likely to arise frequently in practice, so if
I ruled Middle Earth I'd decree that:

1. It be filed as a bug

2. It not be assigned a high priority.

However it's not merely of academic interest. The semantics of the view
example is such that you should be able to force single-evaluation of a
function by simply wrapping it in a view, regardless of whether or not
it has side-effects.

Meanwhile I get to go off and inspect the roughly 750 queries that use
this particular style view and determine which ones incorrectly assume
that the view's evaluated more than once per query! :)

--
Don Baccus
Portland, OR
http://donb.photo.net, http://birdnotes.net, http://openacs.org

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Vince Vielhaber 2001-12-19 21:17:14 Re: Explicit config patch 7.2B4
Previous Message Andrew G. Hammond 2001-12-19 20:41:38 Re: Explicit config patch 7.2B4