Re: Confusing EXPLAIN output in case of inherited tables

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Confusing EXPLAIN output in case of inherited tables
Date: 2012-01-28 02:38:17
Message-ID: 29791.1327718297@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> It's a feature, not a bug, that we schema-qualify names when VERBOSE
> is specified. That was done on purpose for the benefit of external
> tools that might need this information to disambiguate which object is
> being referenced.

> Table *aliases*, of course, should not be schema-qualified, but I
> don't think that's what we're doing. You could make it more clear by
> including an alias in the query, like this:

> explain verbose select * into table ramp from road hwy where name ~ '.*Ramp';

I think you are both focusing on the wrong thing. There is a lot of
squishiness in what EXPLAIN prints out, since SQL notation is not always
well suited to what an execution plan actually does. But this code has
a hard and fast requirement that it dump view definitions correctly,
else pg_dump doesn't work. And after looking at this I think Ashutosh
has in fact found a bug. Consider this example:

regression=# create schema s1;
CREATE SCHEMA
regression=# create schema s2;
CREATE SCHEMA
regression=# create table s1.t1 (f1 int);
CREATE TABLE
regression=# create table s2.t1 (f1 int);
CREATE TABLE
regression=# create view v1 as
regression-# select * from s1.t1 where exists (
regression(# select 1 from s2.t1 where s2.t1.f1 = s1.t1.f1
regression(# );
CREATE VIEW
regression=# \d+ v1
View "public.v1"
Column | Type | Modifiers | Storage | Description
--------+---------+-----------+---------+-------------
f1 | integer | | plain |
View definition:
SELECT t1.f1
FROM s1.t1
WHERE (EXISTS ( SELECT 1
FROM s2.t1
WHERE t1.f1 = s1.t1.f1));

regression=# alter table s2.t1 rename to tx;
ALTER TABLE
regression=# \d+ v1
View "public.v1"
Column | Type | Modifiers | Storage | Description
--------+---------+-----------+---------+-------------
f1 | integer | | plain |
View definition:
SELECT t1.f1
FROM s1.t1
WHERE (EXISTS ( SELECT 1
FROM s2.tx t1
WHERE t1.f1 = s1.t1.f1));

Both of the above displays of the view are formally correct, in that the
variables will be taken to refer to the correct upper or lower RTE.
But let's change that back and rename the other table:

regression=# alter table s2.tx rename to t1;
ALTER TABLE
regression=# alter table s1.t1 rename to tx;
ALTER TABLE
regression=# \d+ v1
View "public.v1"
Column | Type | Modifiers | Storage | Description
--------+---------+-----------+---------+-------------
f1 | integer | | plain |
View definition:
SELECT t1.f1
FROM s1.tx t1
WHERE (EXISTS ( SELECT 1
FROM s2.t1
WHERE t1.f1 = s1.t1.f1));

This is just plain wrong, as you'll see if you try to execute that
query:

regression=# SELECT t1.f1
regression-# FROM s1.tx t1
regression-# WHERE (EXISTS ( SELECT 1
regression(# FROM s2.t1
regression(# WHERE t1.f1 = s1.t1.f1));
ERROR: invalid reference to FROM-clause entry for table "t1"
LINE 5: WHERE t1.f1 = s1.t1.f1));
^
HINT: There is an entry for table "t1", but it cannot be referenced
from this part of the query.

(The HINT is a bit confused here, but the query is certainly invalid.)

So what we have here is a potential failure to dump and reload view
definitions, which is a lot more critical in my book than whether
EXPLAIN's output is confusing.

If we stick with the existing rule for attaching a fake alias to renamed
RTEs, I think that Ashutosh's patch or something like it is probably
appropriate, because the variable-printing code ought to be in step with
the RTE-printing code. Unfortunately, I think the hack to attach a fake
alias to renamed RTEs creates some issues of its own. Consider

select * from s1.t1
where exists (select 1 from s2.t2 t1 where t1.f1 = s1.t1.f1);

If s1.t1 is now renamed to s1.tx, it is still possible to express
the same semantics:

select * from s1.tx
where exists (select 1 from s2.t2 t1 where t1.f1 = s1.tx.f1);

But when we attach a fake alias, it's broken:

select * from s1.tx t1
where exists (select 1 from s2.t2 t1 where t1.f1 = ?.f1);

There is no way to reference the outer RTE anymore from the subquery,
because the conflicting lower alias masks it.

We may be between a rock and a hard place though, because it's not that
hard to demonstrate cases where not adding a fake alias breaks it too:

select * from s1.t1 tx
where exists (select 1 from s2.t1 where s2.t1.f1 = tx.f1);

If s2.t1 is renamed to s2.tx, there's no longer any way to reference the
upper alias tx, unless you alias the lower RTE to some different name.
I think that when we put in the fake-alias behavior, we made a value
judgment that this type of situation was more common than the other,
but I'm not really sure why.

Maybe what we need to do instead is create totally-made-up, unique
aliases when something like this happens.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-01-28 02:51:26 Re: Unreliable "pg_ctl -w start" again
Previous Message MauMau 2012-01-28 02:36:41 Re: Unreliable "pg_ctl -w start" again