Is this a bug or a feature? Column visibility in subquery from outer query

From: Paolo Saul <paolo(dot)saul(at)telogis(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Is this a bug or a feature? Column visibility in subquery from outer query
Date: 2009-05-18 05:14:22
Message-ID: 265789020905172214s3933d333qb9bec616b7082b4b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

postgres=# create table public.ps_test_x (x1 oid, x2 oid, x3 oid);
CREATE TABLE
postgres=# create table public.ps_test_y (y1 oid, y2 oid, y3 oid);
CREATE TABLE
postgres=# explain select * from public.ps_test_x where x1 in (select x1
from public.ps_test_y);
QUERY PLAN
---------------------------------------------------------------------
Seq Scan on ps_test_x (cost=0.00..28462.75 rows=885 width=12)
Filter: (subplan)
SubPlan
-> Seq Scan on ps_test_y (cost=0.00..27.70 rows=1770 width=0)
(4 rows)

postgres=# explain select * from public.ps_test_x where x1 in (select x2
from public.ps_test_y);
QUERY PLAN
---------------------------------------------------------------------
Seq Scan on ps_test_x (cost=0.00..28462.75 rows=885 width=12)
Filter: (subplan)
SubPlan
-> Seq Scan on ps_test_y (cost=0.00..27.70 rows=1770 width=0)
(4 rows)

postgres=# explain select * from public.ps_test_x where x1 in (select x3
from public.ps_test_y);
QUERY PLAN
---------------------------------------------------------------------
Seq Scan on ps_test_x (cost=0.00..28462.75 rows=885 width=12)
Filter: (subplan)
SubPlan
-> Seq Scan on ps_test_y (cost=0.00..27.70 rows=1770 width=0)
(4 rows)

postgres=# explain select * from public.ps_test_x where x1 in (select x4
from public.ps_test_y);
ERROR: column "x4" does not exist
LINE 1: ...elect * from public.ps_test_x where x1 in (select x4 from pu...
^
postgres=# explain select * from public.ps_test_x where x1 in (select y1
from public.ps_test_y);
QUERY PLAN
-------------------------------------------------------------------------------
Hash Join (cost=36.62..88.66 rows=1770 width=12)
Hash Cond: (ps_test_x.x1 = ps_test_y.y1)
-> Seq Scan on ps_test_x (cost=0.00..27.70 rows=1770 width=12)
-> Hash (cost=34.12..34.12 rows=200 width=4)
-> HashAggregate (cost=32.12..34.12 rows=200 width=4)
-> Seq Scan on ps_test_y (cost=0.00..27.70 rows=1770
width=4)
(6 rows)

I just want to point out that the sub-query is using a column from the outer
query (eg. x1) without an alias from the table in the outer query. This can
lead to a confusion when, for example:

delete from table1 where foreign_id in (select foreign_id from table2)

-- ! table2 does not have the foreign_id column !

This would do a table scan on table1 and delete all its rows.
Why isn't it like:

delete from table1 where foreign_id in (select table1.foreign_id from
table2)

where you must specify the outer query's table reference inside the
subquery. I suspect the original intent was to use the outer query columns
in some processing inside the subquery, which is valid. I'm just wondering
why an explicit reference isn't required to distinguish the column. For
convenience, or part of the SQL spec?

Cheers!

--Paolo Saul

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Stuart Bishop 2009-05-18 05:21:35 Re: [fulltext]Gin index full scan
Previous Message John R Pierce 2009-05-18 03:19:31 Re: Daylight saving time question