Re: Subquery with toplevel reference used to work in pg 8.4

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Murawski <markm-lists(at)intellasoft(dot)net>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Subquery with toplevel reference used to work in pg 8.4
Date: 2012-03-24 16:35:11
Message-ID: 19235.1332606911@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Mark Murawski <markm-lists(at)intellasoft(dot)net> writes:
> I agree the query is a little odd, but I like backwards compatibility!

AFAICT, 8.4 is broken too --- did you try any cases where the
WHERE-condition should filter rows?

I created this similar test case using the regression database:

select * from
int8_tbl t1 left join
(select q1 as x, 42 as y from int8_tbl t2) ss
on t1.q2 = ss.x
where
1 = (select 1 from int8_tbl t3 where ss.y is not null limit 1);

The raw output without any WHERE clause is

q1 | q2 | x | y
------------------+-------------------+------------------+----
123 | 456 | |
123 | 4567890123456789 | 4567890123456789 | 42
123 | 4567890123456789 | 4567890123456789 | 42
123 | 4567890123456789 | 4567890123456789 | 42
4567890123456789 | 123 | 123 | 42
4567890123456789 | 123 | 123 | 42
4567890123456789 | 4567890123456789 | 4567890123456789 | 42
4567890123456789 | 4567890123456789 | 4567890123456789 | 42
4567890123456789 | 4567890123456789 | 4567890123456789 | 42
4567890123456789 | -4567890123456789 | |
(10 rows)

The WHERE clause ought to be effectively just the same as "where ss.y is
not null", ie it should eliminate the two null-extended rows. And
in 8.3 and before, that's what you get:

q1 | q2 | x | y
------------------+------------------+------------------+----
123 | 4567890123456789 | 4567890123456789 | 42
123 | 4567890123456789 | 4567890123456789 | 42
123 | 4567890123456789 | 4567890123456789 | 42
4567890123456789 | 123 | 123 | 42
4567890123456789 | 123 | 123 | 42
4567890123456789 | 4567890123456789 | 4567890123456789 | 42
4567890123456789 | 4567890123456789 | 4567890123456789 | 42
4567890123456789 | 4567890123456789 | 4567890123456789 | 42
(8 rows)

but 8.4 and 9.0 produce all 10 rows, ie no filtering happens.
And 9.1 and HEAD produce
ERROR: Upper-level PlaceHolderVar found where not expected

After investigating, I believe the problem is that
SS_replace_correlation_vars needs to replace outer PlaceHolderVars just
as if they were outer Vars. What is getting pushed into the subquery
is a PlaceHolderVar wrapping the constant 42, and if that's left alone
then the subquery WHERE clause ends up as just "42 is not null", which
is not what we need. That has to be converted into a Param referencing
a value from the outer query. 9.1 and HEAD are correctly bleating about
the fact that this outer-level PlaceHolderVar shouldn't be there by the
time the complaining code runs.

Kinda surprising that this bug escaped detection this long ...

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2012-03-24 23:07:20 Re: check_locale() and the empty string
Previous Message Jaime Casanova 2012-03-24 04:47:44 Re: Subquery with toplevel reference used to work in pg 8.4