pg_dump and join aliases (was Re: [BUGS] ERROR: cannot handle unplanned sub-select)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>
Subject: pg_dump and join aliases (was Re: [BUGS] ERROR: cannot handle unplanned sub-select)
Date: 2010-07-07 22:55:50
Message-ID: 2804.1278543350@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Merlin sent me a test case off-list for the problem mentioned here:
http://archives.postgresql.org/pgsql-bugs/2010-07/msg00025.php

After some investigation I was able to simplify it to the following
example using the regression database:

select
(select sq1) as qq1
from
(select exists(select 1 from int4_tbl where f1 = q2) as sq1, 42 as dummy
from int8_tbl) sq0
join
int4_tbl i4 on dummy = i4.f1;

The problem is that flatten_join_alias_vars() can push SubLink
expressions down into sub-selects, as in this example when it replaces
the "sq1" reference with the EXISTS() subexpression that was previously
pulled up by flattening sq0. But it fails to set the hasSubLinks flag
in the sub-Query, so subsequent processing doesn't think it needs to do
SS_process_sublinks within the sub-Query, and eventually we fail when we
come across the unprocessed SubLink. This bug goes clear back to 7.4.
Fortunately it's simple to fix.

What seems more interesting is that I initially had a hard time
reproducing the bug under different conditions, and didn't figure out
what was going on until I realized that I had used pg_dump to
consolidate the multiple files Merlin sent ... and *reloading pg_dump's
version of the views didn't exhibit the bug*. This is because pg_dump,
or more accurately ruleutils.c, has a habit of qualifying variable
references whether or not they were qualified in the original query.
If you turn the above example into a view and then dump it, you'll get

... (select sq0.sq1) as qq1 ...

and that doesn't tickle this bug. (That's because "sq0.sq1" isn't a
join alias Var, whereas unqualified "sq1" is.)

So the question that seems worth discussing is whether this difference
ought to be considered a bug in ruleutils. In theory it shouldn't
matter if pg_dump adds an "unnecessary" qualification, but here's an
example where it did matter. Do we care? People tend to assume that
dumping and reloading will make no change in the behavior of their
views, so this seems kind of scary to me. On the other hand, the
"extra" qualifications make view definitions a bit more robust in the
face of column additions, renamings, etc. So there's certainly a case
to be made that the dump behavior is preferable as-is.

Thoughts?

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Robert Haas 2010-07-07 23:04:31 Re: pg_dump and join aliases (was Re: [BUGS] ERROR: cannot handle unplanned sub-select)
Previous Message Josh Berkus 2010-07-07 17:54:33 Re: [TESTERS] Location of certs -Windows 7 SSL mode?

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-07-07 23:04:31 Re: pg_dump and join aliases (was Re: [BUGS] ERROR: cannot handle unplanned sub-select)
Previous Message Robert Haas 2010-07-07 22:55:05 Re: Proposal for 9.1: WAL streaming from WAL buffers