Re: Changed SRF in targetlist handling

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changed SRF in targetlist handling
Date: 2016-05-23 20:15:13
Message-ID: 21076.1464034513@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Mon, May 23, 2016 at 2:13 PM, David Fetter <david(at)fetter(dot)org> wrote:
>> On Mon, May 23, 2016 at 01:28:11PM -0500, Merlin Moncure wrote:
>>> +1 on removing LCM.

>> As a green field project, that would make total sense. As a thing
>> decades in, it's not clear to me that that would break less stuff or
>> break it worse than simply disallowing SRFs in the target list, which
>> has been rejected on bugward-compatibility grounds. I suspect it
>> would be even worse because disallowing SRFs in target lists would at
>> least be obvious and localized when it broke code.

> If I'm reading this correctly, it sounds to me like you are making the
> case that removing target list SRF completely would somehow cause less
> breakage than say, rewriting it to a LATERAL based implementation for
> example. With more than a little forbearance, let's just say I don't
> agree.

We should consider single and multiple SRFs in a targetlist as distinct
use-cases; only the latter has got weird properties.

There are several things we could potentially do with multiple SRFs in
the same targetlist. In increasing order of backwards compatibility and
effort required:

1. Throw error if there's more than one SRF.

2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...). This would
have the same behavior as before if the SRFs all return the same number
of rows, and otherwise would behave differently.

3. Rewrite into some other construct that still ends up as a FunctionScan
RTE node, but has the old LCM behavior if the SRFs produce different
numbers of rows. (Perhaps we would not need to expose this construct
as something directly SQL-visible.)

It's certainly arguable that the common use-cases for SRF-in-tlist
don't have more than one SRF per tlist, and thus that implementing #1
would be an appropriate amount of effort. It's worth noting also that
the LCM behavior has been repeatedly reported as a bug, and therefore
that if we do #3 we'll be expending very substantial effort to be
literally bug-compatible with ancient behavior that no one in the
current development group thinks is well-designed. As far as #2 goes,
it would have the advantage that code depending on the same-number-of-
rows case would continue to work as before. David has a point that it
would silently break application code that's actually depending on the
LCM behavior, but how much of that is there likely to be, really?

[ reflects a bit... ] I guess there is room for an option 2-and-a-half:

2.5. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...), but decorate
the FunctionScan RTE to tell the executor to throw an error if the SRFs
don't all return the same number of rows, rather than silently
null-padding. This would have the same behavior as before for the sane
case, and would be very not-silent about cases where apps actually invoked
the LCM behavior. Again, we wouldn't necessarily have to expose such an
option at the SQL level. (Though it strikes me that such a restriction
could have value in its own right, analogous to the STRICT options that
we've invented in some other places to allow insisting on the expected
numbers of rows being returned. ROWS FROM STRICT (srf1(), srf2(), ...),
anybody?)

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter van Hardenberg 2016-05-23 20:15:31 Re: Calling json_* functions with JSONB data
Previous Message David Fetter 2016-05-23 20:05:03 Re: Changed SRF in targetlist handling