Re: PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Dilger <hornschnorter(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Regina Obe <lr(at)pcorp(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity
Date: 2017-06-07 18:57:22
Message-ID: 13450.1496861842@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Mark Dilger <hornschnorter(at)gmail(dot)com> writes:
>> On Jun 4, 2017, at 2:19 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> Seems very unlikely that we'd ever want to do that. The right way to do
>> this is to simply move the SRF into the from list. Having the executor
>> support arbitrary sources of tuples would just complicate and slow down
>> already complicated and slow code...

> In my example, the aggregate function is taking a column from the table as
> an argument, so the output of the aggregate function needs to be computed per row,
> not just once. And if the function is expensive, or has side-effects, you might
> only want it to execute for those rows where the CASE statement is true, rather
> than for all of them. You may get that same behavior using lateral or some such,
> I'm uncertain, but in a complicated CASE statement, it be more straightforward
> to write something like:

> SELECT
> CASE
> WHEN t.x = 'foo' THEN expensive_aggfunc1(srf1(t.y,t.z))
> WHEN t.x = 'bar' THEN expensive_aggfunc2(srf2(t.y,t.z))
> WHEN t.x = 'baz' THEN expensive_aggfunc3(srf3(t.y,t.z))
> ....
> WHEN t.x = 'zzz' THEN expensive_aggfuncN(srfN(t.y,t.z))
> ELSE 5
> END
> FROM mytable t;

I think the correct way to do that already exists, namely to use a
correlated sub-select to wrap each SRF+aggregate:

...
WHEN t.x = 'foo' THEN (SELECT expensive_aggfunc1(s) FROM srf1(t.y,t.z) s)
...

I don't really feel a need to invent some other notation for that.

After chewing on this for awhile, I'm starting to come to the conclusion
that we'd be best off to throw an error for SRF-inside-CASE (or
COALESCE). Mark is correct that the simplest case of

SELECT x, CASE WHEN y THEN generate_series(1,z) ELSE 5 END
FROM table_with_columns_x_and_y_and_z;

behaves just intuitively enough that people might be using it. The new
implementation method cannot reasonably duplicate the old semantics for
that, which means that if we let it stand as-is we will be silently
breaking queries, even if we fix up some of the weirder corner cases like
what happens when the CASE can be const-simplified. So I think we'd be
better off to make this throw an error, and force any affected users to
rewrite in a way that will work in both v10 and older releases.

As to *how* to throw an error, I think it should be possible to teach
parse analysis to detect such cases, with something like the
ParseExprKind mechanism that could be checked to see if we're inside
a subexpression that restricts what's allowed. There are some other
checks like no-nested-aggregates that perhaps could be folded in as
well. Checking at parse analysis ought to be sufficient because
rule rewriting could not introduce such a case where it wasn't before,
and planner subquery flattening won't introduce one either because we
don't flatten subqueries with SRFs in their tlists.

If people are on board with throwing an error, I'll go see about
writing a patch.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2017-06-07 19:00:44 Re: PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity
Previous Message Fabrízio de Royes Mello 2017-06-07 18:46:45 Re: Directory pg_replslot is not properly cleaned