Re: Function Column Expansion Causes Inserts To Fail

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: David Johnston <polobo(at)yahoo(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org
Subject: Re: Function Column Expansion Causes Inserts To Fail
Date: 2011-05-31 15:56:29
Message-ID: BANLkTimppK9hBjwrT3qcLZ3KwZLyVXAXCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, May 31, 2011 at 9:24 AM, David Johnston <polobo(at)yahoo(dot)com> wrote:
> From syntax works fine for literals but how would you then get table.column
> values into the function call - where you want to evaluate multiple rows
> from the source table?  In order to feed rows to a function you need the
> function in the SELECT column-list so that it can see the columns in
> question.
>
>  If using the described syntax results in the odd situation where a function
> is called twice I would consider that a bug.  Either it needs to be fixed or
> the system should disallow that syntax from being used.  I haven't tried
> serial pk creation or other side-effects that would not result in such an
> obvious error but it is reasonable to believe that if the duplicate key
> exception is being thrown then other bad - but not catchable things - could
> occur as well.  Even an expensive SELECT statement inside the function would
> make this behavior undesirable - though I am guessing it would otherwise be
> invisible since the SELECT is not a side-effect and thus the engine would
> only return one set of results - though I haven't tested this theory either.
>
> The fact that: SELECT createpkrecord('1') works - returning a "row" - leads
> me to think that decomposing that row should be (but is not) independent of
> the source of that "row".
>
> The work around I described (converting the SELECT function() statement to a
> sub-query and expanding the results in the parent) is fine but if that is
> the only safe way to do it then the alternate method should fail since it is
> unsafe.  Now, back to my first question, are there other alternatives that
> I've overlooked when you want to use the result of a SELECT statement as the
> source of values for a function call?
>
> That is, how would you re-write this to place "createpkrecord(sub)" in a
> FROM clause instead of the SELECT list?
>
> SELECT createpkrecord(sub)
> FROM (SELECT sub FROM generate_series(1, 10) sub ) src;

The basic issue is that:
select (func()).*, if the return type has fields, 'a', 'b', gets expanded to:

select (func()).a, (func()).b;

This is a *huge* gotcha with type returning functions -- in many cases
people only notice the problem indirectly through slow performance.
I've griped about this many times but it's not clear if there's a
solution other than to document and advise workarounds. Typically the
safest way to deal with this is through use of CTE:

> SELECT createpkrecord(sub)
> FROM (SELECT sub FROM generate_series(1, 10) sub ) src;

becomes

with list as (SELECT createpkrecord(sub) as c FROM generate_series(1, 10) sub )
select (c).* from list;

merlin

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Raymond O'Donnell 2011-05-31 16:19:54 Re: troubles with initdb
Previous Message jlhgis 2011-05-31 15:16:35 troubles with initdb