Re: Implicit casts with generic arrays

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Implicit casts with generic arrays
Date: 2007-06-04 20:30:35
Message-ID: 24070.1180989035@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Awhile back, I wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> I've looked into cutting back on the implicit casts to text, which
>> exposed the following little gem.

>> The expressions
>> 'abc' || 34
>> 34 || 'abc'
>> would no longer work, with the following error message:
>> ERROR: 22P02: array value must start with "{" or dimension information

> Hm, that's annoying. Not that the expressions fail --- we want them to
> --- but that the error message is so unhelpful.

I've looked into this more closely. The problem basically is that the
parser sees these alternatives for binary || operators:

select oid,oid::regoperator,oprcode from pg_operator where oprname = '||';
oid | oid | oprcode
------+-------------------------+---------------
349 | ||(anyarray,anyelement) | array_append
374 | ||(anyelement,anyarray) | array_prepend
375 | ||(anyarray,anyarray) | array_cat
654 | ||(text,text) | textcat
1797 | ||(bit,bit) | bitcat
2018 | ||(bytea,bytea) | byteacat
(6 rows)

If there is no implicit cast from int to text, then operator 349 is the
*only* candidate that is not immediately eliminated by the lack of any
way to cast an integer 34 to its right argument type. So as far as the
parser is concerned there is no ambiguity. If we hack things to prevent
matching unknown to anyarray, as was suggested in the previous
discussion, we'll get "operator does not exist: "unknown" || integer".
Which is better than the 22P02 error, but still not great.

It furthermore seems that the two operators anyarray || anyelement and
anyelement || anyarray are really the only cases where an undesirable
match to anyarray might occur. The other operators that take anyarray
take it on both sides, which means that they'd not be preferred unless
the other operand was discernibly an array. I don't think we want a
solution that causes "knownarraycolumn = '{1,2,3}'" to start failing.

That argument is even more compelling on the function side, because
for instance there isn't a lot of doubt about the user's intent if he
writes "array_append('{1,2,3}', 34)".

So after reflecting on all that, it doesn't seem like a good idea to
hack the type-coercion code to discriminate against matching unknown
to anyarray. It looks to me like we have a very narrow problem and
we should tailor a very narrow solution. What I am currently thinking
we should do is make oper() specifically test for the case of operator
349 with UNKNOWN left input, or operator 374 with UNKNOWN right input,
and throw a custom error message hinting that the other operand
needs to be cast to text.

In the long run maybe we should choose some other name for the
array_append and array_prepend operators to avoid the confusion with
concatenation. It seems to me that "concatenation" normally implies
"stringing together similar objects", which these two operators
definitely don't do, and so you could argue that || was a bad name
for them from the get-go. But compatibility worries would mean we
couldn't eliminate the old names for quite a long time, so maybe
it's too late for that.

Comments?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2007-06-04 20:36:11 Performance regression on CVS head
Previous Message Simon Riggs 2007-06-04 19:58:38 Re: [COMMITTERS] pgsql: Create a GUC parametertemp_tablespacesthatallows selection of