Re: patch (for 9.1) string functions

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Erik Rijkers <er(at)xs4all(dot)nl>
Subject: Re: patch (for 9.1) string functions
Date: 2010-07-26 16:04:27
Message-ID: AANLkTinxeY3q8gZNLif_kXWWQe6hgK=aCBTLYmCFnwKv@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2010/7/26 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Mon, Jul 26, 2010 at 10:39 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Mon, Jul 26, 2010 at 9:26 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>>>>> CONCAT('foo', NULL) => 'foo' really the behavior that everyone else
>>>>> implements here?  And why does CONCAT() take a variadic "ANY"
>>>>> argument?  Shouldn't that be variadic TEXT?
>>>>
>>>> What does that accomplish, besides forcing you to sprinkle every
>>>> concat call with text casts (maybe that's not a bad thing?)?
>>>
>>> You could ask the same thing about the existing || operator.  And in
>>> fact, we used to have that behavior.  We changed it in 8.3.  Perhaps
>>> that was a good decision and perhaps it wasn't, but I don't think
>>> using CONCAT() to make an end-run around that decision is the way to
>>> go.
>>
>> It was absolutely a good decision because it prevented type inference
>> in ways that were ambiguous or surprising (for a canonical case see:
>> http://www.mail-archive.com/pgsql-general(at)postgresql(dot)org/msg93224.html).
>>
>> || operator is still pretty tolerant in the 8.3+ world.
>> select interval_col || bool_col; -- error
>> select interval_col::text || bool_col; -- text concatenation
>> select text_col || interval_col || bool_col; -- text concatenation
>>
>> variadic text would require text casts on EVERY non 'unknown' argument
>> which drops it below the threshold of usefulness IMO -- it would be
>> far stricter than vanilla || concatenation.  Ok, pure bikeshed here
>> (shutting my trap now!), but concat() is one of those wonder functions
>> that you want to make as usable and terse as possible.  I don't see
>> the value in making it overly strict.
>
> I'm just very skeptical that we should give our functions argument
> types that are essentially fantasy.  CONCAT() doesn't concatenate
> integers or intervals or boxes: it concatenates strings, and only
> strings.  Surely the right fix, if our casting rules are too
> restrictive, is to fix the casting rules; not to start adding a bunch
> of kludgery in every function we define.
>

Rules are correct probably. The problem is in searching function
algorithm - it is too simple (and fast, just only one cycle). And some
exceptions - like COALESCE and similar are solved specifically on
parser level. We cannot enforce some casting on user level. PostgreSQL
is more strict with timestamps, dates than other databases. Sometimes
you have to do explicit casts, but it clean from context desired
datatype -

SELECT date_trunc('day', current_date) - result is timestamp, but it
is clean, so result have to be date ... When I proposed a parser hook
I though about these functions. With this hook, you can enforce any
necessary casting like some buildin functions does.

so "any"

> The problem with your canonical example (and the other examples of
> this type I've seen) is that they are underspecified.  Given two
> identically-named operators, one of which accepts types T1 and T2, and
> the other of which accepts types T3 and T4, what is one to do with T1
> OP T4?  You can cast T1 to T3 and call the first operator or you can
> cast T4 to T2 and call the second one, and it's really not clear which
> is right, so you had better thrown an error.  The same applies to
> functions: given foo(text) and foo(date) and the call
> foo('2010-04-15'), you had better complain, or you may end up with a
> very sad user.  But sometimes our current casting rules require casts
> in situations where they don't seem necessary, such as
> LPAD(integer_column, '0', 5).  That's not ambiguous because there's
> only one definition of LPAD, and the chances that the user will be
> unpleasantly surprised if you call it seem quite low.
>
> In other words, this problem is not unique to CONCAT().
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2010-07-26 16:10:19 Re: patch (for 9.1) string functions
Previous Message Merlin Moncure 2010-07-26 15:39:40 Re: patch (for 9.1) string functions