Re: [HACKERS] string_to_array with empty input

From: Greg Stark <stark(at)enterprisedb(dot)com>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: justin <justin(at)emproshunts(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Sam Mason <sam(at)samason(dot)me(dot)uk>, pgsql-general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] string_to_array with empty input
Date: 2009-04-01 18:40:16
Message-ID: 4136ffa0904011140x7e2473f4x714660d8a0cb8bd9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Wed, Apr 1, 2009 at 6:23 PM, David E. Wheeler <david(at)kineticode(dot)com> wrote:
> Right, it's making a special case of '', which does seem rather inconsistent
> to me.

"David E. Wheeler" <david(at)kineticode(dot)com> writes:

> On Apr 1, 2009, at 10:05 AM, justin wrote:
>
>> string_to_array('',',')::INT[] works as proposed
>>
>> But
>> string_to_array(',,,', ',' )::INT[] Fails
>> or
>> string_to_array('1,2,,4', ',' )::INT[] Fails .
>>
>>
>> I'm trying to understand the difference between a empty string to a string
>> with many blank entries between the delimiter.

Well, uh, in one case it's empty and in the other case it's not?

>> Consider ',,,,,,' = '' once the delimiter is removed . Yet Seven zero
>> length entries were passed. How is that going to be handled????

Well it's pretty clear empty delimiters cannot be handled consistently. Some
languages handle them as a special case (splitting every character into a
separate string, for example -- which I'll point out will result in an empty
array as a result for an empty string input) or make it an error.

> Right, it's making a special case of '', which does seem rather inconsistent
> to me.

It's not a special case -- or it's a special case whichever we choose,
depending on which way you look at it.

What we're talking about here is replacing the blank values in the following
tables. We can get either the first one right in both cases with {} as the
result, or we can get the second one right in the second table with {""}.
Either way there is an inconsistency in at least one case.

The existing behaviour of returning NULL is the only "consistent" choice since
the correct value is "unknown". And one could argue that it's easier to
replace NULL with the correct value if the programmer knows using coalesce
than it is to replace either "" or {""}. But I'm still leaning to thinking
that using an arbitrary choice that at least gets most users intentions is
better.

postgres=# select input,
string_to_array(array_to_string(input,','),',') as output from (values
(array[]::text[]),(array['foo']),(array['foo','bar']),(array['foo','bar','baz']))
as input(input);
input | output
---------------+---------------
{} |
{foo} | {foo}
{foo,bar} | {foo,bar}
{foo,bar,baz} | {foo,bar,baz}
(4 rows)

postgres=# select input,
string_to_array(array_to_string(input,','),',') as output from (values
(array[]::text[]),(array['']),(array['','']),(array['','',''])) as
input(input);
input | output
------------+------------
{} |
{""} |
{"",""} | {"",""}
{"","",""} | {"","",""}
(4 rows)

--
greg

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Agustin Bialet 2009-04-01 18:40:27 Re: consulta demasiado grande
Previous Message Sam Mason 2009-04-01 18:27:09 Re: [HACKERS] string_to_array with empty input

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-04-01 18:55:20 Re: Patch to speed up pg_dump
Previous Message Sam Mason 2009-04-01 18:27:09 Re: [HACKERS] string_to_array with empty input