Re: [HACKERS] Fix number skipping in to_number

From: Ioseph Kim <pgsql-kr(at)postgresql(dot)kr>
To: Oliver Ford <ojford(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Fix number skipping in to_number
Date: 2018-10-23 02:41:08
Message-ID: 6906b0fe-e037-5070-711d-6cd067ecf47b@postgresql.kr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi.

I'm checking release note for version 11.

in that.

|"L| and |TH| now only consume characters that are not digits,
positive/negative signs, decimal points, or commas."

postgres(at)postgres=# select to_number('1234', '+9999');
 to_number
-----------
       234

Is this right?

Regards, ioseph.

||||

2017년 08월 10일 19:21에 Oliver Ford 이(가) 쓴 글:
> Prevents an issue where numbers can be skipped in the to_number()
> function when the format mask contains a "G" or a "," but the input
> string doesn't contain a separator. This resolves the TODO item "Fix
> to_number() handling for values not matching the format string".
>
> == Change ==
>
> Currently, if the format mask in to_number() has a "G" or a ",", it
> will assume that the input string contains a separator character in
> the same place. If however a number is there instead, that number will
> be silently skipped and not appear in the output. So we get:
>
> select to_number('34,50','999,99');
> to_number
> -----------
> 340
> (1 row)
>
> This patch checks the input string when it encounters a "G" or "," in
> the format mask. If the separator character is found, the code moves
> over it as normal. If it's not found, then the code no longer
> increments the relevant pointer so as not to skip the character. After
> the patch, we get the correct result:
>
> select to_number('34,50','999,99');
> to_number
> -----------
> 3450
> (1 row)
>
> This is in line with Oracle's result.
>
> == Rationale ==
>
> This patch is a small change, which leaves PostgreSQL behavior
> different from Oracle behavior in related cases. Oracle's
> implementation seems to read from right-to-left, and raises an
> "ORA-01722: invalid number" error if there are digits in the input
> string which don't have corresponding characters in the format mask. I
> have chosen not to throw such errors, because there are use cases for
> only returning part of a number string. For example, the following is
> an error on Oracle:
>
> select to_number('123,000', '999G') from dual;
> Error report -
> SQL Error: ORA-01722: invalid number
>
> But if you wanted to only take the characters before the comma, and
> discard the thousands part, you can do so on PostgreSQL with:
>
> select to_number('123,000', '999G');
> to_number
> -----------
> 123
> (1 row)
>
> This is the current behavior. Which is why I think it makes more sense
> to do what PostgreSQL currently does and read from left-to-right. The
> only change, as mentioned above, is that the current behavior can skip
> a digit:
>
> select to_number('123456', '999G999');
> to_number
> -----------
> 12356
> (1 row)
>
> After the patch, this returns all the digits:
>
> select to_number('123456', '999G999');
> to_number
> -----------
> 123456
> (1 row)
>
> == Testing ==
>
> Tested on Windows with MinGW using the latest checkout from master.
> Added regression tests to check for this new behavior. All existing
> tests pass.
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-10-23 03:16:06 Re: BUG #15448: server process (PID 22656) was terminated by exception 0xC0000005
Previous Message David Rowley 2018-10-23 02:02:02 Re: Speeding up INSERTs and UPDATEs to partitioned tables