Re: atoi-like function: is there a better way to do this?

From: David Johnston <polobo(at)yahoo(dot)com>
To: Chris Angelico <rosuav(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: atoi-like function: is there a better way to do this?
Date: 2012-03-05 03:50:22
Message-ID: 0CE18B85-3BE3-4B04-8BD7-612D1DEFD5A2@yahoo.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mar 4, 2012, at 22:31, Chris Angelico <rosuav(at)gmail(dot)com> wrote:

> (Hoping you meant for that reply to be on-list as I'm here responding on-list.)
>
> On Mon, Mar 5, 2012 at 2:16 PM, A.M. <agentm(at)themactionfaction(dot)com> wrote:
>>
>> On Mar 4, 2012, at 9:13 PM, Chris Angelico wrote:
>>
>>> One of our tables has a few columns that may be interpreted as strings
>>> or may be numbers (data type is varchar, numbers are stored as
>>> decimal). Generally, operations are performed on the string, but
>>> sometimes we need to parse out a number - without it failing on error.
>>> I wrote the following function to approximate to the semantics of
>>> atoi:
>>
>> I would recommend against such a schema since different data types should warrant their own columns, but if you are left with no choice...
>
> The values have to be strings for other reasons (eg '' is valid
> everywhere, and this is subsequently processed by a script that
> expects all strings). So yeah, no choice there. But I agree that
> normally you DO want integers stored in integer columns, and we're
> paying a performance penalty for this.
>
>>>
>>> create or replace function str2int(val varchar) returns bigint immutable as $$
>>> begin
>>> val=substring(val from '[0-9]*');
>>> if length(val) between 1 and 19 then return val::bigint; end if;
>>> return 0;
>>> end;
>>> $$ language plpgsql;
>>
>> This can be written as:
>> select substring('35345345.45645654' from '\d{1,19}')::bigint;
>>
>> Be aware that this does not account for:
>> 3dogs
>> 3 dogs
>> 3,dogs
>> 3.5.6.7
>>
>> all of which will return 3::bigint, but I assume that the column is not completely free-form or maybe this is what you want. (?)
>>
>> Cheers,
>> M
>
> That is in fact the behaviour I want. Trouble is, that simpler version
> returns NULL if given 'asdf' as its input - I want it to return 0.

COALESCE is your friend

> It's also about the same performance (slightly slower in a quick test)
> than the original; it's still doing a regular expression parse. I was
> hoping very much to avoid the regex altogether.
>
> ChrisA
>

Any efficient, non-RegEx, alternative would require more context to evaluate than you provide. Mainly, would it be faster to have a separate field to store the parsed (at input) number and then query that field directly (even if it is a text field as well)? Basically cache the parse.

David J..

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Chris Angelico 2012-03-05 04:06:49 Re: atoi-like function: is there a better way to do this?
Previous Message Chris Angelico 2012-03-05 03:31:30 Re: atoi-like function: is there a better way to do this?