Re: refactoring - share str2*int64 functions

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: refactoring - share str2*int64 functions
Date: 2019-08-29 06:14:54
Message-ID: alpine.DEB.2.21.1908290705310.28828@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Bonjour Michaël,

>> - *ptr && WHATEVER(*ptr)
>> *ptr is redundant, WHATEVER yields false on '\0', and it costs on each
>> char but at the end. It might be debatable in some places, e.g. it is
>> likely that there are no spaces in the string, but likely that there are
>> more than one digit.
>
> Still this makes the checks less robust?

I do not see any downside, but maybe I lack imagination.

On my laptop isWHATEVER is implementd through an array mapping characters
to a bitfield saying whether each char is WHATEVER, depending on the bit.
This array is well defined for index 0 ('\0').

If an implementation is based on comparisons, for isdigit it would be:

c >= '0' && c <= '9'

Then checking c != '\0' is redundant with c >= '0'.

Given the way the character checks are used in the function, we do not go
beyond the end of the string because we only proceed further when a
character is actually recognized, else we return.

So I cannot see any robustness issue, just a few cycles saved.

>> Hmmm. Have you looked at the fallback cases when the corresponding builtins
>> are not available?
>>
>> I'm unsure of a reliable way to detect a generic unsigned int overflow
>> without expensive dividing back and having to care about zero…
>
> Mr Freund has mentioned that here:
> https://www.postgresql.org/message-id/20190717184820.iqz7schxdbucmdmu@alap3.anarazel.de

Yep, that is what I mean by expensive: there is an integer division, which
is avoided if b is known to be 10, hence is not zero and the limit value
can be checked directly on the input without having to perform a division
each time.

>> So I was pretty happy with my two discreet, small and efficient tests.
>
> That's also a matter of code and interface consistency IMHO.

Possibly.

ISTM that part of the motivation is to reduce costs for heavily used
conversions, eg on COPY. Function "scanf" is overly expensive because it
has to interpret its format, and we have to check for overflows…

Anyway, if we assume that the builtins exist and rely on efficient
hardware check, maybe we do not care about the fallback cases which would
just be slow but never executed.

Note that all this is moot, as all instances of string to uint64
conversion do not check for errors.

Attached v7 does create uint64 overflow inline functions. The stuff yet is
not moved to "common/int.c", a file which does not exists, even if
"common/int.h" does.

--
Fabien.

Attachment Content-Type Size
str2int-7.patch text/x-diff 23.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2019-08-29 06:30:07 Re: BUG #15977: Inconsistent behavior in chained transactions
Previous Message Pavel Stehule 2019-08-29 04:36:06 Re: doc: update PL/pgSQL sample loop function