Re: refactoring - share str2*int64 functions

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: refactoring - share str2*int64 functions
Date: 2019-09-17 02:29:13
Message-ID: 20190917022913.GB1660@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 16, 2019 at 10:08:19AM -0700, Andres Freund wrote:
> On 2019-09-14 15:02:36 +0900, Michael Paquier wrote:
>> - Switching OID to use pg_strtoint32 causes a failure with initdb.
>
> Needs to be debugged too. Although I suspect this might just be that you
> need to use unsigned variant.

No, that's not it. My last message had a typo as I used the unsigned
variant. Anyway, by switching the routines of readfuncs.c to use the
new _check wrappers it is easy to see what the actual issue is. And
here we go with one example:
"FATAL: invalid input syntax for type uint32: "12089 :relkind v"

So, the root of the problem is that the optimized conversion routines
would complain if the end of the string includes incorrect characters
when converting a node from text, which is not something that strtoXX
complains about. So our wrappers atooid() and atoui() accept more
types of strings in input as they rely on the system's strtol(). And
that counts for the failures with UINT and OID. atoi() also is more
flexible on that, which explains the failures for INT, as well as
atol() for LONG (this shows a failure in the regression tests, not at
initdb time though).

One may think that this restriction does not actually apply to
READ_UINT64_FIELD because the routine involves no other things than
queryId. However once I enable -DWRITE_READ_PARSE_PLAN_TREES
-DWRITE_READ_PARSE_PLAN_TREES -DCOPY_PARSE_PLAN_TREES in the builds,
queryId parsing also complains with the patch. So except if we
redesign the node reads we are bound to keep around the wrapper of
strtoXX on HEAD called pg_strtouint64() to avoid an incompatibility
when parsing the 64-bit query ID. We could keep that isolated in
readfuncs.c close to the declarations of atoui() and strtobool()
though. This also points out that pg_strtouint64 of HEAD is
inconsistent with its signed relatives in the treatment of the input
string.

The code paths of the patch calling pg_strtouint64_check to parse
completion tags (spi.c and pg_stat_statements.c) should complain in
those cases as the format of the tags for SELECT and COPY is fixed.

In order to unify the parsing interface and put all the conversion
routines in a single place, I still think that the patch has value so
I would still keep it (with a fix for the queryId parsing of course),
but there is much more to it.

>> So while I agree with you that a switch should be doable, there is a
>> large set of issues to ponder about here, and the patch already does a
>> lot, so I really think that we had better do a closer lookup at those
>> issues separately, once the basics are in place, and consider them if
>> they actually make sense. There is much more than just doing a direct
>> switch in this area with the family of ato*() system calls.
>
> I have no problem applying this separately, but I really don't think
> it's wise to apply this before these problems have been debugged.

Sure. No problem with that line of reasoning.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuro Yamada 2019-09-17 02:34:21 Re: [HACKERS] CLUSTER command progress monitor
Previous Message Alvaro Herrera 2019-09-17 02:08:32 Re: [HACKERS] CLUSTER command progress monitor