Re: refactoring - share str2*int64 functions

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: refactoring - share str2*int64 functions
Date: 2019-09-14 06:02:36
Message-ID: 20190914060236.GG15406@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 13, 2019 at 06:38:31PM -0700, Andres Freund wrote:
> On 2019-09-10 12:05:25 +0900, Michael Paquier wrote:
>> On Mon, Sep 09, 2019 at 05:27:04AM -0700, Andres Freund wrote:
>> Attached is an updated patch? How does it look? I have left the
>> parts of readfuncs.c for now as there are more issues behind that than
>> doing a single switch, short reads are one, long reads a second.
>
> Hm? I don't know what you mean by those issues.

I think that we have more issues than it looks. For example:
- Switching UINT to use pg_strtouint32() causes an incompatibility
issue compared to atoui().
- Switching INT to use pg_strtoint32() causes a set of warnings as for
example with AttrNumber:
72 | (void) pg_strtoint32(token, &local_node->fldname)
| ^~~~~~~~~~~~~~~~~~~~~
| |
| AttrNumber * {aka short int *}
And it is not like we should use a cast either, as we could hide real
issues. Hence it seems to me that we need to have a new routine
definition for shorter integers and switch more flags to that.
- Switching LONG to use pg_strtoint64() leads to another set of
issues, particularly one could see an assertion failure related to Agg
nodes. I am not sure either that we should use int64 here as the size
can be at least 32b.
- Switching OID to use pg_strtoint32 causes a failure with initdb.

So while I agree with you that a switch should be doable, there is a
large set of issues to ponder about here, and the patch already does a
lot, so I really think that we had better do a closer lookup at those
issues separately, once the basics are in place, and consider them if
they actually make sense. There is much more than just doing a direct
switch in this area with the family of ato*() system calls.

>> And the patch already does a lot. There could be also an argument for
>> having extra _check wrappers for the unsigned portions but these would
>> be mostly unused in the backend code, so I have left that out on
>> purpose.
>
> I'd value consistency higher here.

Okay, no objections to that.

>> @@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
>> CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
>>
>> if (strncmp(completionTag, "SELECT ", 7) == 0)
>> - _SPI_current->processed =
>> - pg_strtouint64(completionTag + 7, NULL, 10);
>> + (void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
>
> I'd just use the checked version here, seems like a good thing to check
> for, and I can't imagine it matters performance wise.

Yeah, makes sense. I don't think it matters either for
pg_stat_statements in the same context. So changed that part as
well.

>> @@ -63,8 +63,16 @@ Datum
>> int2in(PG_FUNCTION_ARGS)
>> {
>> char *num = PG_GETARG_CSTRING(0);
>> + int16 res;
>> + pg_strtoint_status status;
>>
>> - PG_RETURN_INT16(pg_strtoint16(num));
>> + /* Use a custom set of error messages here adapted to the data type */
>> + status = pg_strtoint16(num, &res);
>
> I don't know what that comment is supposed to mean?

I mean here that the _check equivalent cannot be used as any error
messages generated need to be consistent with the SQL data type. I
have updated the comment, does it look better now?

>> +/*
>> + * pg_strtoint64_check
>> + *
>> + * Convert input string to a signed 64-bit integer.
>> + *
>> + * This throws ereport() upon bad input format or overflow.
>> + */
>> +int64
>> +pg_strtoint64_check(const char *s)
>> +{
>> + int64 result;
>> + pg_strtoint_status status = pg_strtoint64(s, &result);
>> +
>> + if (unlikely(status != PG_STRTOINT_OK))
>> + pg_strtoint_error(status, s, "int64");
>> + return result;
>> +}
>
> I think I'd just put these as inlines in the header.

I have not considered that. This bloats a bit more builtins.h. We
could live with that, or just move that into a separate header in
include/utils/, say int.h? Even if common/int.h exists?

Attached is an updated patch. Perhaps you have something else in
mind?
--
Michael

Attachment Content-Type Size
str2int-14.patch text/x-diff 39.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-09-14 06:03:57 Re: [HACKERS] [PATCH] pageinspect function to decode infomasks
Previous Message Amit Kapila 2019-09-14 05:48:37 Re: [HACKERS] [PATCH] pageinspect function to decode infomasks