Re: custom function for converting human readable sizes to bytes

From: Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: custom function for converting human readable sizes to bytes
Date: 2016-02-18 02:00:18
Message-ID: CAKOSWNni38qQh0tszLY9scknD_Q6D2xXfhOA5x2fJ061PWD0aQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/17/16, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> On 17 February 2016 at 00:39, Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
> wrote:
>> Now parse_memory_unit returns -1024 for bytes as divider, constant
>> "bytes" has moved there.
>> Add new memory_units_bytes_hint which differs from an original
>> memory_units_int by "bytes" size unit.
>> Allow hintmsg be NULL and if so, skip setting dereferenced variable to
>> memory_units_bytes_hint.
>>
>
> I think that approach is getting more and more unwieldy, and it simply
> isn't worth the effort just to share a few values from the unit
> conversion table, especially given that the set of supported units
> differs between the two places.
>
>> On 2/16/16, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>>> ISTM that it would be far less code, and much simpler and more
>>> readable to just parse the supported units directly in
>>> pg_size_bytes(), rather than trying to share code with guc.c, when the
>>> supported units are actually different and may well diverge further in
>>> the future.
>
> I've gone with this approach and it is indeed far less code, and much
> simpler and easier to read. This will also make it easier to
> maintain/extend in the future.
>
> I've made a few minor tweaks to the docs, and added a note to make it
> clear that the units in these functions work in powers of 2 not 10.
>
> I also took the opportunity to tidy up the number scanning code
> somewhat (I was tempted to rip it out entirely, since it feels awfully
> close to duplicating the numeric code, but it's probably worth it for
> the better error message).

Yes, the better error message was the only reason to leave that scan.

> Additionally, note that I replaced strcasecmp() with pg_strcasecmp(),
> since AIUI the former is not available on all supported platforms.

Thank you. I didn't know that function exists.

> Barring objections, and subject to some more testing, I intend to
> commit this version.
>
> Regards,
> Dean

Than you for your work. Your version seems even better that refactored by me.
But I have several questions (below).

> + bool have_digits;
Agree, "main_digits" and "error" together was redundant, "have_digits"
is enough.

> + else
> + have_digits = false;
Does it worth to move it to the declaration and remove that "else" block?
+ bool have_digits = false;

Is it important to use:
> + ObjectIdGetDatum(InvalidOid),
> + Int32GetDatum(-1)));
instead of "0, -1"? Previously I left immediate constants because I
found the same thing in jsonb.c (rows 363, 774)...

> + if (pg_strcasecmp(strptr, "bytes") == 0)
> + else if
> + else if
> + else if
It seems little ugly for me. In that case (not using values from GUC)
I'd create a static array for it and do a small cycle via it. Would it
better?

> + errmsg("invalid size: \"%s\"", text_to_cstring(arg)),
Agree. Usage of "text_to_cstring" again is a good idea for improving
readability.

> + NumericGetDatum(mul_num),
Two more space for indentation.

Also I've found that with allowing exponent there is a weird result
(we tried to avoid "type numeric" in the error message):
SELECT pg_size_bytes('123e100000000000000000000000000000000000000000000000000000000000000000000000
kB');
ERROR: invalid input syntax for type numeric:
"123e100000000000000000000000000000000000000000000000000000000000000000000000"

[1]http://www.postgresql.org/docs/9.5/static/error-style-guide.html#AEN110702
--
Best regards,
Vitaly Burovoy

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2016-02-18 02:02:11 Re: exposing pg_controldata and pg_config as functions
Previous Message Tatsuo Ishii 2016-02-18 01:23:38 Re: Figures in docs