Re: custom function for converting human readable sizes to bytes

From: Thom Brown <thom(at)linux(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: custom function for converting human readable sizes to bytes
Date: 2016-01-04 16:16:41
Message-ID: CAA-aLv5NTUAym+QqLscYGO3v5kPFgJJhwd2wmTVrf+g5ec3W0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4 January 2016 at 15:17, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> Hi
>
> 2016-01-04 12:46 GMT+01:00 Shulgin, Oleksandr
> <oleksandr(dot)shulgin(at)zalando(dot)de>:
>>
>> On Wed, Dec 30, 2015 at 8:28 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>> wrote:
>>>
>>>
>>>
>>> 2015-12-30 17:33 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>>>
>>>> On Mon, Dec 28, 2015 at 8:45 AM, Shulgin, Oleksandr
>>>> <oleksandr(dot)shulgin(at)zalando(dot)de> wrote:
>>>> > I didn't check out earlier versions of this patch, but the latest one
>>>> > still
>>>> > changes pg_size_pretty() to emit PB suffix.
>>>> >
>>>> > I don't think it is worth it to throw a number of changes together
>>>> > like
>>>> > that. We should focus on adding pg_size_bytes() first and make it
>>>> > compatible with both pg_size_pretty() and existing GUC units: that is
>>>> > support suffixes up to TB and make sure they have the meaning of
>>>> > powers of
>>>> > 2^10, not 10^3. Re-using the table present in guc.c would be a plus.
>>>> >
>>>> > Next, we could think about adding handling of PB suffix on input and
>>>> > output,
>>>> > but I don't see a big problem if that is emitted as 1024TB or the user
>>>> > has
>>>> > to specify it as 1024TB in a GUC or argument to pg_size_bytes(): an
>>>> > minor
>>>> > inconvenience only.
>>>>
>>>> +1 to everything in this email.
>>>
>>>
>>> so I removed support for PB and SI units. Now the
>>> memory_unit_conversion_table is shared.
>>
>>
>> Looks better, thanks.
>>
>> I'm not sure why the need to touch the regression test for
>> pg_size_pretty():
>>
>> ! 10.5 | 10.5 bytes | -10.5 bytes
>> ! 1000.5 | 1000.5 bytes | -1000.5 bytes
>> ! 1000000.5 | 977 kB | -977 kB
>> ! 1000000000.5 | 954 MB | -954 MB
>> ! 1000000000000.5 | 931 GB | -931 GB
>> ! 1000000000000000.5 | 909 TB | -909 TB
>>
>
> fixed
>
>>
>> A nitpick, this loop:
>>
>> + while (*cp)
>> + {
>> + if ((isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
>> + digits[ndigits++] = *cp++;
>> + else
>> + break;
>> + }
>>
>> would be a bit easier to parse if spelled as:
>>
>> + while (*cp && (isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
>> + digits[ndigits++] = *cp++;
>
>
> fixed
>
>>
>>
>> On the other hand, this seems to truncate the digits silently:
>>
>> + digits[ndigits] = '\0';
>>
>> I don't think we want that, e.g:
>>
>> postgres=# select pg_size_bytes('9223372036854775807.9');
>> ERROR: invalid unit "9"
>> HINT: Valid units for this parameter are "kB", "MB", "GB", and "TB".
>>
>> I think making a mutable copy of the input string and truncating it before
>> passing to numeric_in() would make more sense--no need to hard-code
>> MAX_DIGITS. The same goes for hard-coding MAX_UNIT_LEN, e.g. compare the
>> following two outputs:
>>
>> postgres=# select pg_size_bytes('1 KiB');
>> ERROR: invalid unit "KiB"
>> HINT: Valid units for this parameter are "kB", "MB", "GB", and "TB".
>>
>> postgres=# select pg_size_bytes('1024 bytes');
>> ERROR: invalid format
>>
>
> fixed

Hi,

Some feedback:

+ Converts a size in human-readable format with size units
+ into bytes. The parameter is case insensitive string. Following
+ units are supported: kB, MB, GB, TB.

May I suggest:

"Converts a size in human-readable format with size units into bytes.
The parameter is case-insensitive, and the supported size units are
are: kB, MB, GB and TB."

+ * Convert human readable size to long int.

Big int?

+ * Due suppor decimal value and case insensitivity of units
+ * a function parse_intcannot be used.

Is this supposed to be saying:

"Due to support for decimal values..."?

and "a function parse_int cannot be used."?

+ * Use buffer as unit if there are not any nonspace char,
+ * else use a original unit string.

s/use a/use an/

+ * Now, the multiplier is in KB unit. It should be multiplied by 1024
+ * before usage

s/unit/units/

Regards

Thom

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2016-01-04 16:31:51 Re: Building pg_xlogdump reproducibly
Previous Message Pavel Stehule 2016-01-04 16:04:50 Re: custom function for converting human readable sizes to bytes