Re: custom function for converting human readable sizes to bytes

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: custom function for converting human readable sizes to bytes
Date: 2016-01-04 15:17:16
Message-ID: CAFj8pRAf2m3aw1Jgj=qRoP7f0LW7u_juqhMF7A1ZBVDh2WD7JA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

2016-01-04 12:46 GMT+01:00 Shulgin, Oleksandr <oleksandr(dot)shulgin(at)zalando(dot)de>
:

> On Wed, Dec 30, 2015 at 8:28 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>
>>
>>
>> 2015-12-30 17:33 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>
>>> On Mon, Dec 28, 2015 at 8:45 AM, Shulgin, Oleksandr
>>> <oleksandr(dot)shulgin(at)zalando(dot)de> wrote:
>>> > I didn't check out earlier versions of this patch, but the latest one
>>> still
>>> > changes pg_size_pretty() to emit PB suffix.
>>> >
>>> > I don't think it is worth it to throw a number of changes together like
>>> > that. We should focus on adding pg_size_bytes() first and make it
>>> > compatible with both pg_size_pretty() and existing GUC units: that is
>>> > support suffixes up to TB and make sure they have the meaning of
>>> powers of
>>> > 2^10, not 10^3. Re-using the table present in guc.c would be a plus.
>>> >
>>> > Next, we could think about adding handling of PB suffix on input and
>>> output,
>>> > but I don't see a big problem if that is emitted as 1024TB or the user
>>> has
>>> > to specify it as 1024TB in a GUC or argument to pg_size_bytes(): an
>>> minor
>>> > inconvenience only.
>>>
>>> +1 to everything in this email.
>>>
>>
>> so I removed support for PB and SI units. Now the
>> memory_unit_conversion_table is shared.
>>
>
> Looks better, thanks.
>
> I'm not sure why the need to touch the regression test for
> pg_size_pretty():
>
> ! 10.5 | 10.5 bytes | -10.5 bytes
> ! 1000.5 | 1000.5 bytes | -1000.5 bytes
> ! 1000000.5 | 977 kB | -977 kB
> ! 1000000000.5 | 954 MB | -954 MB
> ! 1000000000000.5 | 931 GB | -931 GB
> ! 1000000000000000.5 | 909 TB | -909 TB
>
>
fixed

> A nitpick, this loop:
>
> + while (*cp)
> + {
> + if ((isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
> + digits[ndigits++] = *cp++;
> + else
> + break;
> + }
>
> would be a bit easier to parse if spelled as:
>
> + while (*cp && (isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
> + digits[ndigits++] = *cp++;
>

fixed

>
> On the other hand, this seems to truncate the digits silently:
>
> + digits[ndigits] = '\0';
>
> I don't think we want that, e.g:
>
> postgres=# select pg_size_bytes('9223372036854775807.9');
> ERROR: invalid unit "9"
> HINT: Valid units for this parameter are "kB", "MB", "GB", and "TB".
>
> I think making a mutable copy of the input string and truncating it before
> passing to numeric_in() would make more sense--no need to hard-code
> MAX_DIGITS. The same goes for hard-coding MAX_UNIT_LEN, e.g. compare the
> following two outputs:
>
> postgres=# select pg_size_bytes('1 KiB');
> ERROR: invalid unit "KiB"
> HINT: Valid units for this parameter are "kB", "MB", "GB", and "TB".
>
> postgres=# select pg_size_bytes('1024 bytes');
> ERROR: invalid format
>
>
fixed

> I believe we should see a similar error message and a hint in the latter
> case. (No, I don't think we should add support for 'bytes' as a unit, not
> even for "compatibility" with pg_size_pretty()--for one, I don't think it
> would be wise to expect pg_size_bytes() to be able to deparse *every*
> possible output produced by pg_size_pretty() as it's purpose is
> human-readable display; also, pg_size_pretty() can easily produce output
> that doesn't fit into bigint type, or is just negative)
>
> Code comments and doc change need proof-reading by a native English
> speaker, which I am not.
>

Regards

Pavel

>
> --
> Alex
>
>

Attachment Content-Type Size
pg-size-bytes-06.patch text/x-patch 10.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-01-04 15:20:43 Re: Some 9.5beta2 backend processes not terminating properly?
Previous Message Stephen Frost 2016-01-04 15:03:51 Re: Rationalizing Query.withCheckOptions