Re: reducing NUMERIC size for 9.1

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: reducing NUMERIC size for 9.1
Date: 2010-07-30 17:42:04
Message-ID: AANLkTinHRCH7h+rY1nba3yV2Fz-cjiJQ77xVzzZRC+sh@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 30, 2010 at 9:16 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Jul 30, 2010 at 1:13 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Perhaps, but I think you're confused on at least one point.
>>> numeric(2,1) has to be able to hold 2 decimal digits, not 2
>>> NumericDigits (which'd actually be 8 decimal digits given
>>> the current code).
>
>> I get that.  The point is: if one of those 2 decimal digits is before
>> the decimal point and the other is after it, then two NumericDigits
>> will be used.
>
> Ah, I see.  Maybe we should allow for one more NumericDigit in the
> calculation for such cases.  I guess you could look at the scale too
> to detect if the case is possible, but not sure it's worth the trouble.

Since this just needs to be a reasonable upper bound, probably not.

Another small but related issue is this comment is out-of-date:

/* Numeric stores 2 decimal digits/byte, plus header */
return (precision + 1) / 2 + NUMERIC_HDRSZ;

Currently, numeric stores 4 decimal digits/2 bytes, which is not quite
the same thing. I think that for clarity it would make sense to break
this calculation in two parts. First, estimate the number of
NumericDigits that could be needed; then, estimate the amount of space
that could be needed to store them. Maybe something like this,
obviously with a suitable comment which I haven't written yet:

numeric_digits = (precision + 6) / 4;
return (numeric_digits * sizeof(int16)) + NUMERIC_HDRSZ;

The first line would be incorrect for precision 0 (it would produce 1,
not 0) but precision 0 isn't allowed anyway, and overestimating just
that one case would be OK even if it did. As far as I can see, it's
correct for all higher values. If you have 1 decimal digit, you can't
need more than one NumericDigit, but if you have 2 decimal digits, you
can need one for each side of the decimal point. But you can't manage
to get a third NumericDigit until you get up to 6 decimal digits,
because with only 5 you can split them 1-4 or 3-2 across the decimal
point, but getting to a second NumericDigit on either side of the
decimal point uses up all 5 digits, leaving nothing for the other
side. Similarly, to get a fourth NumericDigit, you have to have
enough decimal digits to split them 1-4-4-1 (with the decimal point
somewhere in the middle), so 10 is the minimum.

Another question here is whether we should just fix this in CVS HEAD,
or whether there's any reason to back-patch it. I can't see much
reason to back-patch, unless there's third-party code depending on
this for something more than what we use it for. The only callers of
type_maximum_size are get_typavgwidth(), which doesn't need to be
exact, and needs_toast_table(). So it seems that the worst thing that
could happen as a result of this problem is that we might fail to
create a toast table when one is needed, and even that seems extremely
unlikely. First, you'd need to have a table containing no text or
unlimited-length varchar columns, because those will force toast table
creation anyway. Second, numerics are typically going to be stored
with a short varlena header on disk, so a 2-byte underestimate of the
max size is going to be swamped by a 3-byte savings on the varlena
header. Third, even if you can find a case where the above doesn't
matter, it's not that hard to force a toast table to get created.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-07-30 18:03:59 Re: gincostestimate
Previous Message Alex Hunsaker 2010-07-30 17:34:45 Re: patch for check constraints using multiple inheritance