Re: [HACKERS] Reducing the overhead of NUMERIC data

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-hackers(at)postgresql(dot)org, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Reducing the overhead of NUMERIC data
Date: 2005-11-02 19:36:19
Message-ID: 1130960179.8300.1777.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-patches

On Wed, 2005-11-02 at 13:46 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Tue, 2005-11-01 at 17:55 -0500, Tom Lane wrote:
> >> I don't think it'd be worth having 2 types. Remember that the weight is
> >> measured in base-10k digits. Suppose for instance
> >> sign 1 bit
> >> weight 7 bits (-64 .. +63)
> >> dscale 8 bits (0..255)
>
> > I've coded a short patch to do this, which is the result of two
> > alternate patches and some thinking, but maybe not enough yet.
>
> What your patch does is

Thanks for checking this so quickly.

>
> sign 2 bits

OK, that's just a mistake in my second patch. That's easily corrected.
Please ignore that for now.

> weight 8 bits (-128..127)
> dscale 6 bits (0..63)
>
> which is simply pretty lame: weight effectively has a factor of 8 more
> dynamic range than dscale in this representation. What's the point of
> being able to represent 1 * 10000 ^ -128 (ie, 10^-512) if the dscale
> only lets you show 63 fractional digits? You've got to allocate the
> bits in a saner fashion. Yes, that takes a little more work.

I wasn't trying to claim the bit assignment made sense. My point was
that mangling the two fields together into a sensible layout looked like
it would cost more CPU (since the standard representation of signed
integers differs for +ve and -ve values). It's that extra CPU I'm
worried about, not the wasted bits on the weight. Spending cycles on
*all* numerics just so we can have numbers with > +/-64 decimal places
doesn't seem a good trade. Hence I stuck the numeric sign back on the
dscale, which is why dscale and weight seem out of balance.

So, AFAICS, the options are:
0. (current cvstip) Numeric range up to 1000, with an additional 2 bytes
per column value
1. Numeric range up to 128, but with overhead to extract the last bit
2. Numeric range up to 64

I'm suggesting we choose (2).... other views are welcome.

(I'll code it whichever way we decide.)

> Also, since the internal (unpacked) calculation representation has a
> much wider dynamic range than this, it'd probably be appropriate to add
> some range checks to the code that forms a packed value from unpacked.

Well, there already is a check that does that; otherwise I would have
added one as you suggest. (The unpacked code uses int values, whereas
the previous packed format used (u)int16 values.)

Best Regards, Simon Riggs
