Re: [HACKERS] Reducing the overhead of NUMERIC data

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>,pgsql-hackers(at)postgresql(dot)org, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Reducing the overhead of NUMERIC data
Date: 2005-11-02 19:36:19
Message-ID: 1130960179.8300.1777.camel@localhost.localdomain
Lists: pgsql-hackers, pgsql-patches

On Wed, 2005-11-02 at 13:46 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Tue, 2005-11-01 at 17:55 -0500, Tom Lane wrote:
> >> I don't think it'd be worth having 2 types.  Remember that the weight is
> >> measured in base-10k digits.  Suppose for instance
> >> 	sign		1 bit
> >> 	weight		7 bits (-64 .. +63)
> >> 	dscale		8 bits (0..255)
> 
> > I've coded a short patch to do this, which is the result of two
> > alternate patches and some thinking, but maybe not enough yet.
> 
> What your patch does is

Thanks for checking this so quickly.

> 
> 	sign		2 bits

OK, that's just a mistake in my second patch. That's easily corrected.
Please ignore that for now.

> 	weight		8 bits (-128..127)
> 	dscale		6 bits (0..63)
> 
> which is simply pretty lame: weight effectively has a factor of 8 more
> dynamic range than dscale in this representation.  What's the point of
> being able to represent 1 * 10000 ^ -128 (ie, 10^-512) if the dscale
> only lets you show 63 fractional digits?  You've got to allocate the
> bits in a saner fashion.  Yes, that takes a little more work.
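
The imbalance described above can be checked with a little arithmetic. The following is only a sketch: the helper names and the `DEC_DIGITS` constant are illustrative assumptions, not identifiers taken from the patch. It compares how far to the right of the decimal point a signed weight field can reach (at 4 decimal digits per base-10000 digit) with the largest display scale an unsigned dscale field can hold.

```c
/* Each base-10000 digit carries 4 decimal digits (assumed constant name). */
#define DEC_DIGITS 4

/*
 * Number of decimal places reachable by the most negative value of a
 * signed weight field that is weight_bits wide.
 */
static int
weight_fraction_reach(int weight_bits)
{
    int min_weight = -(1 << (weight_bits - 1));

    return -min_weight * DEC_DIGITS;
}

/* Largest display scale an unsigned field of dscale_bits can hold. */
static int
dscale_max(int dscale_bits)
{
    return (1 << dscale_bits) - 1;
}
```

With an 8-bit weight and 6-bit dscale, `weight_fraction_reach(8)` gives 512 (that is, 10^-512) against `dscale_max(6)` of 63, roughly the factor of 8 mentioned above. With a 7-bit weight and 8-bit dscale the two come out at 256 and 255, which is nearly balanced.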

I wasn't trying to claim the bit assignment made sense. My point was
that the work to mangle the two fields together so that it does make
sense looked like it would take more CPU (since the standard
representation of signed integers is different for +ve and -ve values).
It is the extra CPU I'm worried about, not the wasted bits on the weight.
Spending CPU cycles on *all* numerics just so we can have numbers with
more than +/-64 decimal places doesn't seem a good trade. Hence I stuck
the numeric sign back on the dscale, and so dscale and weight seem out of
balance.
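
To make the CPU concern concrete, here is a rough sketch of one way a 1-bit sign, 7-bit signed weight and 8-bit dscale could share a single 16-bit header. The layout (sign in bit 15, weight in bits 8..14, dscale in bits 0..7) and all `pk_` names are assumptions made for illustration; this is not the code from either patch. The point is that reading the weight back needs a mask, a shift and a sign-extension step, where a full int16 field would need none.

```c
#include <stdint.h>

/* Hypothetical layout: sign bit 15, weight bits 8..14, dscale bits 0..7. */
#define PK_SIGN_MASK    0x8000u
#define PK_WEIGHT_MASK  0x7F00u
#define PK_WEIGHT_SHIFT 8
#define PK_DSCALE_MASK  0x00FFu

/* Pack the three fields; the caller must already have range-checked them. */
static uint16_t
pk_pack(int negative, int weight, int dscale)
{
    uint16_t    h = 0;

    if (negative)
        h |= PK_SIGN_MASK;
    /* Masking keeps the low 7 bits of the two's-complement weight. */
    h |= (uint16_t) (((unsigned) weight << PK_WEIGHT_SHIFT) & PK_WEIGHT_MASK);
    h |= (uint16_t) ((unsigned) dscale & PK_DSCALE_MASK);
    return h;
}

/* Extract the weight, sign-extending it from 7 bits. */
static int
pk_weight(uint16_t h)
{
    int         w = (int) ((h & PK_WEIGHT_MASK) >> PK_WEIGHT_SHIFT);    /* 0..127 */

    /* Bit 6 set means the stored value was negative. */
    if (w & 0x40)
        w -= 0x80;
    return w;
}

static int
pk_dscale(uint16_t h)
{
    return (int) (h & PK_DSCALE_MASK);
}

static int
pk_is_negative(uint16_t h)
{
    return (h & PK_SIGN_MASK) != 0;
}
```

Whether the extra mask, shift and branch are measurable next to the rest of numeric processing is exactly the open question; the sketch only shows where the work would come from.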

So, AFAICS, the options are:
0. (current CVS tip)
   Numeric range up to 1000, with additional 2 bytes per column value
1. Numeric range up to 128, but with overhead to extract last bit
2. Numeric range up to 64

I'm suggesting we choose (2). Other views are welcome.

(I'll code it whichever way we decide.)

> Also, since the internal (unpacked) calculation representation has a
> much wider dynamic range than this, it'd probably be appropriate to add
> some range checks to the code that forms a packed value from unpacked.

Well, there already is one that does that; otherwise I would have added
one as you suggest. (The unpacked code has int values, whereas the
previous packed format used uint16/int16 values.)
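
The existing check is not reproduced in this message. Purely as an illustration of the kind of test involved, a check for the 1/7/8 layout discussed above might look like the sketch below. The function name and the limit macros are hypothetical and do not come from the PostgreSQL source.

```c
#include <stdbool.h>

/* Assumed limits for a 7-bit signed weight and an 8-bit unsigned dscale. */
#define PACKED_WEIGHT_MIN   (-64)
#define PACKED_WEIGHT_MAX   63
#define PACKED_DSCALE_MAX   255

/*
 * Return true if the unpacked (int) weight and dscale can be stored in
 * the packed header without losing information.
 */
static bool
numeric_header_fits(int weight, int dscale)
{
    return weight >= PACKED_WEIGHT_MIN && weight <= PACKED_WEIGHT_MAX &&
        dscale >= 0 && dscale <= PACKED_DSCALE_MAX;
}
```

A caller forming a packed value would test this first and raise an overflow error when it returns false, rather than silently truncating the fields.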

Best Regards, Simon Riggs