Re: type design guidance needed

From: "Evgeni E(dot) Selkov" <selkovjr(at)mcs(dot)anl(dot)gov>
To: brook(at)biology(dot)nmsu(dot)edu
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: type design guidance needed
Date: 2000-09-23 04:41:41
Message-ID: 200009230441.XAA09037@juju.mcs.anl.gov
Lists: pgsql-hackers

Brook,

I have been contemplating such a data type for years. I believe I have
assembled the most important parts, but I have not had time to
complete the whole thing.

The idea is that the units of measurement can be treated as arithmetic
expressions. One can assign each of the few existing base units a
fixed position in a bit vector, parse the expression, then evaluate it
to obtain three things: a scale factor, a numerator and a denominator,
the latter two being bit vectors.

So, if you assign the base units as

  'm'    => 1,
  'kg'   => 2,
  's'    => 4,
  'K'    => 8,
  'mol'  => 16,
  'A'    => 32,
  'cd'   => 64,

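the unit umol/min/mg will be represented as

(0.01667, 00010000, 00000110).

The scale factor is 1/60: umol is 10^-6 mol and mg is 10^-6 kg, so the
prefixes cancel, leaving one minute = 60 s in the denominator.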
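A minimal Python sketch of this encoding, for illustration only. The
lookup table KNOWN and the function name encode are hypothetical and are
not taken from the Perl parser mentioned below; the sketch also skips
parsing and takes the numerator and denominator as ready-made lists.

```python
# Bit positions for the base units, as assigned above.
BASE = {'m': 1, 'kg': 2, 's': 4, 'K': 8, 'mol': 16, 'A': 32, 'cd': 64}

# Hypothetical lookup table: symbol -> (factor relative to base unit, base unit).
KNOWN = {
    'umol': (1e-6, 'mol'),
    'min': (60.0, 's'),
    'mg': (1e-6, 'kg'),
}

def encode(numerator, denominator):
    """Return (scale, numerator_bits, denominator_bits) for lists of unit symbols."""
    scale, num_bits, den_bits = 1.0, 0, 0
    for symbol in numerator:
        factor, base = KNOWN[symbol]
        scale *= factor            # units on top multiply the scale
        num_bits |= BASE[base]
    for symbol in denominator:
        factor, base = KNOWN[symbol]
        scale /= factor            # units underneath divide it
        den_bits |= BASE[base]
    return scale, num_bits, den_bits

scale, num, den = encode(['umol'], ['min', 'mg'])
print(round(scale, 5), format(num, '08b'), format(den, '08b'))
# prints: 0.01667 00010000 00000110
```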

Such a structure is compact enough to be stashed into an atomic type.
In fact, one needs more than just a plain bit vector to represent
exponents:

umol/min/ml => (0.01667, '00010000', '00000103') (because ml is a volume, i.e. m^3)

Here I use a whole character per position for clarity, but one does not
need more than two or three bits per unit -- you normally don't have
kg^4 or m^7 in your units.
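As a sketch of how small exponents could be packed into one integer,
assume three bits per base unit (exponents 0 to 7), which gives 21 bits
per vector. The field width and the names pack and unpack are
assumptions made for this example, not part of any existing code.

```python
# Base units in the same order as the bit positions above.
UNITS = ['m', 'kg', 's', 'K', 'mol', 'A', 'cd']
BITS = 3                    # assumed field width: exponents 0..7
MASK = (1 << BITS) - 1

def pack(exponents):
    """Pack a {unit: exponent} dict into a single integer."""
    word = 0
    for i, unit in enumerate(UNITS):
        e = exponents.get(unit, 0)
        if not 0 <= e <= MASK:
            raise ValueError('exponent out of range for %s: %d' % (unit, e))
        word |= e << (i * BITS)
    return word

def unpack(word):
    """Recover the {unit: exponent} dict, omitting zero exponents."""
    result = {}
    for i, unit in enumerate(UNITS):
        e = (word >> (i * BITS)) & MASK
        if e:
            result[unit] = e
    return result

den = pack({'m': 3, 's': 1})   # denominator of umol/min/ml
print(format(den, '08o'))      # prints: 00000103
```

With three bits per field, each octal digit of the packed word is one
exponent, so the octal form reads the same as the
character-per-position notation used in the example.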

I considered other alternatives, but none seemed as good as an atomic
type. I bet you will see performance problems and an indexing
nightmare with non-atomic solutions well before you hit the space
constraints with the atomic type. You are even likely to see space
problems with the non-atomic storage: pointers can easily cost more
than compacted units.

There are numerous benefits to the atomic type. The units can be
re-assembled on output, the operators can be written to work on
non-normalized units and discard the incompatible ones, and the
chances that you screw up the unit integrity are nil.
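To illustrate what such an operator might look like, here is a rough
Python sketch of addition over (scale, numerator, denominator) triples.
The names Unit, compatible and add are hypothetical. The sketch assumes
the bit vectors are already reduced, and it raises an error on
incompatible operands; a real operator could just as well discard them
or return NULL.

```python
from collections import namedtuple

# scale relative to the base units, numerator bits, denominator bits
Unit = namedtuple('Unit', 'scale num den')

def compatible(a, b):
    """Two units are compatible when their bit vectors match exactly."""
    return a.num == b.num and a.den == b.den

def add(x, ux, y, uy):
    """Add y (in unit uy) to x (in unit ux); the result is in unit ux."""
    if not compatible(ux, uy):
        raise ValueError('incompatible units')
    # Convert y through the base units, then into ux.
    return x + y * uy.scale / ux.scale

mol_s_kg = Unit(1.0, 0b00010000, 0b00000110)
umol_min_mg = Unit(1.0 / 60.0, 0b00010000, 0b00000110)
kelvin = Unit(1.0, 0b00001000, 0b00000000)

# 1 mol/s/kg + 60 umol/min/mg, expressed in mol/s/kg
total = add(1.0, mol_s_kg, 60.0, umol_min_mg)
print(total)
```

Because the comparison is a pair of integer equality tests, the
compatibility check stays cheap no matter how the unit was originally
spelled.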

So, if that makes sense, I will be willing to funnel more energy into
this project, and I would appreciate any co-operation.

In the meantime, you might want to check out what I have done so far.

1. A Perl parser for the units of measurement that computes units as
   algebraic expressions. I have done it in Perl for ease of
   prototyping, but it is flex- and bison-generated and can be ported
   to C and included in the data type.

   Get it from
   http://wit.mcs.anl.gov/~selkovjr/Unit.tgz

   This is a regular perl extension; do a 

	perl Makefile.PL; make; make install

   type of thing, but first you need to build and install my version of
   bison, http://wit.mcs.anl.gov/~selkovjr/camel-1.24.tar.gz

   There is a demo script that you can run as follows

        perl browse.pl units

2. The postgres extension, seg, to which I was planning to add the
   units of measurement. It has its own use already, and it
   exemplifies the use of the yacc parser in an extension.

   Please see the README in 

	http://wit.mcs.anl.gov/~selkovjr/pg_extensions/

   as well as a brief description in 

	http://wit.mcs.anl.gov/EMP/seg-type.html

   and a running demo in 

	http://wit.mcs.anl.gov/EMP/indexing.html (search for seg)

Food for thought.

--Gene
