Re: extensible enum types

From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: extensible enum types
Date: 2010-06-19 20:14:29
Message-ID: AANLkTimFxo6594Fs_AlodJpqP0is0XS71uGsqBSiEJWf@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 18, 2010 at 12:59 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

>
>
> David E. Wheeler wrote:
>
>> On Jun 18, 2010, at 9:34 AM, Andrew Dunstan wrote:
>>
>>
>>
>>> I'd be perfectly happy to hear a reasonable alternative. Assuming we use
>>> some integer representation, given two labels represented by n and n+1, we
>>> can't add a label between them without rewriting the tables that use the
>>> type, whether it's my representation scheme or some other. Maybe we could
>>> have a FORCE option which would rewrite if necessary.
>>>
>>>
>>
>> People would likely always use it.
>>
>> Why not use a decimal number?
>>
>>
>>
>>
>
> You are just bumping up the storage cost. Part of the attraction of enums
> is their efficiency.
>
>
This is probably much the same as the decimal suggestion above, but we could
use a floating-point data type instead.

It would allow injection of a new label at any stage.

CREATE leads to

Label1 -> 1.0
Label2 -> 2.0
Label3 -> 3.0

ALTER ... ADD Label4 AFTER Label2; leads to
Label1 -> 1.0
Label2 -> 2.0
Label4 -> 2.5
Label3 -> 3.0

ALTER ... ADD Label5 AFTER Label2; leads to
Label1 -> 1.0
Label2 -> 2.0
Label5 -> 2.25
Label4 -> 2.5
Label3 -> 3.0
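
The assignments above follow a simple midpoint rule. A minimal sketch of
that rule, for illustration only: the function name add_label_after and the
list-of-pairs representation are hypothetical, and appending after the last
label by adding 1.0 is an assumption not spelled out above.

```python
def add_label_after(labels, new_label, after_label):
    """Insert new_label right after after_label.

    labels is a list of (name, sort_value) pairs kept in sort order.
    The new sort value is the midpoint between the two neighbours.
    """
    names = [name for name, _ in labels]
    i = names.index(after_label)
    lo = labels[i][1]
    if i + 1 < len(labels):
        # There is a following label: take the midpoint of the two values.
        new_value = (lo + labels[i + 1][1]) / 2.0
    else:
        # Assumption: appending at the end just continues the 1.0 spacing.
        new_value = lo + 1.0
    labels.insert(i + 1, (new_label, new_value))
    return new_value


# CREATE leads to 1.0, 2.0, 3.0
labels = [("Label1", 1.0), ("Label2", 2.0), ("Label3", 3.0)]
v4 = add_label_after(labels, "Label4", "Label2")
v5 = add_label_after(labels, "Label5", "Label2")
for name, value in labels:
    print(name, "->", value)
```

The existing labels never change value, which is why no table rewrite is
needed for these two ALTERs.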

Since floating-point implementation is architecture dependent, the ALTER
command should check that the injected value does not equate to either value
around it (e.g. after injecting 2.25 between 2.0 and 2.5, neither
(2.25 == 2.0) nor (2.25 == 2.5) may be true); if it does, then throw an
error and ask the user to use the rewrite-the-table version of the command.
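
A rough sketch of that check, assuming the sort value is stored as an
IEEE 754 single-precision float. The helper names (to_float32, midpoint32,
insertions_until_exhausted) are made up for this illustration, and rounding
through struct's "<f" format is only a stand-in for what a 4-byte float
column would do.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def midpoint32(lo, hi):
    """Return a 32-bit value strictly between lo and hi, or raise."""
    mid = to_float32((lo + hi) / 2.0)
    if mid == lo or mid == hi:
        # The injected value collides with a neighbour: this is the case
        # where ALTER would have to refuse and ask for a table rewrite.
        raise ValueError("no representable value between neighbours")
    return mid


def insertions_until_exhausted(lo, hi):
    """Count how many times a label can be injected right after lo."""
    count = 0
    while True:
        try:
            hi = midpoint32(lo, hi)
        except ValueError:
            return count
        count += 1


first = midpoint32(2.0, 3.0)
second = midpoint32(2.0, first)
worst_case = insertions_until_exhausted(2.0, 3.0)
print(first, second, worst_case)
```

Repeatedly injecting after the same label is the worst case, since the gap
halves every time; between 2.0 and 3.0 a single-precision value leaves room
for 22 such injections before the check fires.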

And since it is still 32 bits, and comparisons are done by the machine,
performance should be acceptably close to the current integer comparisons,
and much faster than the cache lookups etc. being proposed.

This is very similar to Andrew's original suggestion of splitting 32 bits
into 16+16, but managed by the machine, hence no complicated comparison
algos needed on our part. Also, since this is all transparent to the SQL
interface, our dump-reload cycle, Slony replication, etc. should not be
affected either.

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device
