Re: Enums patch v2

From: Tom Dunstan <pgsql(at)tomd(dot)cc>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2006-12-20 01:23:02
Message-ID: 45889076.5020507@tomd.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Heikki Linnakangas wrote:
> I'm sorry I missed the original discussions, but I have to ask: Why do
> we want enums in core? The only potential advantage I can see over using
> a look-up table and FK references is performance.

Well, there are a few things. Sometimes its tidiness, sometimes
integrity... I've seen more than one system with hundreds of these
things, and they've either gone down the table-per-enum solution, with
LOTS of extra tables whose values never change, or the EAV solution,
with one or two globally referenced tables to which everything in your
system has as a FK, and an integrity check in a trigger if you're very
lucky. Yuck on both accounts. Enums hit a sweet spot in the middle and
provide data integrity and performance for non-changing values.

> 1. What's the point of having comparison operators for enums? For most
> use cases, there's no natural ordering of enum values.

Well, there are a number of cases where ordering IS important, and
indeed, enums provide a way to do it easily where many of the
alternative solutions do not. It's one of the key benefits.

> 2. The comparison routine compares oids, right? If the oids wrap around
> when the enum values are created, the ordering isn't what the user expects.

As has been pointed out by others quicker on the draw than me, I do sort
the OIDs at enum creation time, for exactly this reason.

> 3. 4 bytes per value is wasteful if you're storing simple status codes
> etc. Especially if you're doing this for performance, let's do no harm
> by wasting space. One byte seems enough for the typical use cases. I'd
> even argue that having a high upper limit on the number of enum values
> encourages misuse of the feature.

I'd really love to have these fit into a 1 or 2 byte value on disk, but
AFAIK there's simply no way to do it currently in postgresql. If we ever
move to a solution where on-disk representation is noticeably different
from in-memory representation, then it might be doable. If that does
happen, we might benefit from other improvements such as being able to
order columns in a tuple on disk so as to minimize alignment padding,
not having to store a composite type's oid, etc. Until that happens,
though, if it ever does, this is probably the tightest on-disk
representation we're likely to get, unless we're happy to impose some
pretty severe restrictions, like 8 bits per enum, and only 256 enums in
total (for a 2 byte total). I was already shot down trying to make
similar restrictions when I first brought it up. :) The OID solution
seems to offend the least. :)

We did discuss this somewhat earlier, and I'm happy to take alternative
suggestions, but AFAIK this is about as good as it's going to get right now.

Cheers

Tom

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2006-12-20 01:25:50 Re: Autovacuum Improvements
Previous Message Hiroshi Saito 2006-12-20 01:20:25 Re: pg_restore fails with a custom backup file

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Dunstan 2006-12-20 01:39:58 Re: [HACKERS] Enums patch v2
Previous Message Bruce Momjian 2006-12-19 17:00:05 Re: Load distributed checkpoint patch