Re: Enums patch v2

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Tom Dunstan <pgsql(at)tomd(dot)cc>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2006-12-19 14:34:27
Message-ID: 4587F873.7070202@dunslane.net
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
>
>> 1. What's the point of having comparison operators for enums? For most
>> use cases, there's no natural ordering of enum values.
>>
>
> If you would like to be able to index enum columns, or even GROUP BY one,
> you need those; whether the ordering is arbitrary or not is irrelevant.
>

Heikki's assertion is wrong in any case. The enumeration definition
defines the ordering, and I can think of plenty of use cases where it
does matter. We do not use an arbitrary ordering. An enum type is an
*ordered* set of string labels. Without this the feature would be close
to worthless. But if a particular application doesn't need them ordered,
it need not use the comparison operators. Leaving aside the uses for
GROUP BY and indexes, I would ask what the justification would be for
leaving off comparison operators?

>
>> 2. The comparison routine compares oids, right? If the oids wrap around
>> when the enum values are created, the ordering isn't what the user expects.
>>
>
> This is a fair point --- it'd be better if the ordering were not
> dependent on chance OID assignments. Not sure what we are willing
> to pay to have that though.
>

This is a non-issue. The code sorts the oids before assigning them:

    /* allocate oids */
    oids = (Oid *) palloc(sizeof(Oid) * n);
    for (i = 0; i < n; i++)
    {
        oids[i] = GetNewOid(pg_enum);
    }
    /* wraparound is unlikely, but just to be safe... */
    qsort(oids, n, sizeof(Oid), oid_cmp);
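
To see the effect outside the backend, here is a rough standalone sketch. It is not the patch code: fake_get_new_oid() is a hypothetical stand-in for GetNewOid() that is rigged to wrap around, assign_enum_oids() is a made-up helper name, and the 16384 restart point is just an assumption for the demo. The point is only that after the sort, label i always gets a smaller oid than label i+1, whatever order the allocator handed them out in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t Oid;

/* Three-way comparison of two oids, usable as a qsort() callback. */
static int oid_cmp(const void *a, const void *b)
{
    Oid x = *(const Oid *) a;
    Oid y = *(const Oid *) b;

    return (x > y) - (x < y);
}

/* Hypothetical allocator: starts just below the 32-bit limit so it wraps. */
static Oid next_oid = 4294967294u;

static Oid fake_get_new_oid(void)
{
    Oid result = next_oid++;

    /* After wraparound, restart at an assumed first "user" oid. */
    if (next_oid < 16384u)
        next_oid = 16384u;
    return result;
}

/*
 * Allocate n oids, then sort them, so that oids[i] (the oid given to the
 * i-th label in declaration order) is always less than oids[i + 1].
 */
static void assign_enum_oids(Oid *oids, size_t n, Oid (*alloc)(void))
{
    assert(oids != NULL && alloc != NULL);

    for (size_t i = 0; i < n; i++)
        oids[i] = alloc();

    qsort(oids, n, sizeof(Oid), oid_cmp);
}
```

With four labels, this allocator hands out two oids near the top of the range and then two small ones. After the sort, the first two labels get the small oids and the last two get the large ones, so comparing oids still follows declaration order.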

>
>> 3. 4 bytes per value is wasteful if you're storing simple status codes
>> etc.
>>
>
> I've forgotten exactly which design Tom is proposing to implement here,
> but at least one of the contenders involved storing an OID that would be
> unique across all enum types. 1 byte is certainly not enough for that
> and even 2 bytes would be pretty marginal. I'm unconvinced by arguments
> about 2 bytes being so much better than 4 anyway --- in the majority of
> real table layouts, the hoped-for savings would disappear into alignment
> padding.
>
>
>

Globally unique is the design adopted, after much on-list discussion.
That was a way of getting it *down* to 4 bytes. The problem is that the
output routines need enough info from just the internal representation
of the type value to do their work. The original suggestion was for 8
bytes - type oid + offset in value set. Having them globally unique lets
us get down to 4.
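
A toy illustration of the difference, with all names and oid values invented for the example (EnumRow and enum_catalog only stand in for a catalog lookup; they are not the real structures): with the 8-byte form the value has to carry its type oid around, whereas a globally unique oid is enough on its own to find the label.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;

/* The earlier 8-byte idea: type oid plus offset into that type's value set. */
typedef struct WideEnumValue
{
    Oid      type_oid;
    uint32_t offset;
} WideEnumValue;

/* Hypothetical stand-in for one catalog row describing an enum label. */
typedef struct EnumRow
{
    Oid         value_oid;  /* unique across all enum types */
    Oid         type_oid;   /* which enum type the label belongs to */
    const char *label;
} EnumRow;

/* Two made-up enum types (1000 and 2000) sharing one oid space. */
static const EnumRow enum_catalog[] = {
    {16384, 1000, "sad"},
    {16385, 1000, "ok"},
    {16386, 1000, "happy"},
    {16390, 2000, "red"},
    {16391, 2000, "green"},
};

/* Output sketch: the 4-byte value alone identifies the row, no type needed. */
static const char *enum_label_sketch(Oid value_oid)
{
    size_t n = sizeof(enum_catalog) / sizeof(enum_catalog[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (enum_catalog[i].value_oid == value_oid)
            return enum_catalog[i].label;
    }
    return NULL;                /* unknown value */
}

/* Size check for the two representations discussed above. */
static void check_representation_sizes(void)
{
    assert(sizeof(WideEnumValue) == 8);
    assert(sizeof(Oid) == 4);
}
```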

As for efficiency, I agree with what Tom says about alignment and
padding dissolving away any perceived advantage in most cases. If we
ever get around to optimising record layout we could revisit it.
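
The padding effect is easy to see with plain C structs. This is only an analogy: tuple layout in the backend is not C struct layout, and the struct and function names below are invented for the example. On typical platforms, where a 64-bit integer is aligned to at least 4 bytes, a 2-byte status field followed by a 64-bit field takes exactly as much room as a 4-byte one would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A row with a hypothetical 2-byte status code followed by a wide column. */
typedef struct RowSmallStatus
{
    int16_t status;
    int64_t payload;
} RowSmallStatus;

/* The same row with a 4-byte (oid-sized) status code. */
typedef struct RowOidStatus
{
    uint32_t status;
    int64_t  payload;
} RowOidStatus;

/* Bytes actually saved by the narrower status field once padding is applied. */
static size_t bytes_saved_by_small_status(void)
{
    /* Assumes the wide column lands at the same offset in both layouts. */
    assert(offsetof(RowSmallStatus, payload) == offsetof(RowOidStatus, payload));

    return sizeof(RowOidStatus) - sizeof(RowSmallStatus);
}
```

Where that assumption holds, the function returns zero: the two bytes saved are simply replaced by padding.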

cheers

andrew
