Re: Type Categories for User-Defined Types

From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Type Categories for User-Defined Types
Date: 2008-07-29 20:24:58
Message-ID: 36F67A2D-BC51-4919-A46F-6C35CE1415C0@kineticode.com
Lists: pgsql-hackers

On Jul 29, 2008, at 13:12, Tom Lane wrote:

>> Damn, I didn't even notice that! Can that be fixed?
>
> Given the present infrastructure I think the only way would be with
> two more alias operators, text||citext and citext||text. But that way
> madness lies.

I suppose, then, that you're saying that there are lots of other
functions for which this sort of thing would need to be done? Because
two more aliases for this one operator is no big deal, AFAIC.

>> It kinda sounds that way, yeah. What happens with DOMAINs, BTW? Do
>> they need to write hacky functions like the above, or are they aware
>> of their types because of the types from which they inherit?
>
> Domains are treated as their base types in general. Elein has been
> complaining about that for years ;-) ... but I think improving it
> is unrelated to this issue.

I see.

> After a quick look to verify my recollection: the only two things
> that the system does with type categories are
>
> extern CATEGORY TypeCategory(Oid type);
>
> Returns the category a type belongs to.
>
> extern bool IsPreferredType(CATEGORY category, Oid type);
>
> Detects whether a type is a preferred type in its category (there can
> be more than one preferred type in a category, and in fact the
> traditional setup is that *every* user-defined type is a preferred
> type in the USER_TYPE category).

Perhaps tangential: What does it mean for a type to be "preferred"?

> The categories themselves are pretty much opaque values, except that
> parse_func.c has special behavior to prefer STRING_TYPE when in doubt.
>
> So this can fairly obviously be replaced by two new pg_type columns,
> say "typcategory" and "typpreferred", where the latter is a bool.
> Since the list of categories is pretty short and there's no obvious
> reason to extend it a lot, I propose that we just represent
> typcategory as a "char", using a mapping along the lines of
>
>     BITSTRING_TYPE    b
>     BOOLEAN_TYPE      B
>     DATETIME_TYPE     D
>     GENERIC_TYPE      P   (think "pseudotype")
>     GEOMETRIC_TYPE    G
>     INVALID_TYPE      \0  (not allowed in catalog anyway)
>     NETWORK_TYPE      n
>     NUMERIC_TYPE      N
>     STRING_TYPE       S
>     TIMESPAN_TYPE     T
>     UNKNOWN_TYPE      u
>     USER_TYPE         U
>
> Users would be allowed to select any single ASCII character as the
> "category" of a user-defined type, should they have a need to make
> their own new category.

Wouldn't this then limit them to 52 possible categories? Does that
matter? Given your suggestion, I'm assuming that a single character is
somehow more efficient than an enum, yes?

> Of course CREATE TYPE's default is category = U and
> preferred = true for backward compatibility reasons. We could put down
> a rule that system-defined categories are always upper or lower case
> letters (or even always upper, if we wanted to strain some of the
> assignments a bit) so that it's clear what can be used for a
> user-defined category.

Makes sense.
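
As a rough illustration of the rule floated in the quoted text (letters reserved for system-defined categories), here is a small C sketch that checks which characters would remain available for user-defined categories. All function names are hypothetical and not part of PostgreSQL. It also assumes that only printable ASCII (0x20 through 0x7E) is of practical interest, which the proposal itself does not say.

```c
#include <stdbool.h>

/* Hypothetical helper: true if c is an ASCII letter (A-Z or a-z). */
bool is_ascii_letter(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Hypothetical helper: true if c is printable 7-bit ASCII (0x20..0x7E). */
bool is_printable_ascii(char c)
{
    return c >= 0x20 && c <= 0x7E;
}

/*
 * Assumed rule: letters are reserved for system-defined categories, so a
 * user-defined category may use any other printable ASCII character.
 */
bool is_user_category_char(char c)
{
    return is_printable_ascii(c) && !is_ascii_letter(c);
}

/* Count the printable ASCII characters left over for user categories. */
int count_user_category_chars(void)
{
    int count = 0;

    for (int c = 0x20; c <= 0x7E; c++)
    {
        if (is_user_category_char((char) c))
            count++;
    }
    return count;
}
```

Under these assumptions the 52 letters would belong to the system, and count_user_category_chars() gives the number of remaining printable characters a user could pick from.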

> It might possibly be worth making new categories for arrays, composites,
> and enums; they're currently effectively USER_TYPE but that doesn't seem
> quite right. Also, the rules for domains should likely be "same
> category as base type, never a preferred type" instead of the current
> behavior where they're user types. (I think the latter doesn't really
> matter now, because we always smash a domain to its base type before
> inquiring about categories anyway. But it might give Elein a bit more
> room to maneuver with the functions-on-domains issue.)

Yes, this all sounds like it'd be an important improvement.

> A possible objection is that this will make TypeCategory and
> IsPreferredType slower than before, since they'll involve a syscache
> lookup instead of a simple switch statement. I don't think this will
> be too bad though; all the paths they are used in are full of catalog
> lookups anyway, so it's hard to credit that there would be much
> percentage slowdown.
>
> Thoughts?

Obviously I don't know much about the internals, but your explanation
here seems very clear to me. I like it. +1
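
To check my own understanding, here is a minimal C sketch of the two functions driven by per-type data rather than a switch statement. It is not PostgreSQL code: the TypeRow struct, the in-memory table, and the lower-case function names are hypothetical stand-ins for pg_type and a syscache lookup, and the two 9000x OIDs are made up. The domain row follows the suggested rule of "same category as base type, never a preferred type".

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;       /* stand-in for PostgreSQL's Oid */

/* Hypothetical stand-in for the two proposed pg_type columns. */
typedef struct TypeRow
{
    Oid  oid;
    char typcategory;           /* one-character category code */
    bool typpreferred;          /* preferred within its category? */
} TypeRow;

/* Illustrative rows only; a real lookup would consult the catalog. */
static const TypeRow type_table[] = {
    {25,    'S', true},         /* text */
    {23,    'N', false},        /* int4 */
    {701,   'N', true},         /* float8 */
    {90001, 'U', true},         /* made-up user type, CREATE TYPE defaults */
    {90002, 'S', false}         /* made-up domain over text */
};

/* Linear search standing in for a syscache lookup; NULL if not found. */
static const TypeRow *lookup_type(Oid type)
{
    size_t n = sizeof(type_table) / sizeof(type_table[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (type_table[i].oid == type)
            return &type_table[i];
    }
    return NULL;
}

/* Returns the category of a type, or '\0' (invalid) if unknown. */
char type_category(Oid type)
{
    const TypeRow *row = lookup_type(type);

    return row ? row->typcategory : '\0';
}

/* True if the type belongs to the category and is marked preferred. */
bool is_preferred_type(char category, Oid type)
{
    const TypeRow *row = lookup_type(type);

    return row != NULL && row->typcategory == category && row->typpreferred;
}
```

The point of the sketch is only that both answers come from stored data, so a user-defined type could land in any category without touching the C code.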

Thank you, Tom.

Best,

David
