Re: Type Categories for User-Defined Types

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Type Categories for User-Defined Types
Date: 2008-07-29 20:12:09
Message-ID: 15217.1217362329@sss.pgh.pa.us
Lists: pgsql-hackers

"David E. Wheeler" <david(at)kineticode(dot)com> writes:
> On Jul 29, 2008, at 11:41, Tom Lane wrote:
>> and I notice that cases like
>> contrib_regression=# select 'a'::text || 'b'::citext;
>> ERROR: operator is not unique: text || citext
>> still don't work even though you put in an alias || operator.

> Damn, I didn't even notice that! Can that be fixed?

Given the present infrastructure I think the only way would be with
two more alias operators, text||citext and citext||text. But that way
madness lies.

>> Obviously the solution should involve a new column in pg_type and
>> a new type property in CREATE TYPE, but what should the representation
>> be? A full-on approach would make the type categories be real SQL
>> objects with their own system catalog and reference them by OID,
>> but I can't help thinking that that's overkill.

> It kinda sounds that way, yeah. What happens with DOMAINs, BTW? Do
> they need to write hacky functions like the above, or are they aware
> of their types because of the types from which they inherit?

Domains are treated as their base types in general. Elein has been
complaining about that for years ;-) ... but I think improving it
is unrelated to this issue.

>> Anyway, debating that is probably material for a separate thread ...

> Here you go! ;-)

After a quick look to verify my recollection: the only two things
that the system does with type categories are

extern CATEGORY TypeCategory(Oid type);

Returns the category a type belongs to.

extern bool IsPreferredType(CATEGORY category, Oid type);

Detects whether a type is a preferred type in its category (there can
be more than one preferred type in a category, and in fact the
traditional setup is that *every* user-defined type is a preferred
type in the USER_TYPE category).

The categories themselves are pretty much opaque values, except that
parse_func.c has special behavior to prefer STRING_TYPE when in doubt.

So this can fairly obviously be replaced by two new pg_type columns,
say "typcategory" and "typpreferred", where the latter is a bool.
Since the list of categories is pretty short and there's no obvious
reason to extend it a lot, I propose that we just represent typcategory
as a "char", using a mapping along the lines of

    BITSTRING_TYPE    b
    BOOLEAN_TYPE      B
    DATETIME_TYPE     D
    GENERIC_TYPE      P    (think "pseudotype")
    GEOMETRIC_TYPE    G
    INVALID_TYPE      \0   (not allowed in catalog anyway)
    NETWORK_TYPE      n
    NUMERIC_TYPE      N
    STRING_TYPE       S
    TIMESPAN_TYPE     T
    UNKNOWN_TYPE      u
    USER_TYPE         U

Users would be allowed to select any single ASCII character as the
"category" of a user-defined type, should they have a need to make their
own new category. Of course CREATE TYPE's default would be category = U and
preferred = true, for backward compatibility. We could put down
a rule that system-defined categories are always upper or lower case
letters (or even always upper, if we wanted to strain some of the
assignments a bit) so that it's clear what can be used for a
user-defined category.
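
To make the proposed representation concrete, here is a minimal sketch in plain C. It is not PostgreSQL source: MockTypeRow, mock_pg_type, mock_lookup, type_category, is_preferred_type and category_reserved_for_system are hypothetical names, the array stands in for a real catalog lookup, and the reserved-letters check assumes the tentative "system categories are always letters" rule were adopted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* One-character category codes, taken from the mapping proposed above. */
#define CAT_BOOLEAN  'B'
#define CAT_NUMERIC  'N'
#define CAT_STRING   'S'
#define CAT_USER     'U'
#define CAT_INVALID  '\0'    /* never stored in the catalog */

/* Hypothetical stand-in for the two proposed pg_type columns. */
typedef struct MockTypeRow
{
    Oid  oid;
    char typcategory;
    bool typpreferred;
} MockTypeRow;

/* Illustrative rows only; a real implementation would consult the catalog. */
static const MockTypeRow mock_pg_type[] = {
    {16,   CAT_BOOLEAN, true},    /* bool */
    {23,   CAT_NUMERIC, false},   /* int4 */
    {25,   CAT_STRING,  true},    /* text */
    {701,  CAT_NUMERIC, true},    /* float8 */
    {1043, CAT_STRING,  false},   /* varchar */
};

/* Linear search standing in for a catalog cache lookup. */
static const MockTypeRow *
mock_lookup(Oid type)
{
    size_t n = sizeof(mock_pg_type) / sizeof(mock_pg_type[0]);

    for (size_t i = 0; i < n; i++)
        if (mock_pg_type[i].oid == type)
            return &mock_pg_type[i];
    return NULL;
}

/* Returns the category code of a type, or CAT_INVALID if it is unknown. */
char
type_category(Oid type)
{
    const MockTypeRow *row = mock_lookup(type);

    if (row == NULL)
        return CAT_INVALID;
    assert(row->typcategory != CAT_INVALID);    /* not allowed in catalog */
    return row->typcategory;
}

/* True if the type belongs to the given category and is preferred in it. */
bool
is_preferred_type(char category, Oid type)
{
    const MockTypeRow *row = mock_lookup(type);

    return row != NULL && row->typcategory == category && row->typpreferred;
}

/* Assumed rule: ASCII letters are reserved for system-defined categories. */
bool
category_reserved_for_system(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
```

With these sample rows, text is the preferred member of category S while varchar is an ordinary member, and a user wanting a private category would pick a non-letter character such as '#'.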

It might possibly be worth making new categories for arrays, composites,
and enums; they're currently effectively USER_TYPE but that doesn't seem
quite right. Also, the rules for domains should likely be "same
category as base type, never a preferred type" instead of the current
behavior where they're user types. (I think the latter doesn't really
matter now, because we always smash a domain to its base type before
inquiring about categories anyway. But it might give Elein a bit more
room to maneuver with the functions-on-domains issue.)

A possible objection is that this will make TypeCategory and
IsPreferredType slower than before, since they'll involve a syscache
lookup instead of a simple switch statement. I don't think this will
be too bad though; all the paths they are used in are full of catalog
lookups anyway, so it's hard to credit that there would be much
percentage slowdown.

Thoughts?

regards, tom lane
