Re: Type Categories for User-Defined Types

From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Type Categories for User-Defined Types
Date: 2008-07-29 21:15:54
Message-ID: 055450C5-1386-45F3-B2D3-8FD72E781B0A@kineticode.com
Lists: pgsql-hackers

On Jul 29, 2008, at 14:00, Tom Lane wrote:

> Well, a rough estimate of the places where implicit coercion to text
> might be relevant to resolving ambiguity is
>
> select proname from pg_proc
> where 'text'::regtype = any(proargtypes)
> group by proname having count(*)>1;
>
> select oprname from pg_operator
> where oprleft='text'::regtype or oprright='text'::regtype
> group by oprname having count(*)> 1;
>
> I count 37 functions and 10 operators as of CVS HEAD. Perhaps not all
> would need to be fixed in practical use, but if you wanted seamless
> integration of citext it's quite possible that you'd need alias
> functions/operators (maybe more than one) in each of those cases.

Well, there are already citext aliases for all of those operators, for
this very reason. There are citext aliases for a bunch of the
functions, too (ltrim(), substring(), etc.), so I wouldn't worry about
adding more. I've added more of them since I last sent a patch, mainly
for the regexp functions, replace(), strpos(), etc. I'd guess that I'm
about half-way there already, and there probably are a few I wouldn't
bother with (like timezone()).

Anyway, would this issue then go away once type categories for
user-defined types were added and citext was specified as TYPE = 'S'?

> [ squint... ] Actually, this is an underestimate since these queries
> aren't finding cases like quote_literal, where there is ambiguity but
> only one of the alternatives takes 'text'. I'm too lazy to work out a
> better query though.

Thanks.

>> Perhaps tangential: What does it mean for a type to be "preferred"?
>
> See the ambiguous-function resolution rules in chapter 10 of the fine
> manual ...

I see this:

> C. Run through all candidates and keep those that accept preferred
> types (of the input data type's type category) at the most positions
> where type conversion will be required. Keep all candidates if none
> accept preferred types. If only one candidate remains, use it; else
> continue to the next step.

That doesn't exactly explain what "preferred" means, just that it
seems to prioritize the resolution of a function a bit. Which, I
guess, is the point.
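
To make the quoted step C concrete, here is a minimal sketch of that one filtering step as a toy model in Python. It is not the actual parser code: the function name `filter_by_preferred`, the dictionaries, and the idea that citext sits in category 'S' are all assumptions for illustration, and the handling of unknown-type literals is left out.

```python
def filter_by_preferred(candidates, input_types, category_of, preferred):
    """Toy model of step C: keep the candidates that accept preferred types
    at the most positions where a type conversion is required.

    candidates  -- list of tuples of parameter type names (hypothetical)
    input_types -- tuple of the actual argument type names
    category_of -- maps a type name to its one-character category
    preferred   -- maps a category to the set of its preferred type names
    """
    def score(candidate):
        count = 0
        for param_type, arg_type in zip(candidate, input_types):
            needs_conversion = param_type != arg_type
            # Preferred status is judged within the *input* type's category.
            prefs = preferred.get(category_of[arg_type], set())
            if needs_conversion and param_type in prefs:
                count += 1
        return count

    scores = [score(c) for c in candidates]
    best = max(scores, default=0)
    if best == 0:
        # "Keep all candidates if none accept preferred types."
        return list(candidates)
    return [c for c, s in zip(candidates, scores) if s == best]


# Assumption: citext is placed in the string category 'S', where text is preferred.
category_of = {"citext": "S", "text": "S", "varchar": "S", "bpchar": "S"}
preferred = {"S": {"text"}}

# Two hypothetical one-argument candidates, called with a citext argument.
survivors = filter_by_preferred(
    [("text",), ("varchar",)], ("citext",), category_of, preferred
)
no_preferred = filter_by_preferred(
    [("varchar",), ("bpchar",)], ("citext",), category_of, preferred
)
```

In this model the candidate taking text wins over the one taking varchar, which is the practical effect of a type being "preferred": it breaks ties among candidates that all need a conversion.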

>> Wouldn't this then limit them to 52 possible categories?
>
> It'd be either 94 - 26 or 94 - 26 - 26 depending on what the policy is
> about lower-case letters (and assuming they wanted to stay away from
> control characters, which seems like a good idea). Considering the
> world supply of categories up to now has been about ten, it's hard
> to imagine that this is really a limitation.

Okay.
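
The arithmetic in the quoted paragraph can be checked with a short Python sketch. It assumes that "94" means the printable ASCII characters excluding space and control characters, and that each "26" is one case of the Latin alphabet being set aside; the variable names are illustrative only.

```python
import string

# Printable ASCII without space or control characters: '!' (0x21) to '~' (0x7E).
printable = [chr(code) for code in range(0x21, 0x7F)]

upper = set(string.ascii_uppercase)
lower = set(string.ascii_lowercase)

total = len(printable)

# Policy 1: only upper-case letters are set aside (94 - 26).
without_upper = sum(1 for ch in printable if ch not in upper)

# Policy 2: both upper- and lower-case letters are set aside (94 - 26 - 26).
without_letters = sum(1 for ch in printable if ch not in upper and ch not in lower)

print(total, without_upper, without_letters)
```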

>> Does that matter? Given your suggestion, I'm assuming that a single
>> character is somehow more efficient than an enum, yes?
>
> Marginally so; but an enum wouldn't help anyway unless we are prepared
> to invent ALTER ENUM. We'd have to go to an actual new system catalog
> if we wanted something noticeably better than the poor-mans-enum
> approach, and as I mentioned earlier, that just seems like overkill.
> (Besides, we could always add it later if there's suddenly a gold rush
> for categories. The only thing we'd be locking ourselves into, if
> we view this as a stopgap implementation, is the need to accept
> single-character abbreviations in future, even after the system knows
> actual names for categories.)

Makes sense.

Thanks,

David
