Re: TODO: Fix CREATE CAST on DOMAINs

From: Mark Dilger <pgsql(at)markdilger(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject: Re: TODO: Fix CREATE CAST on DOMAINs
Date: 2006-09-21 16:37:09
Message-ID: 4512BFB5.9010501@markdilger.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Martijn van Oosterhout wrote:
> On Wed, Sep 20, 2006 at 10:56:08AM -0700, Mark Dilger wrote:
>> If the system chooses cast chains based on a breadth-first search, then the
>> existing int2 -> int8 cast would be chosen over an int2 -> int4 -> int8
>> chain, or an int2 -> int3 -> int4 -> int8 chain, or in fact any chain at
>> all, because the int2 -> int8 cast is the shortest.
>
> But we're not talking about a search here, we don't always know where
> the endpoint is. Imagine you have the following three functions:
>
> abs(int8)
> abs(float4)
> abs(numeric)
>
> And you have an int2. Which is the best cast to use? What's the answer
> if you have a float8? What if it's an unknown type text string?
>
> Now, consider that functions can have up to 32 arguments and that this
> resolution might have to be applied to each argument and you find that
> searching is going to get very expensive very quickly.
>
> The current system of requiring only a single step is at least
> predictable. If you have the choice between:
>
> - first argument matches, second needs three "safe" conversions, and
> - first argument need one "unsafe" conversion, second matches exactly
>
> Which is cheaper?
>
> To make this manageable you have to keep the number of types you can
> cast to small, or you'll get lost in the possibilites. Adding just a
> single step domain to base type conversion seems pretty safe, but
> anything more is going to be hard.
>
> Have a nice day,

The searching never needs to be done at runtime. It should be computable at
cast creation time. A new cast creates a potential bridge between any two types
in the system. Using a shortest path algorithm, the best chain (if any exists)
from one type to another can be computed and pre-compiled, right?

So, assume the following already exists:

Types A,B,C, fully connected with casts A->B, B->A, A->C, C->A, B->C, C->B, with
some marked IMPLICIT, some marked EXPLICIT, and some marked SAFE.

Types X,Y,Z, also fully connected with casts, as above.

Then assume someone comes along and creates a new type M with conversions A->M,
M->A, X->M, and M->X. At the time that type and those casts are added to the
system, the system could calculate any additional casts to/from B, C, Y, and Z.
A simple implementation (but maybe not optimal) would be for the system to
autogenerate code like:

CREATE FUNCTION cast_M_Y (arg M) RETURNS Y AS $$
SELECT arg::X::Y;
$$ LANGUAGE SQL;
CREATE CAST (M AS Y) WITH FUNCTION cast_M_Y(M) [ AS ASSIGNMENT | AS IMPLICIT ]

And then load that function and cast. The only real trick seems to be
determining the rules for which cast chain gets used within that autogenerated
function, and whether the generated cast is IMPLICIT, EXPLICIT, or ASSIGNMENT.

Looking over what I have just written, another idea pops up. To avoid having
the system decide which casts are reasonable, you could extend the syntax and
allow an easy shorthand for the user. Something like:

CREATE CAST (M AS A)
WITH FUNCTION cast_M_A
AS ASSIGNMENT
PROPOGATES TO B AS ASSIGNMENT,
PROPOGATES TO C AS ASSIGNMENT;

CREATE CAST (A AS M)
WITH FUNCTION cast_A_M
AS ASSIGNMENT
PROPOGATES FROM B,
PROPOGATES FROM C;

And then the casts from M->B, M->C, B->M, and C->M would all be added to the system.

Thoughts?

mark

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-09-21 16:57:34 Cause of moving-target FSM space-needed reports
Previous Message Tom Lane 2006-09-21 16:17:28 Re: [HACKERS] large object regression tests