Re: Charset/collate support and function parameters

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: db(at)zigo(dot)dhs(dot)org
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Charset/collate support and function parameters
Date: 2004-10-31 09:47:52
Message-ID: 20041031.184752.77400289.t-ishii@sra.co.jp
Lists: pgsql-hackers

> On Sun, 31 Oct 2004, Tatsuo Ishii wrote:
>
> > I don't understand your point. Today we already use one length()
> > function for all charsets, as Tom has already pointed out.
>
> We have one length function that internally does different things
> depending on the charset. If you want to add a charset and implement the
> length function for that charset, how do you do that?

That's exactly the job of CREATE CHARSET. It will define a set of
functions that handle various tasks, including counting the length of a
string. One can find the character-length-counting function by looking
it up in the charset system catalog.
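
As a rough illustration of the lookup described above (not PostgreSQL code; the catalog dictionary, `create_charset`, and the two length helpers are hypothetical names invented for this sketch), a single `length()` entry point can delegate to whichever function was registered for the charset:

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CharsetEntry:
    """One hypothetical catalog row: the functions registered for a charset."""
    name: str
    char_length: Callable[[bytes], int]


def _single_byte_length(data: bytes) -> int:
    # In a single-byte charset every byte is one character.
    return len(data)


def _utf8_length(data: bytes) -> int:
    # Count every byte that is not a UTF-8 continuation byte (10xxxxxx).
    return sum(1 for b in data if (b & 0xC0) != 0x80)


# Stand-in for the charset system catalog (an assumption of this sketch).
CHARSET_CATALOG: Dict[str, CharsetEntry] = {}


def create_charset(name: str, char_length: Callable[[bytes], int]) -> None:
    """Loose analogue of CREATE CHARSET: register the handler functions."""
    CHARSET_CATALOG[name] = CharsetEntry(name, char_length)


def length(data: bytes, charset: str) -> int:
    """Single length() entry point; the per-charset work is looked up."""
    return CHARSET_CATALOG[charset].char_length(data)


create_charset("LATIN1", _single_byte_length)
create_charset("UTF8", _utf8_length)
```

Adding a charset then means registering one more entry; `length()` itself does not change.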

> > The question with your approach is how you would handle the
> > coercibility property. It's a transient, in-memory property and thus
> > will not fit into the function declaration. No?
>
> No, it's not part of the function signature. Coercibility is a way to
> decide what collation to use. Depending on where a value comes from it
> can have different coercibility, and when one does operations that
> involve different collations, the coercibility decides how ambiguities
> are resolved (which value will be coerced).

I see.
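
To make the quoted description concrete, here is a minimal sketch of coercibility as a tie-breaker between collations. It is a simplification under stated assumptions: the three levels and their ordering are loosely modeled on the SQL standard, and `Text`, `Coercibility`, and `resolve_collation` are hypothetical names, not an existing API.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Coercibility(IntEnum):
    # Lower number = stronger claim on the collation (assumed ordering).
    EXPLICIT = 0   # e.g. the value carries an explicit COLLATE clause
    IMPLICIT = 1   # e.g. the value comes from a column
    COERCIBLE = 2  # e.g. a literal, which yields to the other operand


@dataclass(frozen=True)
class Text:
    value: str
    collation: Optional[str]
    coercibility: Coercibility  # in-memory only, not part of any signature


def resolve_collation(a: Text, b: Text) -> Optional[str]:
    """Pick the collation for an operation that involves both a and b."""
    if a.collation == b.collation:
        return a.collation
    if a.coercibility != b.coercibility:
        # The operand with the stronger coercibility wins; the other
        # operand is the one that gets coerced.
        stronger = a if a.coercibility < b.coercibility else b
        return stronger.collation
    if a.coercibility is Coercibility.EXPLICIT:
        raise ValueError("conflicting explicit collations")
    # Same strength, different collations: the ambiguity stays unresolved.
    return None
```

The coercibility travels with the value rather than with the function declaration, which is the point made in the quoted reply.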

> Whether one wants function signatures with charsets in them and where
> the charset information is stored are separate questions; they don't
> have to be opposites of each other.
>
> I've been thinking that one can avoid storing the charset in the value
> by handling types like that. I even thought that there was no way that
> anyone in the pg project would ever accept enlarging the string values,
> obviously a wrong assumption :-)
>
> Even when one does store the charset in the value, one might want
> function overloading to depend on the charset of the string (when
> specified).
>
> It's the same opinion I hold when I declare a function
>
> foo (x varchar(5))
> begin
> ...
> end
>
> then I expect to get strings that are at most 5 chars long. Why do we
> allow the (5) if it's just dropped? If I define a column as varchar(5)
> then the column values really are at most 5 chars long, but it does not
> work like that for functions.
>
> Let us simply agree that we do store the charset/collation/... in the
> (in-memory) values. On disk we don't want that, since the column type
> determines it completely. Do we agree on that?

I agree, except that shared system catalogs (and probably some non-shared
system catalogs such as pg_class) need the charset on disk.

Personally, I don't see any value in using non-English user names,
database names, table names and so on. However, some users love to use
them :-)
--
Tatsuo Ishii
