Re: Re: Big 7.1 open items

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: lockhart(at)alumni(dot)caltech(dot)edu
Cc: t-ishii(at)sra(dot)co(dot)jp, robinson(at)netrinsics(dot)com, pgsql-hackers(at)hub(dot)org
Subject: Re: Re: Big 7.1 open items
Date: 2000-06-16 14:03:57
Message-ID: 20000616230357D.t-ishii@sra.co.jp
Lists: pgsql-hackers

> > o Don't accept character sequences that are not valid in their
> > charset (signaling ERROR seems appropriate IMHO)
> > o Make PostgreSQL more multibyte aware (for example, TRIM function and
> > NAME data type)
> > o Regard n of CHAR(n)/VARCHAR(n) as the number of letters, rather than
> > the number of bytes
>
> All good, and important features when we are done.

Glad to hear that.
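
To make the first and third items concrete, here is a minimal sketch in C.
It assumes UTF-8 as the encoding, and the function names (utf8_seq_len,
utf8_char_count, fits_char_n) are hypothetical, not existing PostgreSQL
routines. It rejects invalid byte sequences and treats n of CHAR(n) as a
number of characters rather than a number of bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Length in bytes of the UTF-8 sequence starting at s, with `avail` bytes
 * readable. Returns 0 if the sequence is invalid or truncated.
 * Overlong forms, surrogates and values above U+10FFFF are rejected.
 */
static size_t utf8_seq_len(const unsigned char *s, size_t avail)
{
    size_t len, i;
    unsigned char c;

    if (avail == 0)
        return 0;
    c = s[0];
    if (c < 0x80)
        return 1;                       /* plain ASCII */
    else if (c >= 0xC2 && c <= 0xDF)
        len = 2;
    else if (c >= 0xE0 && c <= 0xEF)
        len = 3;
    else if (c >= 0xF0 && c <= 0xF4)
        len = 4;
    else
        return 0;                       /* stray continuation or bad lead byte */

    if (avail < len)
        return 0;                       /* truncated sequence */
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */

    if (c == 0xE0 && s[1] < 0xA0) return 0;   /* overlong 3-byte form */
    if (c == 0xED && s[1] > 0x9F) return 0;   /* UTF-16 surrogate range */
    if (c == 0xF0 && s[1] < 0x90) return 0;   /* overlong 4-byte form */
    if (c == 0xF4 && s[1] > 0x8F) return 0;   /* above U+10FFFF */
    return len;
}

/* Number of characters in str, or -1 if it is not valid UTF-8. */
long utf8_char_count(const char *str, size_t nbytes)
{
    const unsigned char *s = (const unsigned char *) str;
    long nchars = 0;
    size_t pos = 0;

    assert(str != NULL);
    while (pos < nbytes)
    {
        size_t len = utf8_seq_len(s + pos, nbytes - pos);

        if (len == 0)
            return -1;                  /* caller would signal ERROR here */
        pos += len;
        nchars++;
    }
    return nchars;
}

/* True if str is valid and fits CHAR(n) with n counted in characters. */
int fits_char_n(const char *str, size_t nbytes, long n)
{
    long count = utf8_char_count(str, nbytes);

    return count >= 0 && count <= n;
}
```

With this sketch, a 9-byte string of three 3-byte characters fits CHAR(3),
whereas a byte-based check would reject it.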

> One issue: I can see (or imagine ;) how we can use the Postgres type
> system to manage multiple character sets. But allowing arbitrary
> character sets in, say, table names forces us to cope with allowing a
> mix of character sets in a single column of a system table. afaik this
> general capability is not mandated by SQL9x (the SQL_TEXT character set
> is used for all system resources??). Would it be acceptable to have a
> "default database character set" which is allowed to creep into the
> pg_xxx tables? Even that seems to be a difficult thing to accomplish at
> the moment (we'd need to get some of the text manipulation functions
> from the catalogs, not from hardcoded references as we do now).

The "default database character set" idea does not seem to be a solution
for cross-database relations such as pg_database. The only solution I can
imagine so far is to use SQL_TEXT.

BTW, I've been thinking about SQL_TEXT for a while, and it seems that
mule_internal_code or Unicode (UTF-8) would be the candidates for
it. Mule_internal_code looks more acceptable than Unicode to Asian
multi-byte users like me. It's clean and simple, and does not require
huge conversion tables between Unicode and other encodings. However,
Unicode has stronger political power in the real world, and it would
probably be enough for most single-byte users. My idea is to let users
choose one of them, that is, to make it a compile-time option.
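
A compile-time choice of this kind could look roughly like the sketch
below. The macro SQL_TEXT_USE_UNICODE, the enum and the function name are
all hypothetical, made up for illustration only. Nothing like them exists
in the source tree today.

```c
/*
 * Hypothetical build switch: compile with -DSQL_TEXT_USE_UNICODE to make
 * Unicode (UTF-8) the SQL_TEXT encoding. Otherwise mule_internal_code
 * is used.
 */
typedef enum
{
    SQL_TEXT_MULE_INTERNAL,
    SQL_TEXT_UNICODE
} SqlTextEncoding;

#ifdef SQL_TEXT_USE_UNICODE
#define SQL_TEXT_ENCODING SQL_TEXT_UNICODE
#else
#define SQL_TEXT_ENCODING SQL_TEXT_MULE_INTERNAL
#endif

/* Encoding used for system resources such as names in pg_database. */
SqlTextEncoding sql_text_encoding(void)
{
    return SQL_TEXT_ENCODING;
}
```

The point of fixing the choice at build time is that every database in
an installation would then agree on how the shared catalogs are encoded.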

> We should itemize all of these issues so we can keep track of what is
> necessary, possible, and/or "easy".

You are right. There will probably be tons of issues in implementing
support for multiple charsets.
--
Tatsuo Ishii
