Re: UTF8 national character data type support WIP patch and list of open issues.

From: "Arulappan, Arul Shaji" <arul(at)fast(dot)au(dot)fujitsu(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Boguk, Maksym" <Maksym(dot)Boguk(at)au(dot)fujitsu(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UTF8 national character data type support WIP patch and list of open issues.
Date: 2013-11-05 06:04:07
Message-ID: 022C711CCA8AF2459F370E936F2B9E8C0255E4BB@SYDExchTmp.au.fjanz.com
Lists: pgsql-hackers

Attached is a patch that implements the first set of changes originally
discussed in this thread. They are:

(i) Implements NCHAR/NVARCHAR as distinct data types, not as synonyms,
so that:
  - psql \d can display the user-specified data types.
  - pg_dump/pg_dumpall can output NCHAR/NVARCHAR columns as-is,
    not as CHAR/VARCHAR.
  - The groundwork is in place to implement additional features for
    NCHAR/NVARCHAR in the future (e.g. a separate encoding for nchar
    columns).
(ii) Support for NCHAR/NVARCHAR in ECPG.
(iii) Documentation changes to reflect the new data types.

Rgds,
Arul Shaji

>-----Original Message-----
>From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
>owner(at)postgresql(dot)org] On Behalf Of MauMau
>
>From: "Greg Stark" <stark(at)mit(dot)edu>
>> If it's not lossy then what's the point? From the client's point of
>> view it'll be functionally equivalent to text then.
>
>Sorry, what Tatsuo san suggested meant was "same or compatible", not lossy.
>I quote the relevant part below. This is enough for the use case I mentioned
>in my previous mail several hours ago (actually, that is what Oracle manual
>describes...).
>
>http://www.postgresql.org/message-id/20130920.085853.1628917054830864151.t-ishii(at)sraoss(dot)co(dot)jp
>
>[Excerpt]
>----------------------------------------
>What about limiting to use NCHAR with a database which has same encoding or
>"compatible" encoding (on which the encoding conversion is defined)? This way,
>NCHAR text can be automatically converted from NCHAR to the database encoding
>in the server side thus we can treat NCHAR exactly same as CHAR afterward. I
>suppose what encoding is used for NCHAR should be defined in initdb time or
>creation of the database (if we allow this, we need to add a new column to know
>what encoding is used for NCHAR).
>
>For example, "CREATE TABLE t1(t NCHAR(10))" will succeed if NCHAR is
>UTF-8 and database encoding is UTF-8. Even succeed if NCHAR is SHIFT-JIS and
>database encoding is UTF-8 because there is a conversion between UTF-8 and
>SHIFT-JIS. However will not succeed if NCHAR is SHIFT-JIS and database encoding
>is ISO-8859-1 because there's no conversion between them.
>----------------------------------------
>
>
>Regards
>MauMau
>
>
>

Attachment Content-Type Size
PGHEAD_nchar_v1.patch application/octet-stream 191.9 KB
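
For illustration only, the "same or compatible encoding" rule from the
quoted excerpt can be sketched as a small check. This is not code from the
attached patch: the function name nchar_allowed and the CONVERSIONS set are
hypothetical, and the set is a hand-written stand-in for whatever
conversions the server actually defines.

```python
# Hypothetical sketch of the "same or compatible encoding" rule from the
# quoted excerpt. CONVERSIONS is an illustrative stand-in, not the server's
# real list of encoding conversions.
CONVERSIONS = {
    ("SHIFT-JIS", "UTF-8"),
    ("UTF-8", "SHIFT-JIS"),
    ("ISO-8859-1", "UTF-8"),
    ("UTF-8", "ISO-8859-1"),
}


def nchar_allowed(nchar_encoding: str, db_encoding: str) -> bool:
    """Return True if an NCHAR column could be created under the proposed rule."""
    # Same encoding: no conversion is needed at all.
    if nchar_encoding == db_encoding:
        return True
    # Otherwise the server must be able to convert NCHAR text into the
    # database encoding, so a conversion in that direction has to exist.
    return (nchar_encoding, db_encoding) in CONVERSIONS


if __name__ == "__main__":
    # The three cases mentioned in the excerpt.
    cases = [
        ("UTF-8", "UTF-8"),
        ("SHIFT-JIS", "UTF-8"),
        ("SHIFT-JIS", "ISO-8859-1"),
    ]
    for nchar_enc, db_enc in cases:
        verdict = "succeeds" if nchar_allowed(nchar_enc, db_enc) else "fails"
        print(f"NCHAR={nchar_enc}, database={db_enc}: CREATE TABLE {verdict}")
```

The sketch only checks the NCHAR-to-database direction, since that is the
conversion the excerpt says the server would perform. Whether the reverse
direction would also be required is left open here.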
