Re: UTF8 national character data type support WIP patch and list of open issues.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Boguk, Maksym" <maksymb(at)fast(dot)au(dot)fujitsu(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UTF8 national character data type support WIP patch and list of open issues.
Date: 2013-09-20 18:22:36
Message-ID: CA+TgmoZENYsQ5BwsjKz+SU8e1+JjtHNLqz4q9=FYn7n_dHBw_w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 19, 2013 at 6:42 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
> National character type support may be important to some potential users of
> PostgreSQL, and to the popularity of PostgreSQL, though not to me. That's why
> national character support is listed in the PostgreSQL TODO wiki. We might be
> losing potential users just because their selection criteria include national
> character support.

We'd have to go back and search the archives to figure out why that
item was added to the TODO, but I'd be surprised if anyone ever had it
in mind to create additional types that behave just like existing
types but with different names. I don't think that you'll be able to
get consensus around that path on this mailing list.

>> I am not keen to introduce support for nchar and nvarchar as
>> differently-named types with identical semantics.
>
> Similar examples already exist:
>
> - varchar and text: the only difference is the existence of explicit length
> limit
> - numeric and decimal
> - int and int4, smallint and int2, bigint and int8
> - real/double precision and float

I agree that the fact we have both varchar and text feels like a wart.
The other examples mostly involve different names for the same
underlying type, and so are different from what you are asking for
here.

> I understand your feeling. The concern about incompatibility can be
> eliminated by thinking about it the following way. How about this?
>
> - NCHAR can be used with any database encoding.
>
> - At first, NCHAR is exactly the same as CHAR. That is,
> "implementation-defined character set" described in the SQL standard is the
> database character set.
>
> - In the future, the character set for NCHAR can be selected at database
> creation like Oracle's CREATE DATABASE .... NATIONAL CHARACTER SET
> AL16UTF16. The default is the database character set.

Hmm. So under that design, a database could support up to a total of
two character sets, the one that you get when you say 'foo' and the
other one that you get when you say n'foo'.

I guess we could do that, but it seems a bit limited. If we're going
to go to the trouble of supporting multiple character sets, why not
support an arbitrary number instead of just two?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
