Re: Unicode string literals versus the world

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Marko Kreen <markokr(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Unicode string literals versus the world
Date: 2009-04-14 12:53:52
Message-ID: 200904141553.52181.peter_e@gmx.net
Lists: pgsql-hackers

On Tuesday 14 April 2009 14:38:38 Marko Kreen wrote:
> I think the problem is that they should not act like E'' strings, but they
> should act like plain '' strings - they should follow stdstr setting.
>
> That way existing tools that may (or may not..) understand E'' and stdstr
> settings, but definitely have not heard about U&'' strings can still
> parse the SQL without new surprises.

Can you be more specific in what "surprises" you expect? What algorithms do
you suppose those "existing tools" use and what expectations do they have?

> I still stand on my proposal, how about extending E'' strings with
> unicode escapes (eg. \uXXXX)? The E'' strings are already more
> clearly defined than '' and they are our "own", we don't need to
> consider random standards, but can consider our sanity.

This doesn't excite me. I think the tendency should be to get rid of E''
usage, because its definition of escape sequences is single-byte and
ASCII-centric and thus overall a legacy construct. Certainly, we will want to keep
around E'' for a long time or forever, but it is a legitimate goal for
application writers to not use it, which is after all the reason behind this
whole standards-conforming strings project. I wouldn't want to have a
forward-looking feature such as the Unicode escapes be burdened with that kind
of legacy behavior.

Note also that Unicode escapes are available for identifiers, for which
there is no existing E"" that you could add them to.
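
For illustration only, here is a minimal sketch of the two forms, literal and
identifier. It assumes the SQL-standard U& syntax with
standard_conforming_strings = on; the aliases "word" and "t" and the column
name are made up for the example and are not taken from the patch.

```sql
-- String literal: \XXXX is a 4-hex-digit code point,
-- \+XXXXXX is a 6-hex-digit code point. Both escapes here denote "a".
SELECT U&'d\0061t\+000061' AS word;

-- Identifier: the same escapes inside a double-quoted name.
-- U&"d\0061t\+000061" names the column "data" of the subquery.
SELECT U&"d\0061t\+000061" FROM (SELECT 1 AS data) AS t;

-- The escape character can be changed with a UESCAPE clause,
-- which helps when the backslash is awkward to type or quote.
SELECT U&'d!0061t!+000061' UESCAPE '!' AS word;
```

The second statement is the case that an extended E'' could not cover, since
E'' has no counterpart for quoted identifiers.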
