
Re: unicode

From:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To:	hannu(at)tm(dot)ee
Cc:	oleg(at)sai(dot)msu(dot)su, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: unicode
Date:	2002-09-27 04:29:19
Message-ID:	20020927.132919.38718738.t-ishii@sra.co.jp
Lists:	pgsql-hackers

> Where can I read about basic tech details of Unicode / Charset
> Conversion / ...
>
> I'd like to find answers to the following (for database created using
> UNICODE)
>
> 1. Where exactly are conversions between national charsets done

There is no "national charset" in PostgreSQL. I assume you want to know
where frontend/backend encoding conversion happens. It is handled
by pg_server_to_client (converts backend to frontend) and
pg_client_to_server (frontend to backend). These functions are called by
the communication subsystem (backend/libpq) and by COPY. In summary, in
most cases the encoding conversion is done before the parser runs and
after the executor produces the final result.
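
To make the placement of the two conversions concrete, here is a minimal sketch in Python, not the backend's actual C code. The function names client_to_server, server_to_client and handle_query, their signatures, and the run callback are hypothetical stand-ins chosen for illustration; only the idea (convert on the way in, before parsing, and on the way out, after execution) comes from the description above.

```python
from typing import Callable


def client_to_server(data: bytes, client_enc: str, server_enc: str) -> bytes:
    """Hypothetical stand-in: convert frontend bytes to the backend encoding."""
    if client_enc == server_enc:
        return data  # nothing to convert
    return data.decode(client_enc).encode(server_enc)


def server_to_client(data: bytes, server_enc: str, client_enc: str) -> bytes:
    """Hypothetical stand-in: convert backend bytes to the frontend encoding."""
    if client_enc == server_enc:
        return data
    return data.decode(server_enc).encode(client_enc)


def handle_query(
    raw_query: bytes,
    client_enc: str,
    server_enc: str,
    run: Callable[[bytes], bytes],
) -> bytes:
    # Conversion happens once on the way in, before any parsing...
    query = client_to_server(raw_query, client_enc, server_enc)
    # ...so the parser and executor (modelled by `run`) only ever see
    # text in the server encoding.
    result = run(query)
    # ...and once on the way out, after the final result is produced.
    return server_to_client(result, server_enc, client_enc)
```

For example, a LATIN1 client talking to a UTF-8 database would call handle_query(raw, "latin-1", "utf-8", run), and run would receive the statement as UTF-8 bytes.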

> 2. What is converted (whole SQL statements or just data)

Whole statement.

> 3. What format is used for processing in memory (UCS-2, UCS-4, UTF-8,
> UTF-16, UTF-32, ...)

"format"? I assume you are talking about the encoding.

It is exactly the same as the database encoding. For a UNICODE database,
we use UTF-8, not UCS-2 or UCS-4.
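
As a rough illustration of what the choice of UTF-8 over UCS-2 or UCS-4 means for byte counts, the Python sketch below measures one string in each encoding. The helper name encoded_sizes is hypothetical. Python has no UCS-2 codec, so UTF-16 is used as a stand-in; the two agree only for characters in the Basic Multilingual Plane.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes `text` occupies in each encoding."""
    return {
        "UTF-8": len(text.encode("utf-8")),
        # UTF-16 (no BOM) stands in for UCS-2; identical for BMP characters.
        "UCS-2": len(text.encode("utf-16-le")),
        # UTF-32 (no BOM) corresponds to UCS-4.
        "UCS-4": len(text.encode("utf-32-le")),
    }


if __name__ == "__main__":
    for sample in ("a", "é", "日"):
        print(repr(sample), encoded_sizes(sample))
```

UTF-8 is variable width (one byte for ASCII, two for "é", three for "日"), whereas the UCS forms use a fixed two or four bytes per character.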

> 4. What format is used when saving to disk (UCS-*, UTF-*, SCSU, ...) ?

Ditto.

> 5. Are LIKE/SIMILAR aware of locale stuff ?

I don't know about SIMILAR, but I believe LIKE is not locale aware, and
that this is correct from the standard's point of view...
--
Tatsuo Ishii

In response to

Re: unicode at 2002-09-26 08:52:48 from Hannu Krosing
