Re: unicode questions

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	- - <crossroads0000(at)googlemail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: unicode questions
Date:	2009-12-24 16:40:30
Message-ID:	4B33997E.2040907@dunslane.net
Lists:	pgsql-hackers

- - wrote:
> Dear PG hackers,
>
> I have two questions regarding Unicode support in PG:
>
> 1) If I set my database and connection encoding to UTF-8, does PG (and
> future versions of it) guarantee that Unicode code points are stored
> unmodified? Or could it be that PG does some Unicode
> normalization/manipulation with them before storing a string, or when
> retrieving a string?
>
> The reason I'm asking is that I've built a little program that reads
> in and stores text and explicitly analyzes the text at a later point
> in time, including whether the text is in NFC, NFD or neither. Since
> I want to store the text in the database, it is very important for PG
> not to fiddle around with the normalization unless my program
> explicitly tells PG to do that.
>
> 2) How far along is normalization support in PG? When I checked a long
> time ago, there was no such support. Now that the SQL standard mandates
> a NORMALIZE function, that may have changed. Any updates?
>

We don't do any normalization. If the client gives us UTF8 then we store
exactly what it gives us, and return exactly that.
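
As an illustration of why byte-for-byte storage matters here, the
following is a minimal client-side sketch in Python of the kind of
NFC/NFD check the original poster describes. It uses only the standard
unicodedata module; the helper name normalization_form is hypothetical
and does not come from this thread or from PostgreSQL.

```python
import unicodedata


def normalization_form(text: str) -> str:
    """Classify text as NFC, NFD, both, or neither (hypothetical helper)."""
    is_nfc = unicodedata.normalize("NFC", text) == text
    is_nfd = unicodedata.normalize("NFD", text) == text
    if is_nfc and is_nfd:
        return "NFC and NFD"  # e.g. plain ASCII, which both forms leave alone
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "neither"


# The same visible character "e with acute accent" in two encodings.
composed = "\u00e9"      # single precomposed code point (NFC)
decomposed = "e\u0301"   # base letter plus combining acute accent (NFD)

# The UTF-8 byte sequences differ, so a server that normalized on
# storage would change what this check reports after a round trip.
composed_bytes = composed.encode("utf-8")      # b"\xc3\xa9"
decomposed_bytes = decomposed.encode("utf-8")  # b"e\xcc\x81"
```

Because the server stores and returns the client's bytes unchanged, a
string that classifies as "NFD" before insertion should classify the
same way after retrieval.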

(This question is not really a -hackers question. The correct forum is
pgsql-general. Please make sure you use the correct forum in future.)

cheers

andrew

