unicode questions

From:	- - <crossroads0000(at)googlemail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	unicode questions
Date:	2009-12-24 16:25:54
Message-ID:	1842a500912240825s3738c492rc4ff14e8a84f0b47@mail.gmail.com
Lists:	pgsql-hackers

Dear PG hackers,

I have two questions regarding Unicode support in PG:

1) If I set my database and connection encoding to UTF-8, does PG (and
future versions of it) guarantee that Unicode code points are stored
unmodified? Or could it be that PG applies some Unicode
normalization or other manipulation to a string before storing it, or
when retrieving it?

The reason I'm asking is that I've built a little program that reads
in and stores text and explicitly analyzes the text at a later point
in time, including whether the text is in NFC, NFD or neither. Since
I want to store the text in the database, it is very important that PG
does not fiddle with the normalization unless my program explicitly
tells PG to do so.
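
To illustrate the kind of check I mean, here is a minimal sketch in
Python. It is only an illustration, not the actual program: the
function name normalization_form is made up for this example, and it
relies only on the standard unicodedata module. A string counts as
being in a form if normalizing it to that form leaves it unchanged.

```python
import unicodedata


def normalization_form(text: str) -> str:
    """Classify text as 'NFC', 'NFD', 'both' or 'neither'.

    Hypothetical helper for illustration: a string is in a given
    normalization form if normalizing it to that form is a no-op.
    """
    is_nfc = unicodedata.normalize("NFC", text) == text
    is_nfd = unicodedata.normalize("NFD", text) == text
    if is_nfc and is_nfd:
        return "both"      # e.g. plain ASCII text
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "neither"       # mixed precomposed and decomposed characters


if __name__ == "__main__":
    samples = ["abc", "\u00e9", "e\u0301", "\u00e9e\u0301"]
    for s in samples:
        print(repr(s), normalization_form(s))
```

If the database normalized strings on the way in or out, a value
stored as "neither" or "NFD" could come back as "NFC", and the later
analysis would report something different from what was read in.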

2) How far along is normalization support in PG? When I checked a long
time ago, there was no such support. Now that the SQL standard mandates
a NORMALIZE function, that may have changed. Any updates?

Responses

Re: unicode questions at 2009-12-24 16:40:30 from Andrew Dunstan
