Re: unicode questions

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: - - <crossroads0000(at)googlemail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unicode questions
Date: 2009-12-24 16:40:30
Message-ID: 4B33997E.2040907@dunslane.net
Lists: pgsql-hackers

- - wrote:
> Dear PG hackers,
>
> I have two questions regarding Unicode support in PG:
>
> 1) If I set my database and connection encoding to UTF-8, does pg (and
> future versions of it) guarantee that unicode code points are stored
> unmodified? or could it be that pg does some unicode
> normalization/manipulation with them before storing a string, or when
> retrieving a string?
>
> The reason I'm asking is that I've built a little program that reads
> in and stores text and explicitly analyzes it at a later point
> in time, including whether the text is in NFC, NFD, or
> neither. Since I want to store the text in the database, it is very
> important that PG not fiddle with the normalization unless my
> program explicitly tells it to.
>
> 2) How far is normalization support in PG? When I checked a long time
> ago, there was no such support. Now that the SQL standard mandates a
> NORMALIZE function that may have changed. Any updates?
>

We don't do any normalization. If the client gives us UTF8 then we store
exactly what it gives us, and return exactly that.
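
To illustrate why this matters for the analysis described above, here is a minimal client-side sketch in Python, using only the standard unicodedata module. It shows that the composed and decomposed spellings of the same character are different UTF-8 byte sequences, so storing them unmodified preserves the distinction. The helper name normalization_form is hypothetical and not part of PostgreSQL or any driver.

```python
import unicodedata


def normalization_form(text: str) -> str:
    """Classify text as 'NFC', 'NFD', 'both', or 'neither' (hypothetical helper)."""
    is_nfc = unicodedata.normalize("NFC", text) == text
    is_nfd = unicodedata.normalize("NFD", text) == text
    if is_nfc and is_nfd:
        return "both"
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "neither"


composed = "\u00e9"      # e-acute as a single precomposed code point
decomposed = "e\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT

# The two spellings render alike but encode to different UTF-8 bytes,
# so a byte-for-byte round trip keeps them distinguishable.
composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")

if __name__ == "__main__":
    for label, value in (("composed", composed), ("decomposed", decomposed)):
        print(label, value.encode("utf-8"), normalization_form(value))
```

Because the stored bytes are returned as given, a check like this can be run on text read back from the database and should classify it the same way as before it was stored.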

(This question is not really a -hackers question. The correct forum is
pgsql-general. Please make sure you use the correct forum in future.)

cheers

andrew
