Re: Mac OS: invalid byte sequence for encoding "UTF8"

From: Chapman Flack <chap(at)anastigmatix(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, Stas Kelvich <stas(dot)kelvich(at)gmail(dot)com>, "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Mac OS: invalid byte sequence for encoding "UTF8"
Date: 2016-02-11 05:20:12
Message-ID: 56BC1A0C.1070104@anastigmatix.net
Lists: pgsql-hackers

On 02/10/16 23:55, Tom Lane wrote:

> Yeah, I got that --- what seems squishier is that none of the other C1
> control characters are considered whitespace?

That seems to be exactly the case:

http://www.unicode.org/Public/5.2.0/ucd/PropList.txt

U+0009..U+000D, U+0020, U+0085, and U+00A0 are the only White_Space
characters whose code points fit in a byte, so U+0085 (NEL) is the only
C1 control among them.
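
For anyone who wants to check this without reading the file by eye, here is a rough sketch in Python. The range table is an assumption: it is transcribed by hand from the White_Space entries in the 5.2.0 PropList.txt, and the names WHITE_SPACE_RANGES and is_white_space are made up for illustration. It deliberately avoids str.isspace(), which follows a different definition than White_Space.

```python
# White_Space ranges transcribed by hand from PropList.txt (Unicode 5.2.0).
# This table is an assumption; verify it against the file itself.
WHITE_SPACE_RANGES = [
    (0x0009, 0x000D),  # TAB, LF, VT, FF, CR
    (0x0020, 0x0020),  # SPACE
    (0x0085, 0x0085),  # NEL, the one C1 control in the list
    (0x00A0, 0x00A0),  # NO-BREAK SPACE
    (0x1680, 0x1680),
    (0x180E, 0x180E),
    (0x2000, 0x200A),
    (0x2028, 0x2029),
    (0x202F, 0x202F),
    (0x205F, 0x205F),
    (0x3000, 0x3000),
]


def is_white_space(cp):
    """Return True if code point cp falls in one of the ranges above."""
    return any(lo <= cp <= hi for lo, hi in WHITE_SPACE_RANGES)


# Code points that fit in a single byte (0x00..0xFF).
single_byte = [cp for cp in range(0x100) if is_white_space(cp)]

# C1 control characters are 0x80..0x9F.
c1_white_space = [cp for cp in range(0x80, 0xA0) if is_white_space(cp)]

print(" ".join("%02X" % cp for cp in single_byte))
print(" ".join("%02X" % cp for cp in c1_white_space))
```

With the table as transcribed, the first line printed lists the single-byte code points and the second lists the C1 controls among them.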

-Chap
