Re: BUG #14038: substring cuts unicode char in half, allowing to save broken utf8 into table

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: rpegues(at)tripwire(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #14038: substring cuts unicode char in half, allowing to save broken utf8 into table
Date: 2016-03-21 16:46:30
Message-ID: 22852.1458578790@sss.pgh.pa.us
Lists: pgsql-bugs

rpegues(at)tripwire(dot)com writes:
> We have a table with an update trigger where if you modify a certain column,
> we change the name of the row by calling a function.
> In the function, substring() the name and then add a random string to that.
> However, the substring appears to cut a unicode character in half, and the
> update trigger then updates the name with the broken string.

That should not happen if Postgres knows it's dealing with unicode data.
What have you got the database's encoding set to?

regards, tom lane
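
As an illustration of why the encoding question matters, here is a minimal Python sketch, not PostgreSQL code. It assumes the symptom comes from truncating by bytes rather than by characters, which is what one would expect if the database encoding were SQL_ASCII. That cause is an assumption and is not confirmed in this message. The sample name and the helper functions are hypothetical.

```python
# Hypothetical row name; "é" (U+00E9) occupies two bytes in UTF-8.
name = "caf\u00e9-report"


def truncate_bytes(s: str, n: int) -> bytes:
    """Cut after n bytes, ignoring character boundaries."""
    return s.encode("utf-8")[:n]


def truncate_chars(s: str, n: int) -> bytes:
    """Cut after n characters, then encode; boundaries stay intact."""
    return s[:n].encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


broken = truncate_bytes(name, 4)  # b'caf\xc3': only the first byte of "é"
intact = truncate_chars(name, 4)  # b'caf\xc3\xa9': the whole "é"

print(broken, is_valid_utf8(broken))
print(intact, is_valid_utf8(intact))
```

The byte-wise cut leaves a dangling lead byte, which matches the "character cut in half" described in the report. In psql, `SHOW server_encoding;` reports which encoding the database actually uses.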
