Re: evil characters #bfef cause dump failure

From: Markus Bertheau <twanger(at)bluetwanger(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christian Fowler <spider(at)viovio(dot)com>, pgsql-admin list <pgsql-admin(at)postgresql(dot)org>
Subject: Re: evil characters #bfef cause dump failure
Date: 2004-11-15 23:13:21
Message-ID: 1100560402.7458.3.camel@fc3
Lists: pgsql-admin

On Mon, 15/11/2004 at 16:00 -0500, Tom Lane wrote:

> The real problem is that you've got invalid unicode data in the database
> (I'm not an expert, but I think that #bf is a 1-byte UTF8 sequence and
> then #ef starts a 3-byte sequence, so if this comes within 2 characters
> of end-of-line that would explain your dump problem).

FWIW, 1-byte UTF-8 sequences are always < 128 (0x80). BF is a
continuation byte (binary 10111111): it can only appear inside a
multi-byte UTF-8 sequence, never at the beginning of one.

See

http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8

It has a table that gives anyone who can tell bits from bytes a quick
understanding of how the UTF-8 encoding works.
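
As a rough illustration of that table, here is a minimal Python sketch that
classifies a single byte by its high bits. The function name
classify_utf8_byte is made up for this example, and the check looks at the
bit pattern only: it does not reject lead bytes that are never valid in
practice (C0, C1, F5 to F7), nor does it validate whole sequences.

```python
def classify_utf8_byte(b: int) -> str:
    """Classify one byte value (0-255) by its UTF-8 bit pattern.

    Hypothetical helper for illustration; bit pattern only, so it does not
    catch overlong lead bytes such as 0xC0/0xC1 or out-of-range 0xF5-0xF7.
    """
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a byte value in 0..255")
    if b < 0x80:
        return "single"        # 0xxxxxxx: complete 1-byte sequence
    if b < 0xC0:
        return "continuation"  # 10xxxxxx: only valid after a lead byte
    if b < 0xE0:
        return "lead-2"        # 110xxxxx: starts a 2-byte sequence
    if b < 0xF0:
        return "lead-3"        # 1110xxxx: starts a 3-byte sequence
    if b < 0xF8:
        return "lead-4"        # 11110xxx: starts a 4-byte sequence
    return "invalid"           # 11111xxx: never used in UTF-8


# The two bytes from the subject line:
print(hex(0xBF), classify_utf8_byte(0xBF))  # continuation byte
print(hex(0xEF), classify_utf8_byte(0xEF))  # lead byte of a 3-byte sequence
```

So BF at the start of a sequence is invalid, while the same byte is fine in
the middle of one: for example, Python decodes the bytes EF BF BD as the
single character U+FFFD, but refuses to decode BF EF.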

--
Markus Bertheau <twanger(at)bluetwanger(dot)de>
