Re: Bug in UTF8-Validation Code?

From:	Michael Paesold <mpaesold(at)gmx(dot)at>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>, Mario Weilguni EXTERN <mweilguni(at)sime(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Bug in UTF8-Validation Code?
Date:	2007-03-14 07:01:53
Message-ID:	45F79DE1.1070700@gmx.at
Lists:	pgsql-hackers

Andrew Dunstan wrote:
> Albe Laurenz wrote:
>> A fix could be either that the server checks escape sequences for
>> validity
>>
>
> This strikes me as essential. If the db has a certain encoding ISTM we
> are promising that all the text data is valid for that encoding.
>
> The question in my mind is how we help people to recover from the fact
> that we haven't done that.

I would also say that it's a bug that escape sequences can get characters
into the database that are not valid in the specified encoding. If you
compare the encoding to table constraints, there is no way to simply
"escape" a constraint check.

This seems to violate the principle of consistency in ACID. Additionally,
if you count pg_dump as part of ACID, it also violates durability, since
pg_dump cannot restore what it wrote itself.

Is there anything in the SQL spec that asks for such behaviour? I guess not.

A DBA will usually not even learn about this issue until they are presented
with a failing restore.
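
To illustrate the kind of check discussed above, here is a minimal sketch in
Python. It is not how the server works or how a fix would look; the function
names are hypothetical, and it only handles three-digit octal escapes of the
form \ooo. The point it tries to show is that the literal text can be
perfectly valid before escape processing while the bytes that result from it
are not, so validation would have to happen after the escapes are expanded.

```python
import re

# Hypothetical helper: expand three-digit octal escapes (\000 to \377)
# in the raw bytes of a string literal. Other escape forms are ignored.
_OCTAL_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")


def unescape_octal(literal: str) -> bytes:
    raw = literal.encode("utf-8")
    return _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), raw)


def is_valid_utf8(data: bytes) -> bool:
    # Python's strict decoder rejects truncated sequences, stray
    # continuation bytes, overlong forms and bytes such as 0xFF.
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def literal_is_storable(literal: str) -> bool:
    # Validate the bytes AFTER escape expansion, not the literal text.
    return is_valid_utf8(unescape_octal(literal))


if __name__ == "__main__":
    for lit in ["abc\\303\\244", "abc\\303", "\\377"]:
        print(lit, literal_is_storable(lit))
```

Here "abc\303\244" expands to a complete two-byte sequence and passes, while
"abc\303" ends in a lone lead byte and "\377" is a byte that never occurs in
UTF-8, so both would be rejected before they could reach a dump file.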

Best Regards,
Michael Paesold

In response to

Re: Bug in UTF8-Validation Code? at 2007-03-13 15:12:33 from Andrew Dunstan

Responses

Re: Bug in UTF8-Validation Code? at 2007-03-14 09:05:31 from Peter Eisentraut
Re: Bug in UTF8-Validation Code? at 2007-03-16 11:17:14 from Mario Weilguni
