Re: Multibyte still broken

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: robinson(at)netrinsics(dot)com
Cc: pgsql-hackers(at)hub(dot)org
Subject: Re: Multibyte still broken
Date: 2000-05-11 01:07:19
Message-ID: 20000511100719Q.t-ishii@sra.co.jp
Lists: pgsql-hackers

> More robust code may always be good, but "good" apparently doesn't always go
> into the tree. Imagine my surprise, while upgrading a production server
> from 6.5.3 to 7.0, when the data dumped from the old database failed to load
> into the new database (well, crashed the backend, to be specific).
>
> Apparently the "validate your own damn data" sentiment of the first excerpt
> above has prevailed, because, on inspection, the MB code is just as fragile
> as it was five months ago.
>
> I was forced to perform emergency repairs to my database dump file to fool a
> non-multibyte 7.0 into accepting it. Since EUC_CN is compatible with
> Latin-1, and since the benefits of multibyte are small compared to the
> risks, I intend to stick with unibyte Postgres henceforth.
>
> I would, though, recommend a warning in the "INSTALL" file along the lines of:
>
> "WARNING: Use of improperly-encoded text with multi-byte support enabled
> WILL lead to data corruption and/or loss. Do not enable multi-byte support
> unless you intend to fully validate your own damn data."

Sorry for the problem. I forgot about this issue:-<

My current plan for fixing the problem you found is to do data
validation in the text/var/char input functions, rather than tweaking
the mb functions. If a corrupted MB string is found, the input function
will call elog(ERROR) to abort the transaction. This will appear in
7.0.1 unless someone objects.
--
Tatsuo Ishii
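
A minimal sketch of the kind of check proposed above, assuming a simplified
EUC_CN rule (single ASCII bytes 0x00-0x7F, or two-byte characters with both
bytes in 0xA1-0xFE). The function name euc_cn_first_invalid is hypothetical
and this is not the actual PostgreSQL code; it only illustrates validating
the bytes before they are stored:

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the PostgreSQL tree): scan a buffer and
 * check that it is well-formed under a simplified EUC_CN rule.
 *
 * Returns -1 if the whole buffer is valid, otherwise the offset of the
 * first offending byte.
 */
static long
euc_cn_first_invalid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        if (s[i] < 0x80)
        {
            i++;                    /* plain ASCII byte */
            continue;
        }
        /* lead byte must fall in 0xA1-0xFE */
        if (s[i] < 0xA1 || s[i] > 0xFE)
            return (long) i;
        /* a lead byte at the end of the buffer is a truncated character */
        if (i + 1 >= len)
            return (long) i;
        /* trail byte must also fall in 0xA1-0xFE */
        if (s[i + 1] < 0xA1 || s[i + 1] > 0xFE)
            return (long) (i + 1);
        i += 2;
    }
    return -1;
}
```

In the backend, an input function would presumably report a non-negative
result via elog(ERROR), so the bad value is rejected and the transaction
aborted instead of the malformed bytes reaching the mb functions.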
