Re: Upcoming PG re-releases

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Paul Lindner <lindner(at)inuus(dot)com>
Cc: Neil Conway <neilc(at)samurai(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Upcoming PG re-releases
Date: 2005-12-06 19:26:38
Message-ID: 200512061926.jB6JQcS23646@candle.pha.pa.us
Lists: pgsql-hackers pgsql-www


I have added your suggestions to the 8.1.X release notes.

---------------------------------------------------------------------------

Paul Lindner wrote:
-- Start of PGP signed section.
> On Sat, Dec 03, 2005 at 10:54:08AM -0500, Bruce Momjian wrote:
> > Neil Conway wrote:
> > > On Wed, 2005-11-30 at 10:56 -0500, Tom Lane wrote:
> > > > It's been about a month since 8.1.0 was released, and we've found about
> > > > the usual number of bugs for a new release, so it seems like it's time
> > > > for 8.1.1.
> > >
> > > I think one fix that should be made in time for 8.1.1 is adding a note
> > > to the "version migration" section of the 8.1 release notes describing
> > > the "invalid UTF-8 byte sequence" problems that some people have run
> > > into when upgrading from prior versions. I'm not familiar enough with
> > > the problem or its remedies to add the note myself, though.
> >
> > Agreed, but I don't understand the problem well enough either. Does
> > anyone?
>
> There was a thread a couple of weeks back about this problem. Here's
> my sample writeup -- I give my permission for anyone to use it as they
> see fit:
>
>
> Upgrading UNICODE databases to 8.1
>
> Postgres 8.1 includes a number of bug-fixes and improvements to
> Unicode and UTF-8 character handling. Unfortunately, previous releases
> would accept character sequences that were not valid UTF-8. This
> may cause problems when upgrading your database using
> pg_dump/pg_restore, resulting in an error message like this:
>
> Invalid UNICODE byte sequence detected near byte ...
>
> To convert your pre-8.1 database to 8.1 you may have to remove and/or
> fix the offending characters. One simple way to fix the problem is to
> run your pg_dump output through the iconv command like this:
>
> iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
>
> The -c flag tells iconv to omit invalid characters from output.
>
> There is one problem with this. Most versions of iconv try to read
> the entire input file into memory. If your dump is quite large you
> will need to split the dump into multiple files and convert each one
> individually. You must use the -l flag for split to ensure that the
> Unicode byte sequences are not split.
>
> split -l 10000 dump.sql
>
> Another possible solution is to use the --inserts flag to pg_dump.
> When you load the resulting data dump in 8.1 this will result in the
> problem rows showing up in your error log.
>
> --
> Paul Lindner ||||| | | | | | | | | |
> lindner(at)inuus(dot)com
-- End of PGP section, PGP failed!
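
For reference, the split-and-convert procedure in the quoted writeup could be
sketched as a small shell function. This is only an illustration, not part of
Paul's text: the function name clean_dump, the part_ file prefix, the temporary
directory, and the -a 4 suffix length (to leave room for many chunks) are all
assumptions. It also assumes an iconv that accepts -c and the encoding name
UTF-8.

```shell
#!/bin/sh
# Hypothetical helper (not from the original writeup):
#   clean_dump INPUT OUTPUT
# Splits INPUT on line boundaries, runs each piece through iconv -c,
# and concatenates the cleaned pieces into OUTPUT.
clean_dump() {
    cd_in=$1
    cd_out=$2
    cd_tmp=$(mktemp -d) || return 1

    # Split by lines (-l) so no multi-byte sequence is cut in half.
    # -a 4 gives four-letter suffixes (an assumption, to allow many chunks).
    split -l 10000 -a 4 "$cd_in" "$cd_tmp/part_"

    : > "$cd_out"
    # The glob expands in sorted order, which matches split's suffix order.
    for cd_part in "$cd_tmp"/part_*; do
        [ -f "$cd_part" ] || continue
        # -c omits invalid characters; some iconv versions exit non-zero
        # when they drop something, so the exit status is ignored here.
        iconv -c -f UTF-8 -t UTF-8 "$cd_part" >> "$cd_out" || true
    done

    rm -rf "$cd_tmp"
}
```

It would be called as "clean_dump dump.sql fixed.sql". Since the bad bytes are
dropped silently, comparing the sizes of the two files afterwards gives a rough
idea of how much was removed.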

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
