
Re: Upcoming PG re-releases

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Paul Lindner <lindner(at)inuus(dot)com>
Cc: Neil Conway <neilc(at)samurai(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Upcoming PG re-releases
Date: 2005-12-06 19:26:38
Message-ID: 200512061926.jB6JQcS23646@candle.pha.pa.us
Lists: pgsql-hackers, pgsql-www
I have added your suggestions to the 8.1.X release notes.

---------------------------------------------------------------------------

Paul Lindner wrote:
-- Start of PGP signed section.
> On Sat, Dec 03, 2005 at 10:54:08AM -0500, Bruce Momjian wrote:
> > Neil Conway wrote:
> > > On Wed, 2005-11-30 at 10:56 -0500, Tom Lane wrote:
> > > > It's been about a month since 8.1.0 was released, and we've found about
> > > > the usual number of bugs for a new release, so it seems like it's time
> > > > for 8.1.1.
> > > 
> > > I think one fix that should be made in time for 8.1.1 is adding a note
> > > to the "version migration" section of the 8.1 release notes describing
> > > the "invalid UTF-8 byte sequence" problems that some people have run
> > > into when upgrading from prior versions. I'm not familiar enough with
> > > the problem or its remedies to add the note myself, though.
> > 
> > Agreed, but I don't understand the problem well enough either.  Does
> > anyone?
> 
> There was a thread a couple of weeks back about this problem.  Here's
> my sample writeup -- I give my permission for anyone to use it as they
> see fit:
> 
> 
> Upgrading UNICODE databases to 8.1
> 
> Postgres 8.1 includes a number of bug-fixes and improvements to
> Unicode and UTF-8 character handling.  Unfortunately, previous releases
> would accept character sequences that were not valid UTF-8.  This
> may cause problems when upgrading your database using
> pg_dump/pg_restore, resulting in an error message like this:
> 
>   Invalid UNICODE byte sequence detected near byte ...
> 
> To convert your pre-8.1 database to 8.1 you may have to remove and/or
> fix the offending characters.  One simple way to fix the problem is to
> run your pg_dump output through the iconv command like this:
> 
>   iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> 
> The -c flag tells iconv to omit invalid characters from output.
> 
> There is one problem with this.  Most versions of iconv try to read
> the entire input file into memory.  If your dump is quite large you
> will need to split the dump into multiple files and convert each one
> individually.  You must use the -l flag for split to ensure that the
> Unicode byte sequences are not split across files.
> 
>    split -l 10000 dump.sql
> 
> Another possible solution is to use the --inserts flag to pg_dump.
> When you load the resulting data dump in 8.1 this will result in the
> problem rows showing up in your error log.
> 
> -- 
> Paul Lindner        ||||| | | | |  |  |  |   |   |
> lindner(at)inuus(dot)com
-- End of PGP section, PGP failed!
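
For reference, the split-and-convert steps in the writeup above could be
combined roughly as follows. This is only a sketch: the function name
clean_dump, the scratch directory, and the "chunk." prefix are made up for
illustration, and it assumes a POSIX-style shell with split, iconv, and
mktemp available.

```shell
# clean_dump INPUT OUTPUT [LINES_PER_CHUNK]
# Hypothetical helper (not part of PostgreSQL): splits INPUT on line
# boundaries, runs each piece through iconv -c, and reassembles OUTPUT.
clean_dump() {
    in=$1
    out=$2
    lines=${3:-10000}                 # same chunk size as the split example
    workdir=$(mktemp -d) || return 1  # scratch directory for the pieces

    # Split by line count so no multi-byte sequence is cut between files.
    # -a 4 gives four-letter suffixes so very large dumps do not run out.
    split -a 4 -l "$lines" "$in" "$workdir/chunk." || {
        rm -rf "$workdir"
        return 1
    }

    : > "$out"
    # The glob expands in sorted order, which matches split's suffix order.
    for chunk in "$workdir"/chunk.*; do
        # -c omits invalid characters. Some iconv builds exit non-zero when
        # they drop something, so the status is reported, not treated as fatal.
        iconv -c -f UTF8 -t UTF8 "$chunk" >> "$out" ||
            echo "iconv reported invalid input in $chunk" >&2
    done

    rm -rf "$workdir"
}
```

It would be called as `clean_dump dump.sql fixed.sql`. Comparing the sizes
of dump.sql and fixed.sql afterwards gives a rough idea of how many bytes
were dropped.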

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman(at)candle(dot)pha(dot)pa(dot)us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
