BUG #1987: UTF8 encoding differences hamper upgrades

From: "Paul Lindner" <lindner(at)inuus(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #1987: UTF8 encoding differences hamper upgrades
Date: 2005-10-22 15:33:52
Message-ID: 20051022153352.7CDC1F0DDB@svr2.postgresql.org
Lists: pgsql-bugs
The following bug has been logged online:

Bug reference:      1987
Logged by:          Paul Lindner
Email address:      lindner(at)inuus(dot)com
PostgreSQL version: 8.1beta3
Operating system:   Fedora Core 4 x86_64
Description:        UTF8 encoding differences hamper upgrades
Details: 

I've been doing some test imports of UNICODE databases into Postgres
8.1beta3.  The only problem I've seen is that some data from 8.0
databases will not import.

I've generated dumps using pg_dump from 8.0 and 8.1.  Attempting to
restore these results in

 Invalid UNICODE byte sequence detected near byte ...

Question:

Does the 8.1 Unicode sanity code accept the full set of characters
accepted by the 8.0 Unicode sanity code?

If not, we'll see a lot of problems like the one above.

I believe this patch is the one causing the problem I see:

http://www.mail-archive.com/pgsql-patches(at)postgresql(dot)org/msg08198/unicode.diff

Is there any solution other than scrubbing the entire dataset to
conform to the new (8.1) encoding rules?
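
If scrubbing does turn out to be necessary, it may help to first find
out which rows are affected.  Below is a minimal sketch, assuming
Python 3 is available and that its strict UTF-8 decoder is an acceptable
stand-in for the 8.1 check (the two may not agree on every sequence).
The names find_invalid_utf8 and scan_dump are hypothetical helpers, not
part of PostgreSQL or pg_dump.

```python
from typing import Iterable, Iterator, Tuple


def find_invalid_utf8(lines: Iterable[bytes]) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (line_number, byte_offset_in_line, offending_bytes).

    Uses Python's strict UTF-8 decoder as the validity test. This is an
    assumption: it is not guaranteed to match the 8.1 server's rules.
    """
    for lineno, raw in enumerate(lines, start=1):
        pos = 0
        while pos < len(raw):
            try:
                raw[pos:].decode("utf-8")
                break  # the rest of this line decodes cleanly
            except UnicodeDecodeError as exc:
                start = pos + exc.start
                end = pos + exc.end
                yield lineno, start, raw[start:end]
                pos = end  # resume scanning after the bad bytes


def scan_dump(path: str) -> None:
    """Print every suspicious byte run found in a plain-text dump file."""
    # Binary mode, so that nothing is decoded before we inspect it.
    with open(path, "rb") as dump:
        for lineno, offset, bad in find_invalid_utf8(dump):
            print(f"line {lineno}, byte {offset}: {bad!r}")
```

Calling scan_dump() on a plain-text pg_dump file lists the line and byte
offset of each sequence the decoder rejects.  That list shows how much
of the dataset would actually need scrubbing before a restore into 8.1.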
