
Re: Dumping/Restoring with constraints?

From: Marco Colombo <pgsql(at)esiway(dot)net>
To: Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc: Andrew Sullivan <ajs(at)commandprompt(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Dumping/Restoring with constraints?
Date: 2008-08-29 19:06:33
Message-ID: 48B848B9.4040805@esiway.net
Lists: pgsql-admin, pgsql-general

Phoenix Kiula wrote:
> Thanks Andrew.
> 
> On the server (the DB to be dumped) everything is "UTF8".
> 
> On my home server (where I would like to mirror the DB), this is the output:
> 
> 
> =# \l
>             List of databases
>    Name    |      Owner      | Encoding
> -----------+-----------------+-----------
>  postgres  | postgres        | SQL_ASCII
>  pkiula    | pkiula_pkiula   | UTF8
>  template0 | postgres        | SQL_ASCII
>  template1 | postgres        | SQL_ASCII
> (4 rows)
> 
> 
> 
> This is a fresh install as you can see. The database into which I am
> importing ("pkiula") is in fact listed as UTF8! Is this not enough?
> 

You said you're getting these errors:
ERROR:  invalid byte sequence for encoding "UTF8": 0x80

Those 0x80 bytes are inside the mydb.sql file, so you may find it easier
to look for them there and identify the offending string(s). Try (on the
Linux machine):

zcat mydb.sql.gz | iconv -f utf8 > /dev/null

That should tell you something like:

illegal input sequence at position xxx
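
If a byte position alone is hard to map back to a row, a small script can print the offending lines instead. This is only a sketch, assuming the dump is gzip-compressed and line-oriented as in the zcat command above; find_bad_lines and scan_dump are names made up for illustration, and only the first bad byte of each line is reported:

```python
import gzip


def find_bad_lines(lines):
    """Yield (line_number, byte_offset, raw_line) for each line that is not valid UTF-8."""
    for number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first offending byte within this line
            yield number, exc.start, raw


def scan_dump(path):
    """Print every suspicious line of a gzip-compressed dump (hypothetical helper)."""
    with gzip.open(path, "rb") as dump:
        for number, offset, raw in find_bad_lines(dump):
            print(f"line {number}, byte {offset}: {raw!r}")
```

Calling scan_dump("mydb.sql.gz") would then list the lines to inspect, with the raw bytes shown escaped so the 0x80 stands out.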

BTW, 0x80 usually comes from a Windows encoding such as windows-1250
(or windows-1252), where it stands for the euro sign:

echo -n "€" | iconv -t windows-1250 | hexdump -C
00000000  80                                                |.|
00000001
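
The same check can be done without relying on the terminal's own encoding. A minimal sketch, using Python's codec names cp1250 and cp1252 for the two Windows code pages:

```python
euro = "\u20ac"  # the euro sign, written as an escape so the source file encoding does not matter

# In both Windows code pages the euro sign is the single byte 0x80.
for codec in ("cp1250", "cp1252"):
    print(codec, euro.encode(codec).hex())

# In UTF-8 the same character takes three bytes, none of which is a lone 0x80.
print("utf-8", euro.encode("utf-8").hex())

# A lone 0x80 is not valid UTF-8, which is what the server is complaining about.
try:
    b"\x80".decode("utf-8")
    lone_byte_is_valid = True
except UnicodeDecodeError:
    lone_byte_is_valid = False
print("lone 0x80 valid as UTF-8:", lone_byte_is_valid)
```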


FYI, you *can* get non-UTF-8 data from a UTF-8 database, if (and only
if) your client encoding is something different (either because you
explicitly set it so, or because of your client defaults).

Likewise, you can insert non-UTF-8 data (such as your mydb.sql) into a
UTF-8 database, provided you set your client encoding accordingly.
PostgreSQL clients handle encoding conversions, but there's no way to
reliably guess the encoding of a text file.
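
If you do know the encoding of the file, an alternative to changing the client encoding is to transcode the file to UTF-8 before loading it. The sketch below assumes the *whole* file is windows-1250, which is a guess and not something the error message proves; transcode and the file names are made up for illustration. If the dump contains its own client_encoding setting, that line would have to agree with the new encoding.

```python
def transcode(src_path, dst_path, source_encoding="cp1250"):
    """Rewrite src_path as UTF-8, assuming it is entirely in source_encoding.

    Decoding is strict, so a byte that is undefined in source_encoding
    raises UnicodeDecodeError instead of being silently altered.
    """
    # newline="" keeps the original line endings untouched
    with open(src_path, "r", encoding=source_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

For example, transcode("mydb.sql", "mydb.utf8.sql") would produce a copy that can be checked with iconv again before restoring. Do not run this on a file that is already partly UTF-8, because the multi-byte sequences would be converted a second time.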

OTOH, from a SQL_ASCII database you can get all sorts of data, even
mixed-encoding text (which you need to fix somehow). If your mydb.sql
contains data from a SQL_ASCII database, you simply know nothing about
the encoding.

I have seen SQL_ASCII databases containing data inserted from HTTP forms,
in both UTF-8 and windows-1250 encoding. Displaying, dumping, or restoring
that correctly is impossible; you need to fix it somehow before
processing it as text.
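
One possible way to fix such mixed data is a per-line heuristic: keep a line if it already decodes as UTF-8, otherwise treat it as windows-1250. This is a sketch and not a reliable method. It assumes each line uses a single encoding and that windows-1250 is the only other one in play; repair_line is a made-up name, and the result should be reviewed by eye.

```python
def repair_line(raw: bytes, fallback: str = "cp1250") -> str:
    """Decode one line as UTF-8 if possible, else as the fallback encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes undefined in the fallback become U+FFFD so they stay visible for review.
        return raw.decode(fallback, errors="replace")


def repair_lines(raw_lines, fallback="cp1250"):
    """Return the repaired text lines, re-encoded as UTF-8 bytes."""
    return [repair_line(raw, fallback).encode("utf-8") for raw in raw_lines]
```

A windows-1250 string that happens to be valid UTF-8 would slip through unchanged, so this only narrows the problem down; it cannot prove the result is right.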

.TM.
