| From: | Marco Colombo <pgsql(at)esiway(dot)net> |
|---|---|
| To: | Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> |
| Cc: | Andrew Sullivan <ajs(at)commandprompt(dot)com>, pgsql-general(at)postgresql(dot)org |
| Subject: | Re: Dumping/Restoring with constraints? |
| Date: | 2008-08-29 19:06:33 |
| Message-ID: | 48B848B9.4040805@esiway.net |
| Lists: | pgsql-admin pgsql-general |
Phoenix Kiula wrote:
> Thanks Andrew.
>
> On the server (the DB to be dumped) everything is "UTF8".
>
> On my home server (where I would like to mirror the DB), this is the output:
>
>
> =# \l
> List of databases
> Name | Owner | Encoding
> -----------+-----------------+-----------
> postgres | postgres | SQL_ASCII
> pkiula | pkiula_pkiula | UTF8
> template0 | postgres | SQL_ASCII
> template1 | postgres | SQL_ASCII
> (4 rows)
>
>
>
> This is a fresh install as you can see. The database into which I am
> importing ("pkiula") is in fact listed as UTF8! Is this not enough?
>
You said you're getting errors like this:
    ERROR: invalid byte sequence for encoding "UTF8": 0x80
Those 0x80 bytes are inside the mydb.sql file itself, so you may find it
easier to look for them there and identify the offending string(s). Try
(on the Linux machine):
    zcat mydb.sql.gz | iconv -f utf8 > /dev/null
It should tell you something like:
    illegal input sequence at position xxx
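
If iconv stops at the first bad byte and you want to see all of them, a small script can do the same scan line by line. This is only a minimal sketch: the names find_invalid_utf8 and scan_gzip_dump are made up for illustration, and it assumes the dump is a gzip-compressed text file like the mydb.sql.gz above.

```python
import gzip


def find_invalid_utf8(lines):
    """Yield (line_number, byte_offset_in_line, byte_value) for every
    byte at which strict UTF-8 decoding fails."""
    for lineno, raw in enumerate(lines, start=1):
        pos = 0
        while pos < len(raw):
            try:
                raw[pos:].decode("utf-8")
                break  # the rest of the line is valid UTF-8
            except UnicodeDecodeError as exc:
                offset = pos + exc.start
                yield lineno, offset, raw[offset]
                pos = offset + 1  # resume just after the bad byte


def scan_gzip_dump(path):
    """Print every offending byte found in a gzip-compressed dump."""
    with gzip.open(path, "rb") as dump:  # iterating yields bytes lines
        for lineno, offset, value in find_invalid_utf8(dump):
            print(f"line {lineno}, byte {offset}: 0x{value:02x}")
```

Calling scan_gzip_dump("mydb.sql.gz") would then list line numbers you can inspect in the dump itself.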
BTW, 0x80 usually comes from a Windows encoding, such as windows-1250,
where it stands for the euro sign:
    echo -n "€" | iconv -t windows-1250 | hexdump -C
    00000000 80 |.|
    00000001
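
The same check can be made without iconv. The sketch below (helper names are illustrative only) shows that the euro sign is the single byte 0x80 in the Windows code pages but three bytes in UTF-8, and that a lone 0x80 is not valid UTF-8, which is what the server is complaining about.

```python
EURO = "\u20ac"  # the euro sign


def euro_bytes():
    """Encode the euro sign in two Windows code pages and in UTF-8."""
    return {name: EURO.encode(name) for name in ("cp1250", "cp1252", "utf-8")}


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    for name, encoded in euro_bytes().items():
        print(name, encoded.hex())
    print("0x80 alone is valid UTF-8:", is_valid_utf8(b"\x80"))
```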
FYI, you *can* get non-UTF-8 data out of a UTF-8 database, if (and only
if) your client encoding is something different (either because you
set it explicitly, or because of your client defaults).
Likewise, you can insert non-UTF-8 data (such as your mydb.sql) into a
UTF-8 database, provided you set your client encoding accordingly.
PostgreSQL converts between the client encoding and the database
encoding, but there's no way to guess (reliably) the encoding of a text
file.
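
To illustrate what that conversion amounts to, here is a rough sketch in Python rather than in PostgreSQL itself. In PostgreSQL the client encoding is normally declared with SET client_encoding or the PGCLIENTENCODING environment variable; the function to_utf8 below is a made-up stand-in, and the choice of cp1250 as the source encoding is an assumption about the file, not something the file can tell you.

```python
def to_utf8(data: bytes, declared_encoding: str = "cp1250") -> bytes:
    """Re-encode data as UTF-8, trusting the declared source encoding."""
    return data.decode(declared_encoding).encode("utf-8")


if __name__ == "__main__":
    raw = b"5 \x80"
    # Correct declaration: 0x80 becomes the three-byte UTF-8 euro sign.
    print(to_utf8(raw, "cp1250"))
    # Wrong declaration: no error is raised, but 0x80 turns into a
    # control character instead of the euro sign.
    print(to_utf8(raw, "latin-1"))
```

The second call is the reason guessing is dangerous: a wrong declaration often converts without any error and quietly stores the wrong characters.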
OTOH, from a SQL_ASCII database you can get all sorts of data, even
mixed-encoding text (which you need to fix somehow). If your mydb.sql
contains data from a SQL_ASCII database, you simply know nothing about
the encoding.
I have seen SQL_ASCII databases containing data inserted from HTTP forms,
in both UTF-8 and windows-1250 encodings. Displaying, dumping, or
restoring that correctly is impossible; you need to fix it somehow
before processing it as text.
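
One possible way to "fix it somehow" is a per-line heuristic: accept a line if it is valid UTF-8, otherwise assume windows-1250. This is only a sketch of that heuristic, not a reliable repair. The name repair_line and the cp1250 fallback are assumptions, a line that mixes both encodings will still come out wrong, and the result needs to be reviewed by eye.

```python
def repair_line(raw: bytes, fallback: str = "cp1250") -> str:
    """Decode one line as UTF-8 if possible, else with the fallback.

    Bytes undefined in the fallback code page become U+FFFD so that
    suspicious lines are easy to search for afterwards.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback, errors="replace")


if __name__ == "__main__":
    mixed = ["caf\u00e9 \u20ac".encode("utf-8"), b"5 \x80"]
    for line in mixed:
        print(repair_line(line))
```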
.TM.