| From: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
|---|---|
| To: | Jean-Christophe BOGGIO <postgresql(at)thefreecat(dot)org>, "pgsql-admin(at)lists(dot)postgresql(dot)org" <pgsql-admin(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Importing a Windows database (in en_GB.CP1252) to linux |
| Date: | 2025-12-02 11:26:54 |
| Message-ID: | b82ebbc1fa3eb5cb175ff6669260fa95d3bbe6b7.camel@cybertec.at |
| Lists: | pgsql-admin |
On Tue, 2025-12-02 at 10:39 +0100, Jean-Christophe BOGGIO wrote:
> > That looks like pg_restore sets a wrong client_encoding, which is weird.
> >
> > What do you get for
> >
> > pg_restore -p 5433 -t csakafl -s -f - imlocal20251127.backup | grep client_encoding
>
> SET client_encoding = 'UTF8';
>
>
> > How did that happen? How exactly did you take that dump?
>
> This backup is a transfer from an iSeries DB2 database. It has been a nightmare to
> get this working (and took around 10 days to finalize). We set up a FDW Server using
> odbc_fdw, recreated all the tables (around 2k) and INSERTed the DB2 data to the PG tables.
>
> Then we used PgAdmin that came with PostgreSQL 17 on the Windows machine.
>
> I double-checked with the client: the database is in en_GB.CP1252.
The DB2 database or the PostgreSQL database?

It must be the DB2 database, because otherwise the dump would contain

    SET client_encoding = 'WIN1252';

That is, unless you created the dump with

    pg_dump --encoding=UTF8

But then the dump couldn't contain any non-UTF-8 characters.

Since you used pgAdmin, you probably don't know the pg_dump command line that was used.

My best guess is that odbc_fdw has a bug: it does not check whether the strings are
properly encoded, so you somehow ended up with corrupted data in your PostgreSQL database.
But I am not sure.
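
One way to test that guess, as a rough sketch: convert the dump to a plain SQL script with pg_restore -f and check whether the result is valid UTF-8. The helper name is_valid_utf8 is made up for illustration, and the check assumes an iconv that fails on invalid input (the glibc one does).

```shell
#!/bin/sh
# Sketch only: is_valid_utf8 is a hypothetical helper name.
# iconv converts from UTF-8 to UTF-8 and exits non-zero as soon as it
# meets a byte sequence that is not valid UTF-8 (assumed iconv behaviour).
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example use (not run here):
#   pg_restore -f script.sql imlocal20251127.backup
#   is_valid_utf8 script.sql || echo "script.sql contains non-UTF-8 bytes"
```

If the check fails even though the dump says client_encoding = 'UTF8', that would be consistent with badly encoded strings having been stored in the database.
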
> > Did you do anything (like an encoding conversion) with the dump after you took it?
>
> No, the backup is in custom format so I can't touch it (or at least I don't know how I could).
>
> Where can I go from here?
You can try the following:

- convert the custom format dump into an SQL script with

      pg_restore -f script.sql imlocal20251127.backup

- edit script.sql and change the client_encoding line to read

      SET client_encoding = 'WIN1252';

- restore that script with "psql":

      psql -f script.sql -d newdb

That should work if *all* the strings are in WINDOWS-1252 encoding.
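
The three steps above could be scripted roughly like this. It is only a sketch: the function names set_client_encoding and restore_as_win1252 are made up for illustration, and it assumes the script contains the line exactly in the form SET client_encoding = '...'; at the start of a line.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical. The file and
# database names are the ones used in the steps above.

# Rewrite the "SET client_encoding" line of an SQL script in place.
# $1 = script file, $2 = encoding name (for example WIN1252)
set_client_encoding() {
    sed "s/^SET client_encoding = '[^']*';/SET client_encoding = '$2';/" "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}

# The three steps, wrapped in a function so nothing runs until it is called.
restore_as_win1252() {
    pg_restore -f script.sql imlocal20251127.backup &&
        set_client_encoding script.sql WIN1252 &&
        psql -f script.sql -d newdb
}
```

Doing the edit with sed rather than by hand avoids opening a large script in an editor that might re-encode it when saving.
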
Yours,
Laurenz Albe