Re: pg_dump's "--exclude-table" and "--exclude-table-data" options are ignored and/or cause the dump to fail entirely unless both the schema and table name use 1950s-era identifiers.

From: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>
To: tutiluren(at)tutanota(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_dump's "--exclude-table" and "--exclude-table-data" options are ignored and/or cause the dump to fail entirely unless both the schema and table name use 1950s-era identifiers.
Date: 2020-07-22 07:36:35
Message-ID: CAC+AXB0-vx6wzfYg93f=YgZNUNgow1n+9ertuoV5pUVzjHJtOQ@mail.gmail.com
Lists: pgsql-bugs

Please keep the list in CC for future reference, and so the subscribers can
contribute.

On Tue, Jul 21, 2020 at 7:32 PM <tutiluren(at)tutanota(dot)com> wrote:

> Jul 21, 2020, 11:12 AM by juanjo(dot)santamaria(at)gmail(dot)com:
>
>
> On Tue, Jul 21, 2020 at 8:30 AM <tutiluren(at)tutanota(dot)com> wrote:
>
>
> Try it out yourself, by creating a test schema called "Personal stöff" and
> a table in it called "My däiary". Then create a text column and make it PK
> and then add the text "This is supposed to be ignored.". Then try to run
> this command:
>
> pg_dump --format plain --verbose --file "C:\test.txt"
> --exclude-table-data="Personal stöff"."My däiary" --host="localhost"
> --port="5432" --username="postgres" --dbname="TestDB"
>
> Just to avoid wasting time, when the command doesn't work at all, it
> outputs things like this:
>
> pg_dump: [archiver (db)] query failed: ERROR: invalid byte sequence for
> encoding "UTF8": 0xf6 0x72 0x66 0x72
> pg_dump: [archiver (db)] query was: SELECT c.oid
> FROM pg_catalog.pg_class c
> LEFT JOIN pg_catalog.pg_namespace n
> ON n.oid OPERATOR(pg_catalog.=) c.relnamespace
> WHERE c.relkind OPERATOR(pg_catalog.=) ANY
> (array['r', 'S', 'v', 'm', 'f', 'p'])
> AND c.relname OPERATOR(pg_catalog.~) '^(table name)$'
> AND n.nspname OPERATOR(pg_catalog.~) '^(schema name)$'
>
>
> The source of the problem is coming from how CMD works with UTF8 (or does
> not). The error you are getting is using code page Windows-1252 [1], 0xf6
> is ö, but pg_dump is expecting UTF8 and crashes.
>
> You can try to configure UTF8 as your CMD encoding, see [2]. Please tell
> us if this works for you.
>
> [1] https://en.wikipedia.org/wiki/Windows-125
> [2]
> https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window
>
>
> I actually have very carefully made sure (from past problems) that the
> cmd.exe uses UTF-8 and the same goes for my databases and the connection
> and everything. It truly doesn't seem to have anything to do with this.
> Isn't it obvious from the output that pg_dump is lowercasing/changing the
> input?
>

The problem with that query is not that case folding makes it return no
rows. It fails with an error because the server expects UTF8 input but
receives something else: "pg_dump: [archiver (db)] query failed: ERROR:
invalid byte sequence for encoding "UTF8": 0xf6 0x72 0x66 0x72"
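
To make the encoding point concrete, here is a minimal Python sketch. It is
only an illustration of the byte values involved, not of what pg_dump or the
server do internally; the helper name is_valid_utf8 is made up for this
example. It checks that the bytes from the error message are not valid UTF8,
and that ö is one byte in Windows-1252 but two bytes in UTF8.

```python
# Bytes copied from the error message: 0xf6 0x72 0x66 0x72.
reported = bytes([0xF6, 0x72, 0x66, 0x72])


def is_valid_utf8(raw: bytes) -> bool:
    """Hypothetical helper: True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# U+00F6 is the letter o with diaeresis, written as an escape so the
# example does not depend on the encoding of this file.
letter = "\u00f6"

cp1252_bytes = letter.encode("cp1252")  # single byte 0xf6
utf8_bytes = letter.encode("utf-8")     # two bytes 0xc3 0xb6

print("cp1252:", cp1252_bytes.hex())
print("utf-8: ", utf8_bytes.hex())
print("reported bytes valid UTF-8?", is_valid_utf8(reported))
print("first reported byte matches cp1252:", reported[:1] == cp1252_bytes)
```

If the pattern had reached the server as UTF8, the error would not show a
lone 0xf6 byte; it would have been sent as 0xc3 0xb6.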

I can reproduce this on a Windows 10 machine with the English_United
States.1252 locale, and the setting "Beta: Use Unicode UTF-8 for worldwide
language support", as mentioned above, worked in that case.

Regards,

Juan José Santamaría Flecha
