BUG #7611: \copy (and COPY?) incorrectly parses nul character for windows-1252

From: sams(dot)james+postgres(at)gmail(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #7611: \copy (and COPY?) incorrectly parses nul character for windows-1252
Date: 2012-10-18 06:29:49
Message-ID: E1TOjc5-0001lZ-5m@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 7611
Logged by: James
Email address: sams(dot)james+postgres(at)gmail(dot)com
PostgreSQL version: 9.1.6
Operating system: Ubuntu Linux 12.04
Description:

I have a file with several nul characters in it. The file itself appears to
be encoded as windows-1252, though I am not 100% certain of that. I do know
that other software (e.g. Python) can decode the data as windows-1252
without issue. Postgres's \copy, however, chokes on the nul byte:

ERROR: unterminated CSV quoted field
CONTEXT: COPY promo_nonactive_load_fake, line 239900

Note that the error is wrong: the field is quoted, but Postgres seems to jump
forward in the file when it encounters the nul bytes.

Further, the line number is wrong. 239900 is the length of the file (in
lines), not the line on which the error occurs, which is several hundred
lines earlier.
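
Since the reported line number cannot be trusted here, one way to find the
real offending lines is to scan the file for nul bytes directly. The
following is only a rough sketch: find_nul_lines is a hypothetical helper
name, and it counts physical lines split on "\n", which may differ from
COPY's own line counting if quoted fields contain embedded newlines.

```python
def find_nul_lines(path):
    """Return (line_number, nul_count) for each physical line containing NUL bytes.

    The file is read in binary mode so that no decoding (windows-1252 or
    otherwise) is attempted; lines are split on b"\n" only.
    """
    hits = []
    with open(path, "rb") as f:
        for lineno, line in enumerate(f, start=1):
            count = line.count(b"\x00")
            if count:
                hits.append((lineno, count))
    return hits
```

Calling find_nul_lines on the input file should list the lines to inspect,
independent of what the COPY error context reports.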

Deleting the nul byte characters allowed copy to proceed normally. I
experienced similar issues with psycopg2 and copy_expert using COPY FROM
STDIN and this file.
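
For anyone needing the same workaround, deleting the nul bytes can be done
with a short script before loading. This is a minimal sketch, not a fix for
the underlying behaviour: strip_nul_bytes and its parameters are
hypothetical names, and it assumes a single-byte encoding such as
windows-1252, where the byte 0x00 can only ever mean the nul character.

```python
def strip_nul_bytes(src_path, dst_path, chunk_size=1 << 20):
    """Copy src_path to dst_path with every NUL byte removed.

    Works on raw bytes in fixed-size chunks, so the whole file never has to
    fit in memory. Returns the number of NUL bytes that were dropped.
    Assumption: the data uses a single-byte encoding (e.g. windows-1252),
    so removing 0x00 cannot corrupt a multi-byte character.
    """
    removed = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            removed += chunk.count(b"\x00")
            dst.write(chunk.replace(b"\x00", b""))
    return removed
```

The cleaned output file can then be given to \copy, or opened and passed to
psycopg2's copy_expert, in place of the original.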
