Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
Date: 2026-02-06 03:40:02
Message-ID: CACJufxHFwQMw1As+QFk+fA7S8ZxRG2wOvHcvmsWuj2XJ+W6d_A@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 28, 2026 at 2:37 AM Matheus Alcantara
<matheusssilv97(at)gmail(dot)com> wrote:
>
> Yeah, after some more thinking it seems ok to use both options
> together. I just found a bit strange when using integer columns.
> Consider this example:
>
> cat data.csv
> 1,11
> 2,22
> 3,
> 4,44
>
> postgres=# create table t(a int not null, b int);
> CREATE TABLE
>
> postgres=# copy t from
> '/Users/matheus/dev/pgdev/copy-on-error-set-null/data.csv' with
> (FORCE_NOT_NULL(b), format csv, delimiter ',', ON_ERROR set_null);
> NOTICE: 1 row was replaced with null due to data type incompatibility
> COPY 4
>
> postgres=# select * from t where b is null;
> a | b
> ---+---
> 3 |
> (1 row)
>
> We are requiring a not null value on column b but we are still
> generating rows with null values on b.
>
> The reasoning on this is that the row 3 would generate a "invalid
> input syntax for type integer" error and the ON_ERROR set_null fix
> this by inserting a NULL value. It make sense I think but I'm
> wondering if it could cause any confusion?
>

After careful reading the FORCE_NOT_NULL, FORCE_NULL option.
It's about dealing with empty value and NULL strings.

copy t from stdin with(FORCE_NOT_NULL (b), format csv, delimiter ',');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,
>> \.
ERROR: invalid input syntax for type integer: ""
CONTEXT: COPY t, line 1, column b: ""

in this case, FORCE_NOT_NULL will convert the empty value to null
string (empty double quote)

another FORCE_NULL.
copy t from stdin with(FORCE_NULL (b), format csv, delimiter ',');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,""
>> \.
COPY 1
src9=# table t;
a | b
---+---
1 |
(1 row)

In this case, FORCE_NULL will convert null string (empty double quote) to NULL.

ON_ERROR explanation, the first sentence:
"""
Specifies how to behave when encountering an error converting a
column's input value into its data type.
"""
FORCE_NULL, FORCE_NOT_NULL is a special handling of input value, ON_ERROR is
about converting the input value to data type, so it's before ON_ERROR.

Overall the current doc is fine, IMHO.

The attached patch has addressed your other points.

--
jian
https://www.enterprisedb.com/

Attachment Content-Type Size
v22-0001-COPY-on_error-set_null.patch text/x-patch 22.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo Nagata 2026-02-06 03:54:14 Re: Show expression of virtual columns in error messages
Previous Message jian he 2026-02-06 03:26:04 Re: Emitting JSON to file using COPY TO