| From: | "Matheus Alcantara" <matheusssilv97(at)gmail(dot)com> |
|---|---|
| To: | "jian he" <jian(dot)universality(at)gmail(dot)com>, "Matheus Alcantara" <matheusssilv97(at)gmail(dot)com> |
| Cc: | "torikoshia" <torikoshia(at)oss(dot)nttdata(dot)com>, "Masahiko Sawada" <sawada(dot)mshk(at)gmail(dot)com>, "vignesh C" <vignesh21(at)gmail(dot)com>, "Jim Jones" <jim(dot)jones(at)uni-muenster(dot)de>, "Kirill Reshke" <reshkekirill(at)gmail(dot)com>, "Fujii Masao" <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "Yugo NAGATA" <nagata(at)sraoss(dot)co(dot)jp>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row |
| Date: | 2026-02-06 12:58:43 |
| Message-ID: | DG7WARDX1TY0.2VX3Q7QHSCXE9@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri Feb 6, 2026 at 12:40 AM -03, jian he wrote:
>> Yeah, after some more thinking it seems ok to use both options
>> together. I just found a bit strange when using integer columns.
>> Consider this example:
>>
>> cat data.csv
>> 1,11
>> 2,22
>> 3,
>> 4,44
>>
>> postgres=# create table t(a int not null, b int);
>> CREATE TABLE
>>
>> postgres=# copy t from
>> '/Users/matheus/dev/pgdev/copy-on-error-set-null/data.csv' with
>> (FORCE_NOT_NULL(b), format csv, delimiter ',', ON_ERROR set_null);
>> NOTICE: 1 row was replaced with null due to data type incompatibility
>> COPY 4
>>
>> postgres=# select * from t where b is null;
>> a | b
>> ---+---
>> 3 |
>> (1 row)
>>
>> We are requiring a not null value on column b but we are still
>> generating rows with null values on b.
>>
>> The reasoning on this is that the row 3 would generate a "invalid
>> input syntax for type integer" error and the ON_ERROR set_null fix
>> this by inserting a NULL value. It make sense I think but I'm
>> wondering if it could cause any confusion?
>>
>
> After careful reading the FORCE_NOT_NULL, FORCE_NULL option.
> It's about dealing with empty value and NULL strings.
>
> copy t from stdin with(FORCE_NOT_NULL (b), format csv, delimiter ',');
> Enter data to be copied followed by a newline.
> End with a backslash and a period on a line by itself, or an EOF signal.
>>> 1,
>>> \.
> ERROR: invalid input syntax for type integer: ""
> CONTEXT: COPY t, line 1, column b: ""
>
> in this case, FORCE_NOT_NULL will convert the empty value to null
> string (empty double quote)
>
> another FORCE_NULL.
> copy t from stdin with(FORCE_NULL (b), format csv, delimiter ',');
> Enter data to be copied followed by a newline.
> End with a backslash and a period on a line by itself, or an EOF signal.
>>> 1,""
>>> \.
> COPY 1
> src9=# table t;
> a | b
> ---+---
> 1 |
> (1 row)
>
> In this case, FORCE_NULL will convert null string (empty double quote) to NULL.
>
> ON_ERROR explanation, the first sentence:
> """
> Specifies how to behave when encountering an error converting a
> column's input value into its data type.
> """
> FORCE_NULL, FORCE_NOT_NULL is a special handling of input value, ON_ERROR is
> about converting the input value to data type, so it's before ON_ERROR.
>
> Overall the current doc is fine, IMHO.
>
Yeah, after also reading more carefully it's sounds correct to me too.
> The attached patch has addressed your other points.
>
Thanks, overall the patch looks good to me. I'm attaching a diff with
just some small tweaks on documentation and error messages. Please see
and check if it's make sense.
I'm wondering if we should have an else if block on
CopyFromTextLikeOneRow() when cstate->cur_attval is NULL to handle
COPY_ON_ERROR_SET_NULL when log_verbosity is set to
COPY_LOG_VERBOSITY_VERBOSE
if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
ereport(NOTICE,
errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
cstate->cur_lineno,
cstate->cur_attname));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
+ cstate->cur_lineno,
+ cstate->cur_attname));
--
Matheus Alcantara
EDB: https://www.enterprisedb.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Matheus Alcantara | 2026-02-06 13:00:36 | Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row |
| Previous Message | Pavlo Golub | 2026-02-06 12:57:24 | Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements |