Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, vignesh21(at)gmail(dot)com, lena(dot)ribackina(at)yandex(dot)ru, dam(dot)bel07(at)gmail(dot)com, zhihuifan1213(at)163(dot)com, daniel(at)yesql(dot)se, pgsql-hackers(at)postgresql(dot)org, andres(at)anarazel(dot)de, anisimow(dot)d(at)gmail(dot)com, HukuToc(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date: 2024-01-19 12:37:38
Message-ID: 84996d38ce68a8b9c0aa751f6fdcfa9a@oss.nttdata.com
Lists: pgsql-hackers

On 2024-01-18 23:59, jian he wrote:
> Hi.
> patch refactored based on "on_error {stop|ignore}"
> doc changes:
>
> --- a/doc/src/sgml/ref/copy.sgml
> +++ b/doc/src/sgml/ref/copy.sgml
> @@ -43,7 +43,7 @@ COPY { <replaceable
> class="parameter">table_name</replaceable> [ ( <replaceable
> FORCE_QUOTE { ( <replaceable
> class="parameter">column_name</replaceable> [, ...] ) | * }
> FORCE_NOT_NULL { ( <replaceable
> class="parameter">column_name</replaceable> [, ...] ) | * }
> FORCE_NULL { ( <replaceable
> class="parameter">column_name</replaceable> [, ...] ) | * }
> - SAVE_ERROR_TO '<replaceable
> class="parameter">location</replaceable>'
> + ON_ERROR '<replaceable
> class="parameter">error_action</replaceable>'
> ENCODING '<replaceable
> class="parameter">encoding_name</replaceable>'
> </synopsis>
> </refsynopsisdiv>
> @@ -375,20 +375,20 @@ COPY { <replaceable
> class="parameter">table_name</replaceable> [ ( <replaceable
> </varlistentry>
>
> <varlistentry>
> - <term><literal>SAVE_ERROR_TO</literal></term>
> + <term><literal>ON_ERROR</literal></term>
> <listitem>
> <para>
> - Specifies to save error information to <replaceable
> class="parameter">
> - location</replaceable> when there is malformed data in the
> input.
> - Currently, only <literal>error</literal> (default) and
> <literal>none</literal>
> + Specifies which <replaceable class="parameter">
> + error_action</replaceable> to perform when there is malformed
> data in the input.
> + Currently, only <literal>stop</literal> (default) and
> <literal>ignore</literal>
> values are supported.
> - If the <literal>error</literal> value is specified,
> + If the <literal>stop</literal> value is specified,
> <command>COPY</command> stops operation at the first error.
> - If the <literal>none</literal> value is specified,
> + If the <literal>ignore</literal> value is specified,
> <command>COPY</command> skips malformed data and continues
> copying data.
> The option is allowed only in <command>COPY FROM</command>.
> - The <literal>none</literal> value is allowed only when
> - not using <literal>binary</literal> format.
> + Only <literal>stop</literal> value is allowed only when
> + using <literal>binary</literal> format.
> </para>

Thanks for making the patch!

Here are some comments:

> - The <literal>none</literal> value is allowed only when
> - not using <literal>binary</literal> format.
> + Only <literal>stop</literal> value is allowed only when
> + using <literal>binary</literal> format.

The second 'only' may be unnecessary.

> - /* If SAVE_ERROR_TO is specified, skip rows
> with soft errors */
> + /* If ON_ERROR is specified with IGNORE, skip
> rows with soft errors */

This is correct for now, but considering future work that may add other
options such as "file 'copy.log'" and "table 'copy_log'", it may be
better not to limit the comment to the 'IGNORE' case.
How about something like this?

If ON_ERROR is specified and the value is not STOP, skip rows with
soft errors
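
To illustrate why the broader wording would stay accurate, here is a
minimal sketch of the check written as "not STOP" rather than "is
IGNORE". The enum and function names are hypothetical and are not taken
from the patch; the point is only that any value added later is covered
without touching the comment.

```c
#include <stdbool.h>

/* Hypothetical enum; the names are illustrative, not from the patch. */
typedef enum OnErrorAction
{
	ON_ERROR_STOP = 0,		/* default: stop COPY at the first error */
	ON_ERROR_IGNORE			/* skip malformed rows and continue */
	/* values added later (e.g. saving errors to a file or table) go here */
} OnErrorAction;

/*
 * If ON_ERROR is specified and the value is not STOP, skip rows with
 * soft errors.
 */
static bool
should_skip_soft_error_rows(OnErrorAction action)
{
	return action != ON_ERROR_STOP;
}
```

With this shape, adding a new action only requires a new enum value; the
condition and its comment remain correct as long as the new action also
skips rows with soft errors.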

> -COPY x from stdin (format BINARY, save_error_to none);
> -COPY x to stdin (save_error_to none);
> +COPY x from stdin (format BINARY, ON_ERROR ignore);
> +COPY x from stdin (ON_ERROR unsupported);
> COPY x to stdin (format TEXT, force_quote(a));
> COPY x from stdin (format CSV, force_quote(a));

In the existing tests in copy2.sql, COPY option names are written in
lower case (e.g. 'format') and option values are written in upper case
(e.g. 'BINARY').
It would be more consistent to follow the same convention here.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation
