Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Damir Belyalov <dam(dot)bel07(at)gmail(dot)com>, zhihuifan1213(at)163(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, anisimow(dot)d(at)gmail(dot)com, HukuToc(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date: 2024-01-15 06:43:46
Message-ID: CAD21AoCO_gkhJ4nr6v1_tMDAqGOrRk1tPJgdowvGDNveS11eEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 15, 2024 at 8:21 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>
> On Sun, Jan 14, 2024 at 10:35 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > Thank you for updating the patch. Here are two comments:
> >
> > ---
> > + if (cstate->opts.save_error_to != COPY_SAVE_ERROR_TO_UNSPECIFIED &&
> > + cstate->num_errors > 0)
> > + ereport(WARNING,
> > + errmsg("%zd rows were skipped due to data type incompatibility",
> > + cstate->num_errors));
> > +
> > /* Done, clean up */
> > error_context_stack = errcallback.previous;
> >
> > If a malformed input is not the last data, the context message seems odd:
> >
> > postgres(1:1769258)=# create table test (a int);
> > CREATE TABLE
> > postgres(1:1769258)=# copy test from stdin (save_error_to none);
> > Enter data to be copied followed by a newline.
> > End with a backslash and a period on a line by itself, or an EOF signal.
> > >> a
> > >> 1
> > >>
> > 2024-01-15 05:05:53.980 JST [1769258] WARNING: 1 rows were skipped
> > due to data type incompatibility
> > 2024-01-15 05:05:53.980 JST [1769258] CONTEXT: COPY test, line 3: ""
> > COPY 1
> >
> > I think it's better to report the WARNING after resetting the
> > error_context_stack. Or is a WARNING really appropriate here? The
> > v15-0001-Make-COPY-FROM-more-error-tolerant.patch[1] uses NOTICE but
> > the v1-0001-Add-new-COPY-option-SAVE_ERROR_TO.patch[2] changes it to
> > WARNING without explanation.
>
> Thank you for noticing this. I think NOTICE is more appropriate here.
> There is nothing to "worry" about: the user asked to ignore the errors
> and we did. And yes, it doesn't make sense to use the last line as
> the context. Fixed.
>
> > ---
> > +-- test missing data: should fail
> > +COPY check_ign_err FROM STDIN WITH (save_error_to none);
> > +1 {1}
> > +\.
> >
> > We might want to cover the extra data cases too.
>
> Agreed, the relevant test is added.

Thank you for updating the patch. I have one minor point:

+ if (cstate->opts.save_error_to != COPY_SAVE_ERROR_TO_UNSPECIFIED &&
+ cstate->num_errors > 0)
+ ereport(NOTICE,
+ errmsg("%zd rows were skipped due to
data type incompatibility",
+ cstate->num_errors));
+

We can use errmsg_plural() instead.

I have a question about the option values; do you think we need to
have another value of SAVE_ERROR_TO option to explicitly specify the
current default behavior, i.e. not accept any error? With the v4
patch, the user needs to omit SAVE_ERROR_TO option to accept errors
during COPY FROM. If we change the default behavior in the future,
many users will be affected and probably end up changing their
applications to keep the current default behavior.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2024-01-15 06:50:55 Re: Oom on temp (un-analyzed table caused by JIT) V16.1
Previous Message Richard Guo 2024-01-15 06:42:16 Re: POC: GROUP BY optimization