Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Damir Belyalov <dam(dot)bel07(at)gmail(dot)com>, zhihuifan1213(at)163(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, anisimow(dot)d(at)gmail(dot)com, HukuToc(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date: 2024-01-14 20:34:25
Message-ID: CAD21AoB5V8RAzN589dTG_M3V7LedrkNyYvwucZ1pbdaykRmLZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jan 14, 2024 at 10:30 AM Alexander Korotkov
<aekorotkov(at)gmail(dot)com> wrote:
>
> Hi!
>
> I think this is a demanding and long-waited feature. The thread is
> pretty long, but mostly it was disputes about how to save the errors.
> The present patch includes basic infrastructure and ability to ignore
> errors, thus it's pretty simple.
>
> On Sat, Jan 13, 2024 at 4:20 PM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> > On Fri, Jan 12, 2024 at 10:59 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> > >
> > >
> > > Thanks for reviewing!
> > >
> > > Updated the patch merging your suggestions except below points:
> > >
> > > > + cstate->num_errors = 0;
> > >
> > > Since cstate is already initialized in below lines, this may be
> > > redundant.
> > >
> > > | /* Allocate workspace and zero all fields */
> > > | cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
> > >
> > >
> > > > + Assert(!cstate->escontext->details_wanted);
> > >
> > > I'm not sure this is necessary, considering we're going to add other
> > > options like 'table' and 'log', which need details_wanted soon.
> > >
> > >
> > > --
> > > Regards,
> >
> > make save_error_to option cannot be used with COPY TO.
> > add redundant test, save_error_to with COPY TO test.
>
> I've incorporated these changes. Also, I've changed
> CopyFormatOptions.save_error_to to enum and made some edits in
> comments and the commit message. I'm going to push this if there are
> no objections.

Thank you for updating the patch. Here are two comments:

---
+ if (cstate->opts.save_error_to != COPY_SAVE_ERROR_TO_UNSPECIFIED &&
+ cstate->num_errors > 0)
+ ereport(WARNING,
+ errmsg("%zd rows were skipped due to data type incompatibility",
+ cstate->num_errors));
+
/* Done, clean up */
error_context_stack = errcallback.previous;

If a malformed input is not the last data, the context message seems odd:

postgres(1:1769258)=# create table test (a int);
CREATE TABLE
postgres(1:1769258)=# copy test from stdin (save_error_to none);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> a
>> 1
>>
2024-01-15 05:05:53.980 JST [1769258] WARNING: 1 rows were skipped
due to data type incompatibility
2024-01-15 05:05:53.980 JST [1769258] CONTEXT: COPY test, line 3: ""
COPY 1

I think it's better to report the WARNING after resetting the
error_context_stack. Or is a WARNING really appropriate here? The
v15-0001-Make-COPY-FROM-more-error-tolerant.patch[1] uses NOTICE but
the v1-0001-Add-new-COPY-option-SAVE_ERROR_TO.patch[2] changes it to
WARNING without explanation.

---
+-- test missing data: should fail
+COPY check_ign_err FROM STDIN WITH (save_error_to none);
+1 {1}
+\.

We might want to cover the extra data cases too.

Regards,

[1] https://www.postgresql.org/message-id/CACJufxEkkqnozdnvNMGxVAA94KZaCPkYw_Cx4JKG9ueNaZma_A%40mail.gmail.com
[2] https://www.postgresql.org/message-id/3d0b349ddbd4ae5f605f77b491697158%40oss.nttdata.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christoph Berg 2024-01-14 21:00:42 Re: plperl and perl 5.38
Previous Message Noah Misch 2024-01-14 20:14:11 Re: Recovering from detoast-related catcache invalidations