Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

From: Andres Freund <andres(at)anarazel(dot)de>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Damir Belyalov <dam(dot)bel07(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Danil Anisimow <anisimow(dot)d(at)gmail(dot)com>, Nikita Malakhov <HukuToc(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date: 2023-03-22 17:50:00
Message-ID: 20230322175000.qbdctk7bnmifh5an@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Tom, see below - I wonder if should provide one more piece of infrastructure
around the saved error stuff...

Have you measured whether this has negative performance effects when *NOT*
using the new option?

As-is this does not work with FORMAT BINARY - and converting the binary input
functions to support soft errors won't happen for 16. So I think you need to
raise an error if BINARY and IGNORE_DATATYPE_ERRORS are specified.

On 2023-03-22 22:34:20 +0900, torikoshia wrote:
> @@ -985,9 +986,28 @@ CopyFrom(CopyFromState cstate)
>
> ExecClearTuple(myslot);
>
> + if (cstate->opts.ignore_datatype_errors)
> + {
> + escontext.details_wanted = true;
> + cstate->escontext = escontext;
> + }

I think it might be worth pulling this out of the loop. That does mean you'd
have to reset escontext.error_occurred after an error, but that doesn't seem
too bad, you need to do other cleanup anyway.

> @@ -956,10 +957,20 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
> values[m] = ExecEvalExpr(defexprs[m], econtext, &nulls[m]);
> }
> else
> - values[m] = InputFunctionCall(&in_functions[m],
> - string,
> - typioparams[m],
> - att->atttypmod);
> + /* If IGNORE_DATATYPE_ERRORS is enabled skip rows with datatype errors */
> + if (!InputFunctionCallSafe(&in_functions[m],
> + string,
> + typioparams[m],
> + att->atttypmod,
> + (Node *) &cstate->escontext,
> + &values[m]))
> + {
> + cstate->ignored_errors++;
> +
> + ereport(WARNING,
> + errmsg("%s", cstate->escontext.error_data->message));

That isn't right - you loose all the details of the message. As is you'd also
leak the error context.

I think the best bet for now is to do something like
/* adjust elevel so we don't jump out */
cstate->escontext.error_data->elevel = WARNING;
/* despite the name, this won't raise an error if elevel < ERROR */
ThrowErrorData(cstate->escontext.error_data);

I wonder if we ought to provide a wrapper for this? It could e.g. know to
mention the original elevel and such?

I don't think NextCopyFrom() is the right place to emit this warning - it
e.g. is also called from file_fdw.c, which might want to do something else
with the error. From a layering POV it seems cleaner to do this in
CopyFrom(). You already have a check for escontext.error_occurred there
anyway.

> @@ -3378,6 +3378,10 @@ copy_opt_item:
> {
> $$ = makeDefElem("freeze", (Node *) makeBoolean(true), @1);
> }
> + | IGNORE_DATATYPE_ERRORS
> + {
> + $$ = makeDefElem("ignore_datatype_errors", (Node *)makeBoolean(true), @1);
> + }
> | DELIMITER opt_as Sconst
> {
> $$ = makeDefElem("delimiter", (Node *) makeString($3), @1);

I think we shouldn't add a new keyword for this, but only support this via
/* new COPY option syntax */
copy_generic_opt_list:
copy_generic_opt_elem

Further increasing the size of the grammar with random keywords when we have
more generic ways to represent them seems unnecessary.

> +-- tests for IGNORE_DATATYPE_ERRORS option
> +CREATE TABLE check_ign_err (n int, m int[], k int);
> +COPY check_ign_err FROM STDIN WITH IGNORE_DATATYPE_ERRORS;
> +1 {1} 1
> +a {2} 2
> +3 {3} 3333333333
> +4 {a, 4} 4
> +
> +5 {5} 5
> +\.
> +SELECT * FROM check_ign_err;
> +

I suggest adding a few more tests:

- COPY with a datatype error that can't be handled as a soft error
- test documenting that COPY FORMAT BINARY is incompatible with IGNORE_DATATYPE_ERRORS
- a soft error showing the error context - although that will require some
care to avoid the function name + line in the output

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2023-03-22 17:54:37 Re: On login trigger: take three
Previous Message Melanie Plageman 2023-03-22 17:20:51 Re: Add pg_walinspect function with block info columns