Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

From: Damir Belyalov <dam(dot)bel07(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Danil Anisimow <anisimow(dot)d(at)gmail(dot)com>, HukuToc(at)gmail(dot)com, a(dot)lepikhov(at)postgrespro(dot)ru, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date: 2023-03-07 08:35:32
Message-ID: CALH1LguAEsoTYJTCsXNB-7z2Hu9UGEpsXA4kj0FOTmoP=6Wp3Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> FWIW, Greenplum has a similar construct (but which also logs the errors
> in the
> db) where data type errors are skipped as long as the number of errors
> don't
> exceed a reject limit. If the reject limit is reached then the COPY
> fails:
> >
> > LOG ERRORS [ SEGMENT REJECT LIMIT <count> [ ROWS | PERCENT ]]
> >
> IIRC the gist of this was to catch then the user copies the wrong input
> data or
> plain has a broken file. Rather than finding out after copying n rows
> which
> are likely to be garbage the process can be restarted.
>

I think this is a matter for discussion. The same question is: "Where to
log errors to separate files or to the system logfile?".
IMO it's better for users to log short-detailed error message to system
logfile and not output errors to the terminal.

This version of the patch has a compiler error in the error message:
>
Yes, corrected it. Changed "ignored_errors" to int64 because "processed"
(used for counting copy rows) is int64.

I felt just logging "Error: %ld" would make people wonder the meaning of
> the %ld. Logging something like ""Error: %ld data type errors were
> found" might be clearer.
>

Thanks. For more clearance change the message to: "Errors were found: %".

Regards, Damir Belyalov
Postgres Professional

Attachment Content-Type Size
v3-0001-Add-COPY-option-IGNORE_DATATYPE_ERRORS.patch text/x-patch 10.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2023-03-07 08:58:03 Re: using memoize in in paralel query decreases performance
Previous Message Daniel Gustafsson 2023-03-07 08:26:41 Re: Raising the SCRAM iteration count