Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Damir Belyalov <dam(dot)bel07(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Danil Anisimow <anisimow(dot)d(at)gmail(dot)com>, HukuToc(at)gmail(dot)com, a(dot)lepikhov(at)postgrespro(dot)ru, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date: 2023-03-06 14:03:38
Message-ID: 0DE0602F-CC12-40ED-B259-3AB91FB02C3B@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 28 Feb 2023, at 15:28, Damir Belyalov <dam(dot)bel07(at)gmail(dot)com> wrote:

> Tested patch on all cases: CIM_SINGLE, CIM_MULTI, CIM_MULTI_CONDITION. As expected it works.
> Also added a description to copy.sgml and made a review on patch.
>
> I added 'ignored_errors' integer parameter that should be output after the option is finished.
> All errors were added to the system logfile with full detailed context. Maybe it's better to log only error message.

FWIW, Greenplum has a similar construct (but which also logs the errors in the
db) where data type errors are skipped as long as the number of errors don't
exceed a reject limit. If the reject limit is reached then the COPY fails:

LOG ERRORS [ SEGMENT REJECT LIMIT <count> [ ROWS | PERCENT ]]

IIRC the gist of this was to catch then the user copies the wrong input data or
plain has a broken file. Rather than finding out after copying n rows which
are likely to be garbage the process can be restarted.

This version of the patch has a compiler error in the error message:

copyfrom.c: In function ‘CopyFrom’:
copyfrom.c:1008:29: error: format ‘%ld’ expects argument of type ‘long int’, but argument 2 has type ‘uint64’ {aka ‘long long unsigned int’} [-Werror=format=]
1008 | ereport(WARNING, errmsg("Errors: %ld", cstate->ignored_errors));
| ^~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~
| |
| uint64 {aka long long unsigned int}

On that note though, it seems to me that this error message leaves a bit to be
desired with regards to the level of detail.

--
Daniel Gustafsson

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2023-03-06 14:14:42 Re: wrong results due to qual pushdown
Previous Message Dilip Kumar 2023-03-06 13:55:37 Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher