Re: Add new error_action COPY ON_ERROR "log"

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, jian(dot)universality(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add new error_action COPY ON_ERROR "log"
Date: 2024-02-16 14:47:43
Message-ID: 0e76f13b74f6729958321aba4a32a2cf@oss.nttdata.com
Lists: pgsql-hackers

On 2024-02-16 17:15, Bharath Rupireddy wrote:
> On Wed, Feb 14, 2024 at 5:04 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> > [....] let the users know what line numbers are
>> > containing the errors that COPY ignored something like [1] with a
>> > simple change like [2].
>>
>> Agreed.
>> Unlike my patch, it hides the error information (i.e. 22P02: invalid
>> input syntax for type integer: ), but I feel that it's usually
>> sufficient to know the row number and column where the error occurred.
>
> Right.
>
>> > It not only helps users figure out which rows
>> > and attributes were malformed, but also helps them redirect them to
>> > server logs with setting log_min_messages = notice [3]. In the worst
>> > case scenario, a problem with this one NOTICE per malformed row is
>> > that it can overload the psql session if all the rows are malformed.
>> > I'm not sure if this is a big problem, but IMO better than a single
>> > summary NOTICE message and simpler than writing to tables of users'
>> > choice.
>>
>> Maybe we could do what you suggested for the behavior when 'log' is
>> set to on_error?
>
> My point is: why would someone want just the summary of failures
> without row and column info, especially for bulk loading tasks? I'd
> suggest doing it independently of 'log' or 'table'. I think we can
> keep things simple just like the attached patch, and see how this
> feature will be adopted. I'm sure we can come back and do things like
> saving to 'log' or 'table' or 'separate_error_file' etc., if we
> receive any firsthand feedback.
>
> Thoughts?

I may be wrong since I seldom do data loading tasks, but I agree with
you.

I am also a little concerned about the case where there is a lot of
malformed data and it causes lots of messages, but the information is
usually valuable, and if users don't need it, they can suppress it by
changing client_min_messages.

Currently both the summary of failures and the individual information
are logged at NOTICE level.
If we assume that there are cases where only the summary information
is required, it'd be useful to set a lower log level, i.e. LOG, for the
individual information.
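
To make the effect of that choice concrete, here is a rough sketch (not
part of any patch) that models how a message level is filtered by
client_min_messages and log_min_messages. The two orderings and the
defaults (notice and warning) are taken from the PostgreSQL
documentation; the function names are hypothetical and exist only for
this illustration. Note that LOG ranks below NOTICE for the client but
above ERROR for the server log.

```python
# Severity orderings as listed in the PostgreSQL documentation.
# LOG has a different rank in each list.
CLIENT_ORDER = ["DEBUG5", "DEBUG4", "DEBUG3", "DEBUG2", "DEBUG1",
                "LOG", "NOTICE", "WARNING", "ERROR"]
SERVER_ORDER = ["DEBUG5", "DEBUG4", "DEBUG3", "DEBUG2", "DEBUG1",
                "INFO", "NOTICE", "WARNING", "ERROR", "LOG",
                "FATAL", "PANIC"]


def passes(level, threshold, order):
    """True if `level` is at or above `threshold` in `order`."""
    return order.index(level) >= order.index(threshold)


def sent_to_client(level, client_min_messages="NOTICE"):
    """Hypothetical helper: would the client session see this message?"""
    if level == "INFO":
        # INFO messages are always sent to the client.
        return True
    return passes(level, client_min_messages, CLIENT_ORDER)


def written_to_server_log(level, log_min_messages="WARNING"):
    """Hypothetical helper: would the server log record this message?"""
    return passes(level, log_min_messages, SERVER_ORDER)


if __name__ == "__main__":
    for level in ("NOTICE", "LOG"):
        print(level,
              "client:", sent_to_client(level),
              "server log:", written_to_server_log(level))
```

Under this model, with default settings a per-row message at NOTICE
reaches the client but not the server log, while the same message at
LOG goes to the server log and is hidden from the client unless
client_min_messages is lowered to log or below. Raising
client_min_messages to warning suppresses the NOTICE messages in the
session.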

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation
