Re: Small fix on COPY ON_ERROR document

From: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Small fix on COPY ON_ERROR document
Date: 2024-02-02 06:59:23
Message-ID: 20240202155923.239303278062cd9882858c60@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 02 Feb 2024 11:29:41 +0900
torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:

> On 2024-02-01 15:16, Yugo NAGATA wrote:
> > On Mon, 29 Jan 2024 15:47:25 +0900
> > Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:
> >
> >> On Sun, 28 Jan 2024 19:14:58 -0700
> >> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> >>
> >> > > Also, I think "invalid input syntax" is a bit ambiguous. For example,
> >> > > COPY FROM raises an error when the number of input column does not match
> >> > > to the table schema, but this error is not ignored by ON_ERROR while
> >> > > this seems to fall into the category of "invalid input syntax".
> >> >
> >> >
> >> >
> >> > It is literally the error text that appears if one were not to ignore it.
> >> > It isn’t a category of errors. But I’m open to ideas here. But being
> >> > explicit with what on actually sees in the system seemed preferable to
> >> > inventing new classification terms not otherwise used.
> >>
> >> Thank you for explanation! I understood the words was from the error
> >> messages
> >> that users actually see. However, as Torikoshi-san said in [1], errors
> >> other
> >> than valid input syntax (e.g. range error) can be also ignored,
> >> therefore it
> >> would be better to describe to be ignored errors more specifically.
> >>
> >> [1]
> >> https://www.postgresql.org/message-id/7f1457497fa3bf9dfe486f162d1c8ec6%40oss.nttdata.com
> >>
> >> >
> >> > >
> >> > > So, keeping consistency with the existing description, we can say:
> >> > >
> >> > > "Specifies which how to behave when encountering an error due to
> >> > > column values unacceptable to the input function of each attribute's
> >> > > data type."
> >> >
> >> >
> >> > Yeah, I was considering something along those lines as an option as well.
> >> > But I’d rather add that wording to the glossary.
> >>
> >> Although I am still be not convinced if we have to introduce the words
> >> "soft error" to the documentation, I don't care it if there are no
> >> other
> >> opposite opinions.
> >
> > Attached is a updated patch v3, which is a version that uses the above
> > wording instead of "soft error".
> >
> >> >
> >> > > Currently, ON_ERROR doesn't support other soft errors, so it can explain
> >> > > it more simply without introducing the new concept, "soft error" to users.
> >> > >
> >> > >
> >> > Good point. Seems we should define what user-facing errors are ignored
> >> > anywhere in the system and if we aren’t consistently leveraging these in
> >> > all areas/commands make the necessary qualifications in those specific
> >> > places.
> >> >
> >>
> >> > > I think "left in a deleted state" is also unclear for users because this
> >> > > explains the internal state but not how looks from user's view.How about
> >> > > leaving the explanation "These rows will not be visible or accessible" in
> >> > > the existing statement?
> >> > >
> >> >
> >> > Just visible then, I don’t like an “or” there and as tuples at least they
> >> > are accessible to the system, in vacuum especially. But I expected the
> >> > user to understand “as if you deleted it” as their operational concept more
> >> > readily than visible. I think this will be read by people who haven’t read
> >> > MVCC to fully understand what visible means but know enough to run vacuum
> >> > to clean up updated and deleted data as a rule.
> >>
> >> Ok, I agree we can omit "or accessible". How do you like the
> >> followings?
> >> Still redundant?
> >>
> >> "If the command fails, these rows are left in a deleted state;
> >> these rows will not be visible, but they still occupy disk space. "
> >
> > Also, the above statement is used in the patch.
>
> Thanks for updating the patch!
>
> I like your description which doesn't use the word soft error.

Thank you for your comments!

>
> Here are minor comments:
>
> > + <literal>ignore</literal> means discard the input row and
> > continue with the next one.
> > + The default is <literal>stop</literal>
>
> Is "." required at the end of the line?
>
> > An <literal>NOTICE</literal> level context message containing the
> > ignored row count is
>
> Should 'An' be 'A'?
>
> Also, I wasn't sure the necessity of 'context'.
> It might be possible to just say "A NOTICE message containing the
> ignored row count.."
> considering below existing descriptions:
>
> doc/src/sgml/pltcl.sgml: a <literal>NOTICE</literal> message each
> time a supported command is
> doc/src/sgml/pltcl.sgml- executed:
>
> doc/src/sgml/plpgsql.sgml: This example trigger simply raises a
> <literal>NOTICE</literal> message
> doc/src/sgml/plpgsql.sgml- each time a supported command is
> executed.

I attached a updated patch including fixes you pointed out above.

Regards,
Yugo Nagata

> --
> Regards,
>
> --
> Atsushi Torikoshi
> NTT DATA Group Corporation

--
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>

Attachment Content-Type Size
v4_improve_ON_ERROR_option_in_COPY_doc.patch text/x-diff 3.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-02-02 07:10:12 Re: Is this a problem in GenericXLogFinish()?
Previous Message Amit Kapila 2024-02-02 06:55:30 Re: Synchronizing slots from primary to standby