Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Damir Belyalov <dam(dot)bel07(at)gmail(dot)com>, zhihuifan1213(at)163(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, anisimow(dot)d(at)gmail(dot)com, HukuToc(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date: 2024-01-11 03:13:35
Message-ID: CACJufxEqr5kukVazRQFV_XNww53SS67rJn0DUZ1d9iBKp6=yOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 9, 2024 at 10:36 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On Tue, Dec 19, 2023 at 10:14 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> wrote:
> > If we want only such a feature we need to implement it together (the
> > patch could be split, though). But if some parts of the feature are
> > useful for users as well, I'd recommend implementing it incrementally.
> > That way, the patches can get small and it would be easy for reviewers
> > and committers to review/commit them.
>
> Jian, how do you think this comment?
>
> Looking back at the discussion so far, it seems that not everyone thinks
> saving table information is the best idea[1] and some people think just
> skipping error data is useful.[2]
>
> Since there are issues to be considered from the design such as
> physical/logical replication treatment, putting error information to
> table is likely to take time for consensus building and development.
>
> Wouldn't it be better to follow the following advice and develop the
> functionality incrementally?
>
> On Fri, Dec 15, 2023 at 4:49 AM Masahiko Sawada
> <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > So I'm thinking we may be able to implement this
> > feature incrementally. The first step would be something like an
> > option to ignore all errors or an option to specify the maximum number
> > of errors to tolerate before raising an ERROR. The second step would
> > be to support logging destinations such as server logs and tables.
>
>
> Attached a patch for this "first step" with reference to v7 patch, which
> logged errors and simpler than latest one.
> - This patch adds new option SAVE_ERROR_TO, but currently only supports
> 'none', which means just skips error data. It is expected to support
> 'log' and 'table'.
> - This patch Skips just soft errors and don't handle other errors such
> as missing column data.

Hi.
I made the following change based on your patch
(v1-0001-Add-new-COPY-option-SAVE_ERROR_TO.patch)

* when specified SAVE_ERROR_TO, move the initialization of
ErrorSaveContext to the function BeginCopyFrom.
I think that's the right place to initialize struct CopyFromState field.
* I think your patch when there are N rows have malformed data, then it
will initialize N ErrorSaveContext.
In the struct CopyFromStateData, I changed it to ErrorSaveContext *escontext.
So if an error occurred, you can just set the escontext accordingly.
* doc: mention "If this option is omitted, <command>COPY</command>
stops operation at the first error."
* Since we only support 'none' for now, 'none' means we don't want
ErrorSaveContext metadata,
so we should set cstate->escontext->details_wanted to false.

> BTW I have question and comment about v15 patch:
>
> > + {
> > + /*
> > + *
> > + * InputFunctionCall is more faster than
> > InputFunctionCallSafe.
> > + *
> > + */
>
> Have you measured this?
> When I tested it in an older patch, there were no big difference[3].
Thanks for pointing it out, I probably was over thinking.

> > - SAVEPOINT SCALAR SCHEMA SCHEMAS SCROLL SEARCH SECOND_P SECURITY
> SELECT
> > + SAVEPOINT SAVE_ERROR SCALAR SCHEMA SCHEMAS SCROLL SEARCH SECOND_P
> SECURITY SELECT
>
> There was a comment that we shouldn't add new keyword for this[4].
>
Thanks for pointing it out.

Attachment Content-Type Size
v1-0001-minor-refactor.no-cfbot application/octet-stream 5.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-01-11 03:17:47 Re: heavily contended lwlocks with long wait queues scale badly
Previous Message ddme 2024-01-11 02:53:46 Re: Is the subtype_diff function in CREATE TYPE only can be C function?