Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Damir Belyalov <dam(dot)bel07(at)gmail(dot)com>, zhihuifan1213(at)163(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, anisimow(dot)d(at)gmail(dot)com, HukuToc(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date: 2023-12-20 04:07:38
Message-ID: CACJufxGFqfqzueVC7GPr0QARYXRHEsgM3MRc43SRzVw8vZc5eQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 19, 2023 at 9:14 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
>
> The error table hub idea is still unclear to me. I assume that there
> are error tables at least on each database. And an error table can
> have error data that happened during COPY FROM, including malformed
> lines. Do the error tables grow without bounds and the users have to
> delete rows at some point? If so, who can do that? How can we achieve
> that the users can see only errored rows they generated? And the issue
> with logical replication also needs to be resolved. Anyway, if we go
> this direction, we need to discuss the overall design.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

Please check my latest attached POC.
The main changes are building the SPI query, executing it, and the
regression tests with their expected output.

There is one copy_errors table per schema.
foo.copy_errors will be owned by the owner of schema foo.

If you can insert into a table in a specific schema, say foo,
then you get the privileges to INSERT/DELETE/SELECT
on foo.copy_errors.
If you are not a superuser, you are only allowed to
INSERT/DELETE/SELECT the rows of foo.copy_errors where USERID =
current_user::regrole::oid.
This is done via row-level security.
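
For illustration, the row-level security rule described above could be
written roughly like this. It is only a sketch and not the code from the
attached patch: apart from USERID, the column names, the column types and
the policy name copy_errors_own_rows are assumptions made up for the
example.

```sql
-- Hypothetical stand-in for the per-schema error table. Only the userid
-- column is mentioned in this mail; every other column is an assumption.
CREATE SCHEMA IF NOT EXISTS foo;
CREATE TABLE IF NOT EXISTS foo.copy_errors (
    userid   oid NOT NULL,   -- role that ran the COPY
    filename text,           -- assumed: source file of the COPY
    lineno   bigint,         -- assumed: line number of the bad row
    rawline  text,           -- assumed: the offending input line
    errmsg   text            -- assumed: error message text
);

-- Policies have no effect until row-level security is enabled.
ALTER TABLE foo.copy_errors ENABLE ROW LEVEL SECURITY;

-- One policy for all commands: USING filters the rows that SELECT and
-- DELETE can see, WITH CHECK restricts the rows that INSERT may add.
CREATE POLICY copy_errors_own_rows ON foo.copy_errors
    FOR ALL
    TO PUBLIC
    USING (userid = current_user::regrole::oid)
    WITH CHECK (userid = current_user::regrole::oid);
```

Superusers and roles with BYPASSRLS are not subject to such a policy, and
neither is the table owner unless FORCE ROW LEVEL SECURITY is also set on
the table.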

Since foo.copy_errors mostly sees INSERT operations, if it grows too
much, that means your source file has many errors, and the whole COPY
will take a very long time to finish. Maybe we can capture how many
errors have been encountered so far from another client.

I don't know how to deal with logical replication. Looking for ideas.

Attachment: v12-0001-Make-COPY-FROM-more-error-tolerant.patch (text/x-patch, 50.6 KB)
