Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Damir Belyalov <dam(dot)bel07(at)gmail(dot)com>, zhihuifan1213(at)163(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, anisimow(dot)d(at)gmail(dot)com, HukuToc(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date: 2023-12-20 12:26:36
Message-ID: CAD21AoDKpqLGRY7YGwvEvskS+s4ykSScgXD+ZZjR_+jtVMkMMA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 20, 2023 at 1:07 PM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> On Tue, Dec 19, 2023 at 9:14 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> >
> > The error table hub idea is still unclear to me. I assume that there
> > are error tables at least on each database. And an error table can
> > have error data that happened during COPY FROM, including malformed
> > lines. Do the error tables grow without bounds and the users have to
> > delete rows at some point? If so, who can do that? How can we achieve
> > that the users can see only errored rows they generated? And the issue
> > with logical replication also needs to be resolved. Anyway, if we go
> > this direction, we need to discuss the overall design.
> >
> > Regards,
> >
> > --
> > Masahiko Sawada
> > Amazon Web Services: https://aws.amazon.com
>
> Please check my latest attached POC.
> Main content is to build spi query, execute the spi query, regress
> test and regress output.

Why do we need to use SPI? I think we can form heap tuples and insert
them into the error table directly. Creating the error table doesn't
need SPI either.

>
> copy_errors one per schema.
> foo.copy_errors will be owned by the schema: foo owner.

It seems that the error table is created when SAVE_ERROR is used for
the first time. If the error table has not been created yet, that
probably blocks concurrent COPY FROM commands that use the SAVE_ERROR
option, even when they target different tables.

>
> if you can insert to a table in that specific schema let's say foo,
> then you will get privilege to INSERT/DELETE/SELECT
> to foo.copy_errors.
> If you are not a superuser, you are only allowed to do
> INSERT/DELETE/SELECT on foo.copy_errors rows where USERID =
> current_user::regrole::oid.
> This is done via row level security.

I don't think that works. If the user is dropped, the user's OID could
be reused for a different user, who would then match the dropped
user's rows.
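
To illustrate the concern, here is a minimal toy model in Python, not
PostgreSQL code. It assumes that rows in the error table outlive the
role that produced them and that visibility is decided only by
comparing a stored OID with the current role's OID. The ToyCatalog
class, the tiny OID range, and the "rawdata" column are hypothetical
and exist only for this sketch. A real OID counter is 32 bits wide, so
reuse would need a wraparound; the range here is shrunk to make that
happen after a few allocations.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Toy role catalog whose OID counter wraps around a tiny range."""
    first_oid: int = 16384
    oid_space: int = 4      # deliberately tiny so wraparound happens quickly
    next_offset: int = 0
    roles: dict = field(default_factory=dict)   # oid -> role name

    def create_role(self, name):
        # Hand out the next OID not used by a live role, wrapping around.
        for _ in range(self.oid_space):
            oid = self.first_oid + self.next_offset
            self.next_offset = (self.next_offset + 1) % self.oid_space
            if oid not in self.roles:
                self.roles[oid] = name
                return oid
        raise RuntimeError("toy OID space exhausted")

    def drop_role(self, oid):
        # Only the catalog entry goes away; error rows are not touched.
        del self.roles[oid]


def visible_rows(copy_errors, current_oid):
    """Mimic a policy of the form USERID = OID of the current user."""
    return [row for row in copy_errors if row["userid"] == current_oid]


catalog = ToyCatalog()
copy_errors = []

alice = catalog.create_role("alice")
copy_errors.append({"userid": alice, "rawdata": "alice's malformed line"})
catalog.drop_role(alice)

# Three other roles consume the rest of the toy OID space ...
for name in ("r1", "r2", "r3"):
    catalog.create_role(name)

# ... so the next role wraps around and receives alice's old OID.
mallory = catalog.create_role("mallory")
leaked = visible_rows(copy_errors, mallory)
```

In this model mallory ends up with the same OID that alice had, so the
OID-based filter returns alice's leftover row to mallory. Whether and
how quickly that can happen on a real cluster depends on OID
allocation, which the sketch does not try to reproduce.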

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
