Re: [POC] Fast COPY FROM command for the table with foreign partitions

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] Fast COPY FROM command for the table with foreign partitions
Date: 2020-08-07 09:14:41
Message-ID: CA+HiwqHtQaZp_t5M3z1+M0T9zd=0s3r_oOmi9TxHDdyW0_cxvg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 3, 2020 at 8:38 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Wed, Jul 29, 2020 at 5:36 PM Andrey V. Lepikhov
> <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> > > Will send more comments after reading the v5 patch.
> > >
> > Ok. I'll be waiting for the end of your review.
>
> Sorry about the late reply.
>
> If you'd like to send a new version for other reviewers, please feel
> free. I haven't managed to take more than a brief look at the v5
> patch, but will try to look at it (or maybe the new version if you
> post) more closely this week.

I was playing around with v5 and I noticed an assertion failure which
I concluded is due to improper setting of ri_usesBulkModify. You can
reproduce it with these steps.

create extension postgres_fdw;
create server lb foreign data wrapper postgres_fdw ;
create user mapping for current_user server lb;
create table foo (a int, b int) partition by list (a);
create table foo1 (like foo);
create foreign table ffoo1 partition of foo for values in (1) server
lb options (table_name 'foo1');
create table foo2 (like foo);
create foreign table ffoo2 partition of foo for values in (2) server
lb options (table_name 'foo2');
create function print_new_row() returns trigger language plpgsql as $$
begin raise notice '%', new; return new; end; $$;
create trigger ffoo1_br_trig before insert on ffoo1 for each row
execute function print_new_row();
copy foo from stdin csv;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,2
>> 2,3
>> \.
NOTICE: (1,2)
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

#0 0x00007f2d5e266337 in raise () from /lib64/libc.so.6
#1 0x00007f2d5e267a28 in abort () from /lib64/libc.so.6
#2 0x0000000000aafd5d in ExceptionalCondition
(conditionName=0x7f2d37b468d0 "!resultRelInfo->ri_usesBulkModify ||
resultRelInfo->ri_FdwRoutine->BeginForeignCopyIn == NULL",
errorType=0x7f2d37b46680 "FailedAssertion",
fileName=0x7f2d37b4677f "postgres_fdw.c", lineNumber=1863) at
assert.c:67
#3 0x00007f2d37b3b0fe in postgresExecForeignInsert (estate=0x2456320,
resultRelInfo=0x23a8f58, slot=0x23a9480, planSlot=0x0) at
postgres_fdw.c:1862
#4 0x000000000066362a in CopyFrom (cstate=0x23a8d40) at copy.c:3331

The problem is that partition ffoo1's BR trigger prevents it from
using multi-insert, but its ResultRelInfo.ri_usesBulkModify is true,
which is copied from its parent. We should really check the same
things for a partition that CopyFrom() checks for the main target
relation (root parent) when deciding whether to use multi-insert.

However instead of duplicating the same logic to do so in two places
(CopyFrom and ExecInitPartitionInfo), I think it might be a good idea
to refactor the code to decide if multi-insert mode can be used for a
given relation by checking its properties and put it in some place
that both the main target relation and partitions need to invoke.
InitResultRelInfo() seems to be one such place.

Also, it might be a good idea to use ri_usesBulkModify more generally
than only for foreign relations as the patch currently does, because I
can see that it can replace the variable insertMethod in CopyFrom().
Having both insertMethod and ri_usesBulkModify in each ResultRelInfo
seems confusing and bug-prone.

Finally, I suggest renaming ri_usesBulkModify to ri_usesMultiInsert to
reflect its scope.

Please check the attached delta patch that applies on top of v5 to see
what that would look like.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
ri_usesMultiInsert.patch application/octet-stream 19.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message movead.li@highgo.ca 2020-08-07 09:30:24 Re: [Proposal] Global temporary tables
Previous Message Bharath Rupireddy 2020-08-07 09:04:39 Re: Parallel worker hangs while handling errors.