RE: [POC] Fast COPY FROM command for the table with foreign partitions

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Andrey Lepikhov' <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: "tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>, Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "houzj(dot)fnst(at)cn(dot)fujitsu(dot)com" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>
Subject: RE: [POC] Fast COPY FROM command for the table with foreign partitions
Date: 2021-02-09 04:35:03
Message-ID: TYAPR01MB29908A830017784F6E8A468FFE8E9@TYAPR01MB2990.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

From: tsunakawa(dot)takay(at)fujitsu(dot)com <tsunakawa(dot)takay(at)fujitsu(dot)com>
> From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
> > Of course, you can rebase it.
>
> Thank you. I might modify the basic part to incorporate my past proposal
> about improving the layering or modularity related to ri_useMultiInsert. (But I
> may end up giving up due to lack of energy.)

Rebased to HEAD with the following modifications. It passes make check in the top directory and contrib/postgres_fdw.

(1)
Placed and ordered the three new FDW functions consistently across their documentation, declarations, and definitions.

(2)
Added a check that BeginForeignCopy is not NULL before calling it, because the documentation says it is not mandatory.

(3)
Renamed ExecSetRelationUsesMultiInsert() to ExecMultiInsertAllowed(), because the function does *not* set anything; it returns a boolean value indicating whether the relation allows multi-insert. I was bothered by this function's interface and by the use of ri_usesMultiInsert in ResultRelInfo. I still feel a bit uneasy about two things: whether the function should really take the partition root (parent) as an argument, and whether it is good design for the executor functions to use ri_usesMultiInsert to decide whether to call Begin/EndForeignCopy() or Begin/EndForeignInsert(). I'm fine with COPY using the executor, but it feels a bit uncomfortable for the executor functions to be aware of COPY.
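To make (2) and (3) concrete, here is a minimal standalone sketch, not code from the patch. Every type and function name prefixed with Sketch/sketch_ is hypothetical, the row-trigger condition is only a placeholder for whatever the real check tests, and the partition root argument is left out. It shows a predicate that only reports whether multi-insert is allowed, and a caller that picks the COPY or INSERT callback and skips an optional callback that is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real PostgreSQL definitions. */
typedef struct SketchResultRelInfo SketchResultRelInfo;
typedef void (*SketchBeginCallback) (SketchResultRelInfo *rri);

typedef struct SketchFdwRoutine
{
	SketchBeginCallback BeginForeignInsert; /* optional, may be NULL */
	SketchBeginCallback BeginForeignCopy;	/* optional, may be NULL */
} SketchFdwRoutine;

struct SketchResultRelInfo
{
	const SketchFdwRoutine *fdwroutine; /* NULL if not a foreign table */
	bool		usesMultiInsert;	/* stand-in for ri_usesMultiInsert */
	bool		hasRowTriggers;		/* placeholder condition (assumption) */
	int			beginCopyCalls;		/* counters so the sketch is observable */
	int			beginInsertCalls;
};

/* Example callbacks that only count how often they were invoked. */
void
sketch_count_copy(SketchResultRelInfo *rri)
{
	rri->beginCopyCalls++;
}

void
sketch_count_insert(SketchResultRelInfo *rri)
{
	rri->beginInsertCalls++;
}

/*
 * Pure predicate: reports whether multi-insert is allowed and modifies
 * nothing, which is the reason for the "Allowed" name rather than "Set".
 */
bool
sketch_multi_insert_allowed(const SketchResultRelInfo *rri)
{
	assert(rri != NULL);
	return !rri->hasRowTriggers;
}

/*
 * Choose the COPY or INSERT begin callback based on the flag, and call it
 * only if the FDW actually provides it.
 */
void
sketch_begin_foreign_modify(SketchResultRelInfo *rri)
{
	assert(rri != NULL);
	if (rri->fdwroutine == NULL)
		return;					/* not a foreign table: nothing to do */

	if (rri->usesMultiInsert)
	{
		if (rri->fdwroutine->BeginForeignCopy != NULL)
			rri->fdwroutine->BeginForeignCopy(rri);
	}
	else if (rri->fdwroutine->BeginForeignInsert != NULL)
		rri->fdwroutine->BeginForeignInsert(rri);
}
```

In this sketch the caller stores the predicate's result in usesMultiInsert itself, so the predicate stays free of side effects. The branch on that flag inside sketch_begin_foreign_modify() is the part that makes the executor side aware of COPY.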

That said, with the reviews from some people and good performance results, I think this can be ready for committer.

> Also, I might defer working on the extended part (v9 0003 and 0004) and further
> separate them in a different thread, if it seems to take longer.

I reviewed them but haven't rebased them (that seems to take more work).
Andrey-san, could you tell us:

* Why is a separate FDW connection established for each COPY? Is it to avoid using the same FDW connection for multiple foreign table partitions in a single COPY run?

* In what kind of test did you get the 2-4x performance gain? Was it COPY into many foreign table partitions where the input rows are ordered randomly enough that many rows don't accumulate in the COPY buffer?

Regards
Takayuki Tsunakawa

Attachment Content-Type Size
v14-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch application/octet-stream 50.0 KB
