Re: [POC] Fast COPY FROM command for the table with foreign partitions

From: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] Fast COPY FROM command for the table with foreign partitions
Date: 2020-07-22 09:09:51
Message-ID: cf1e92eb-9a8e-f5c1-f0cc-598882f1e546@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/16/20 2:14 PM, Amit Langote wrote:
> Hi Andrey,
>
> Thanks for this work. I have been reading through your patch and
> here's a what I understand it does and how:
>
> The patch aims to fix the restriction that COPYing into a foreign
> table can't use multi-insert buffer mechanism effectively. That's
> because copy.c currently uses the ExecForeignInsert() FDW API which
> can be passed only 1 row at a time. postgres_fdw's implementation
> issues an `INSERT INTO remote_table VALUES (...)` statement to the
> remote side for each row, which is pretty inefficient for bulk loads.
> The patch introduces a new FDW API ExecForeignCopyIn() that can
> receive multiple rows and copy.c now calls it every time it flushes
> the multi-insert buffer so that all the flushed rows can be sent to
> the remote side in one go. postgres_fdw's now issues a `COPY
> remote_table FROM STDIN` to the remote server and
> postgresExecForeignCopyIn() funnels the tuples flushed by the local
> copy to the server side waiting for tuples on the COPY protocol.

Fine

> Here are some comments on the patch.
>
> * Why the "In" in these API names?
>
> + /* COPY a bulk of tuples into a foreign relation */
> + BeginForeignCopyIn_function BeginForeignCopyIn;
> + EndForeignCopyIn_function EndForeignCopyIn;
> + ExecForeignCopyIn_function ExecForeignCopyIn;

I used an analogy from copy.c.

> * fdwhandler.sgml should be updated with the description of these new APIs.

> * As far as I can tell, the following copy.h additions are for an FDW
> to use copy.c to obtain an external representation (char string) to
> send to the remote side of the individual rows that are passed to
> ExecForeignCopyIn():
>
> +typedef void (*copy_data_dest_cb) (void *outbuf, int len);
> +extern CopyState BeginForeignCopyTo(Relation rel);
> +extern char *NextForeignCopyRow(CopyState cstate, TupleTableSlot *slot);
> +extern void EndForeignCopyTo(CopyState cstate);
>
> So, an FDW's ExecForeignCopyIn() calls copy.c: NextForeignCopyRow()
> which in turn calls copy.c: CopyOneRowTo() which fills
> CopyState.fe_msgbuf. The data_dest_cb() callback that runs after
> fe_msgbuf contains the full row simply copies it into a palloc'd char
> buffer whose pointer is returned back to ExecForeignCopyIn(). I
> wonder why not let FDWs implement the callback and pass it to copy.c
> through BeginForeignCopyTo()? For example, you could implement a
> pgfdw_copy_data_dest_cb() in postgres_fdw.c which gets a direct
> pointer of fe_msgbuf to send it to the remote server.
It is good point! Thank you.

> Do you think all FDWs would want to use copy,c like above? If not,
> maybe the above APIs are really postgres_fdw-specific? Anyway, adding
> comments above the definitions of these functions would be helpful.
Agreed
>
> * I see that the remote copy is performed from scratch on every call
> of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
> send the `COPY remote_table FROM STDIN` in
> postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
> when there are no errors during the copy?
It is not possible. FDW share one connection between all foreign
relations from a server. If two or more partitions will be placed at one
foreign server you will have problems with concurrent COPY command. May
be we can create new connection for each partition?
>
> I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
> and sending `COPY remote_table FROM STDIN` only once instead of on
> every flush -- and I see significant speedup. Please check the
> attached patch that applies on top of yours.
I integrated first change and rejected the second by the reason as above.
One problem I spotted
> when trying my patch but didn't spend much time debugging is that
> local COPY cannot be interrupted by Ctrl+C anymore, but that should be
> fixable by adjusting PG_TRY() blocks.
Thanks
>
> * ResultRelInfo.UseBulkModifying should be ri_usesBulkModify for consistency.
+1

I will post a new version of the patch a little bit later.

--
regards,
Andrey Lepikhov
Postgres Professional

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-07-22 10:50:18 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message David Rowley 2020-07-22 08:18:07 Re: Parallel Seq Scan vs kernel read ahead