Re: Add support for tuple routing to foreign partitions

From: David Fetter <david(at)fetter(dot)org>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Add support for tuple routing to foreign partitions
Date: 2017-08-18 13:41:28
Message-ID: 20170818134128.GS3471@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 18, 2017 at 05:10:29PM +0900, Etsuro Fujita wrote:
> On 2017/08/17 23:48, David Fetter wrote:
> >On Thu, Aug 17, 2017 at 05:27:05PM +0900, Etsuro Fujita wrote:
> >>On 2017/07/11 6:56, Robert Haas wrote:
> >>>On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita
> >>><fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >>>>So, I dropped the COPY part.
> >>>
> >>>Ouch. I think we should try to figure out how the COPY part will be
> >>>handled before we commit to a design.
> >>
> >>I spent some time on this. To handle that, I'd like to propose doing
> >>something similar to \copy (frontend copy): submit a COPY query "COPY ...
> >>FROM STDIN" to the remote server and route data from a file to the remote
> >>server. For that, I'd like to add new FDW APIs called during CopyFrom that
> >>allow us to copy to foreign tables:
> >>
> >>* BeginForeignCopyIn: this would be called after creating a ResultRelInfo
> >>for the target table (or each leaf partition of the target partitioned
> >>table) if it's a foreign table, and perform any initialization needed before
> >>the remote copy can start. In the postgres_fdw case, I think this function
> >>would be a good place to send "COPY ... FROM STDIN" to the remote server.
> >>
> >>* ExecForeignCopyInOneRow: this would be called instead of heap_insert if
> >>the target is a foreign table, and route the tuple read from the file by
> >>NextCopyFrom to the remote server. In the postgres_fdw case, I think this
> >>function would convert the tuple to text format for portability, and then
> >>send the data to the remote server using PQputCopyData.
> >>
> >>* EndForeignCopyIn: this would be called at the bottom of CopyFrom, and
> >>release resources such as connections to the remote server. In the
> >>postgres_fdw case, this function would do PQputCopyEnd to terminate data
> >>transfer.
> >
> >These primitives look good. I know it seems unlikely at first
> >blush, but do we know of bulk load APIs for non-PostgreSQL data
> >stores that this would be unable to serve?
>
> Maybe I'm missing something, but I think these would allow the FDW
> to do the remote copy the way it would like; in
> ExecForeignCopyInOneRow, for example, the FDW could buffer tuples in
> a buffer memory and transmit the buffered data to the remote server
> if the data size exceeds a threshold. The naming is not so good?
> Suggestions are welcome.

The naming seems reasonable.

I was trying to figure out whether this fits well enough with the bulk
load APIs for databases other than PostgreSQL. I'm guessing it's
"well enough" based on checking MySQL, Oracle, and MS SQL Server.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ildar Musin 2017-08-18 13:41:49 Re: Proposal: global index
Previous Message Chris Travers 2017-08-18 13:15:11 Re: Proposal: global index