Re: Two proposed modifications to the PostgreSQL FDW

From: Andres Freund <andres(at)anarazel(dot)de>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Chris Travers <chris(dot)travers(at)adjust(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Two proposed modifications to the PostgreSQL FDW
Date: 2018-08-20 15:02:09
Message-ID: 20180820150209.tl7b2nyxishwqpvg@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2018-08-20 10:56:39 -0400, Stephen Frost wrote:
> * Andres Freund (andres(at)anarazel(dot)de) wrote:
> > On 2018-08-20 16:28:01 +0200, Chris Travers wrote:
> > > 1. INSERTMETHOD=[insert|copy] option on foreign table.
> > >
> > > One significant limitation of the PostgreSQL FDW is that it does a prepared
> > > statement insert on each row written which imposes a per-row latency. This
> > > hits environments where there is significant latency or few latency
> > > guarantees particularly hard, for example, writing to a foreign table that
> > > might be physically located on another continent. The idea is that
> > > INSERTMETHOD would default to insert and therefore have no changes but
> > > where needed people could specify COPY which would stream the data out.
> > > Updates would still be unaffected.
> >
> > That has a *lot* of semantics issues, because you suddenly don't get
> > synchronous error reports anymore. I don't think that's OK on a
> > per-table basis. If we invented something like this, it IMO should be a
> > per-statement explicit opt in that'd allow streaming.
>
> Doing some kind of decoration on a per-statement level to do something
> different for FDWs doesn't really seem very clean..

I think it's required. The semantics of an INSERT statement
*drastically* change if you don't insert remotely right away. Constraints
haven't been evaluated by the time the command finishes, sequences aren't
advanced until later, there'll be weird interactions with savepoints, ...
Without executing immediately on the remote side, it basically isn't a
normal INSERT anymore.
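
To make the error-reporting difference concrete, here is a minimal toy
sketch in Python. It is not postgres_fdw code: RemoteTable,
insert_immediately and insert_streamed are hypothetical names invented for
this illustration, and a unique key stands in for any remote constraint.

```python
class RemoteTable:
    """Toy stand-in for a remote table with a unique key (hypothetical)."""

    def __init__(self):
        self.keys = set()

    def insert(self, key):
        # The "remote" constraint check happens only when the row arrives.
        if key in self.keys:
            raise ValueError(f"duplicate key: {key}")
        self.keys.add(key)


def insert_immediately(remote, rows):
    """Send each row as it is produced; the error surfaces at the failing row."""
    for position, key in enumerate(rows):
        try:
            remote.insert(key)
        except ValueError as exc:
            return f"failed at local row {position}: {exc}"
    return "ok"


def insert_streamed(remote, rows):
    """Queue all rows first and send them later; by the time the remote
    error arrives, the local side has already moved past every row."""
    queued = list(rows)              # the local statement is done here
    processed_locally = len(queued)
    for key in queued:               # deferred transmission
        try:
            remote.insert(key)
        except ValueError as exc:
            return f"failed after {processed_locally} local rows: {exc}"
    return "ok"
```

With rows [1, 2, 2, 3], the immediate variant reports the failure at local
row 2, while the streamed variant only learns about it after all 4 rows
have been handed off locally.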

Note bulk INSERT and single row INSERT are very different here.

That's not to say it's not useful to pipeline. To the contrary.

> On reading this, a thought I had was that maybe we should just perform a
> COPY to the FDW when COPY is what's been specified by the user (eg:
>
> COPY my_foreign_table FROM STDIN;

Right. There wouldn't even need to be an option, since that's already
pipelined.

> ), but that wouldn't help when someone wants to bulk copy data from a
> local table into a foreign table.

That's possibly still doable, just with INSERT, as you don't need to (and
in plenty of cases may not even) see the effects of the statement until
the CommandCounterIncrement(). So we can delay the flush a bit.
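
A rough sketch of that idea, again as a Python toy rather than anything
from the PostgreSQL source: rows are buffered while the statement runs and
sent in one go at the command boundary. BufferedForeignInsert and
end_of_command are hypothetical names; end_of_command merely stands in for
the point where CommandCounterIncrement() would make the effects visible.

```python
class BufferedForeignInsert:
    """Toy model of delaying the remote flush until the command boundary."""

    def __init__(self):
        self.pending = []        # rows produced by the running statement
        self.remote_rows = []    # what the remote side has received
        self.round_trips = 0     # number of network flushes performed

    def insert_row(self, row):
        # No network traffic yet: the row is only queued locally.
        self.pending.append(row)

    def end_of_command(self):
        # Hypothetical hook: flush everything before effects must be visible.
        if self.pending:
            self.remote_rows.extend(self.pending)
            self.round_trips += 1
            self.pending.clear()
```

In this model, an INSERT ... SELECT producing five rows costs one flush
instead of five round trips, and nothing is visible remotely until
end_of_command() runs.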

Greetings,

Andres Freund
