Re: Optimization for updating foreign tables in Postgres FDW

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: robertmhaas(at)gmail(dot)com, michael(dot)paquier(at)gmail(dot)com, fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp, noah(at)leadboat(dot)com, rushabh(dot)lathia(at)gmail(dot)com, thom(at)linux(dot)com, sfrost(at)snowman(dot)net, laurenz(dot)albe(at)wien(dot)gv(dot)at, tgl(at)sss(dot)pgh(dot)pa(dot)us, shigeru(dot)hanada(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Optimization for updating foreign tables in Postgres FDW
Date: 2017-04-18 02:42:53
Message-ID: 20170418.114253.71614412.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

At Mon, 17 Apr 2017 17:50:58 +0530, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote in <CAFjFpRdcWw4h0a-zrL-EiaekkPj8O0GR2M1FwZ1useSRfRm3-g(at)mail(dot)gmail(dot)com>
> On Mon, Apr 17, 2017 at 1:53 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > At Thu, 13 Apr 2017 13:04:12 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in <CA+TgmoaxnNmuONgP=bXJojrgbnMPTi6Ms8OSwZBC2YQ2ueUiSg(at)mail(dot)gmail(dot)com>
> >> On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier
> >> > <michael(dot)paquier(at)gmail(dot)com> wrote:
> >> >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita
> >> >> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >> >>> Attached is an updated version of the patch, which modified Michael's
> >> >>> version of the patch, as I proposed in [1] (see "Other changes:"). I
> >> >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly because
> >> >>> words like "non-blocking mode" there seem confusing (note that we have
> >> >>> PQsetnonblocking).
> >> >>
> >> >> OK, so that is what I sent except that the comments mentioning PG_TRY
> >> >> are moved to their correct places. That's fine for me. Thanks for
> >> >> gathering everything in a single patch and correcting it.
> >> >
> >> > I have committed this patch. Thanks for working on this. Sorry for the delay.
> >>
> >> This 9.6-era patch, as it turns out, has a problem, which is that we
> >> now respond to an interrupt by sending a cancel request and a
> >> NON-interruptible ABORT TRANSACTION command to the remote side. If
> >> the reason that the user is trying to interrupt is that the network
> >> connection has been cut, they interrupt the original query only to get
> >> stuck in a non-interruptible wait for ABORT TRANSACTION. That is
> >> non-optimal.
> >
> > Agreed.
> >
> >> It is not exactly clear to me how to fix this. Could we get by with
> >> just slamming the remote connection shut, instead of sending an
> >> explicit ABORT TRANSACTION? The remote side ought to treat a
> >> disconnect as equivalent to an ABORT anyway, I think, but maybe our
> >> local state would get confused. (I have not checked.)
> >>
> >> Thoughts?
> >
> > Perhaps we will get stuck at query cancellation, before ABORT
> > TRANSACTION, in that case. A connection is shut down when
> > anything is wrong (PQstatus(conn) != CONNECTION_OK and so on) while
> > aborting the local transaction. So I don't think the FDW gets
> > confused, and it shouldn't be, if we shut the connection down there.
> >
> > The most significant issue I can see is that the same thing
> > happens in the case of graceful ABORT TRANSACTION. It could be a
> > performance issue.
> >
> > We could set a timeout here, but maybe we can just slam the
> > connection down instead of sending a query cancellation. This is
> > triggered only by a timeout or an interrupt, so I suppose it is not
> > a problem *with a few connections*.
> >
> >
> > Things are a bit different with hundreds of connections. The
> > penalty of reconnection would be very high in that case.
> >
> > If we are not willing to pay such a high penalty, maybe we should
> > track the busy/idle time of each connection and try a graceful
> > abort if it is short enough, perhaps with a short timeout.
> >
> > Furthermore, if most or all of the hundreds of connections get
> > stuck, such timeout will accumulate up like a mountain...
>
> Even when the transaction is aborted because a user cancels a query,
> we do want to preserve the connection, if possible, to avoid

Yes.

> reconnection. If the request to cancel the query itself fails, we
> should certainly drop the connection. Here's the patch to do that.

One problem I see with this is that we still try to open another
connection to send the cancel request, and on a packet stall that
attempt would hang for several minutes per connection, whereas it
should finish within a second under ordinary circumstances. Perhaps
what we want here is asynchronous cancellation with a timeout.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
