Re: Optimization for updating foreign tables in Postgres FDW

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Noah Misch <noah(at)leadboat(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimization for updating foreign tables in Postgres FDW
Date: 2017-04-18 03:42:07
Message-ID: CAFjFpRcRjohUZaCNon1zkP7c6fo2qd5hnfHCT6t_39SGtS_oKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 18, 2017 at 8:12 AM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> At Mon, 17 Apr 2017 17:50:58 +0530, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote in <CAFjFpRdcWw4h0a-zrL-EiaekkPj8O0GR2M1FwZ1useSRfRm3-g(at)mail(dot)gmail(dot)com>
>> On Mon, Apr 17, 2017 at 1:53 PM, Kyotaro HORIGUCHI
>> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> > At Thu, 13 Apr 2017 13:04:12 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in <CA+TgmoaxnNmuONgP=bXJojrgbnMPTi6Ms8OSwZBC2YQ2ueUiSg(at)mail(dot)gmail(dot)com>
>> >> On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> >> > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier
>> >> > <michael(dot)paquier(at)gmail(dot)com> wrote:
>> >> >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita
>> >> >> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> >> >>> Attached is an updated version of the patch, which modified Michael's
>> >> >>> version of the patch, as I proposed in [1] (see "Other changes:"). I
>> >> >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly because
>> >> >>> words like "non-blocking mode" there seems confusing (note that we have
>> >> >>> PQsetnonbloking).
>> >> >>
>> >> >> OK, so that is what I sent except that the comments mentioning PG_TRY
>> >> >> are moved to their correct places. That's fine for me. Thanks for
>> >> >> gathering everything in a single patch and correcting it.
>> >> >
>> >> > I have committed this patch. Thanks for working on this. Sorry for the delay.
>> >>
>> >> This 9.6-era patch, as it turns out, has a problem, which is that we
>> >> now respond to an interrupt by sending a cancel request and a
>> >> NON-interruptible ABORT TRANSACTION command to the remote side. If
>> >> the reason that the user is trying to interrupt is that the network
>> >> connection has been cut, they interrupt the original query only to get
>> >> stuck in a non-interruptible wait for ABORT TRANSACTION. That is
>> >> non-optimal.
>> >
>> > Agreed.
>> >
>> >> It is not exactly clear to me how to fix this. Could we get by with
>> >> just slamming the remote connection shut, instead of sending an
>> >> explicit ABORT TRANSACTION? The remote side ought to treat a
>> >> disconnect as equivalent to an ABORT anyway, I think, but maybe our
>> >> local state would get confused. (I have not checked.)
>> >>
>> >> Thoughts?
>> >
>> > Perhaps we will get stuck at query cancellation before ABORT
>> > TRANSACTION in the case. A connection will be shut down when
>> > anything wrong (PQstatus(conn) != CONNECTION_OK and so) on
>> > aborting local transactoin . So I don't think fdw gets confused
>> > or sholdn't be confused by shutting down there.
>> >
>> > The most significant issue I can see is that the same thing
>> > happens in the case of graceful ABORT TRANSACTION. It could be a
>> > performance issue.
>> >
>> > We could set timeout here but maybe we can just slamming the
>> > connection down instead of sending a query cancellation. It is
>> > caused only by timeout or interrupts so I suppose it is not a
>> > problem *with a few connections*.
>> >
>> >
>> > Things are a bit diffent with hundreds of connections. The
>> > penalty of reconnection would be very high in the case.
>> >
>> > If we are not willing to pay such high penalty, maybe we are to
>> > manage busy-idle time of each connection and trying graceful
>> > abort if it is short enough, maybe having a shoft timeout.
>> >
>> > Furthermore, if most or all of the hundreds of connections get
>> > stuck, such timeout will accumulate up like a mountain...
>>
>> Even when the transaction is aborted because a user cancels a query,
>> we do want to preserve the connection, if possible, to avoid
>
> Yes.
>
>> reconnection. If the request to cancel the query itself fails, we
>> should certainly drop the connection. Here's the patch to do that.
>
> A problem I think on this would be that we still try to make
> another connection for canceling and it would stall for several
> minutes per connection on a packet stall, which should be done in
> a second on ordinary circumstances. Perhaps we might want here is
> async canceling with timeout.

I am not sure what do you mean "make another connection for
cancelling.". Do you mean to say that PQcancel would make another
connection?

The patch proposed simply improves the condition when PQcancel has
returned a failure. Right now we ignore that failure and try to ABORT
the transaction, which is most probably going to get stalled or fail
to be dispatched. With the patch such a connection will be reset.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2017-04-18 04:00:22 Re: Optimization for updating foreign tables in Postgres FDW
Previous Message Peter Eisentraut 2017-04-18 03:40:38 Re: Different table schema in logical replication crashes