From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: statement_timeout is not working as expected with postgres_fdw
Date: 2017-05-05 15:15:04
Message-ID: CAA4eK1+3F61uOr1UV+PsoBvXN6xn1c3HM+3nWN00YW4+HSw3mg@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 5, 2017 at 7:44 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, May 5, 2017 at 1:01 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> I have tried to verify above point and it seems to me in such a
>> situation the transaction status will be either PQTRANS_INTRANS or
>> PQTRANS_INERROR. I have tried to mimic "out of memory" failure by
>> making PQmakeEmptyPGresult return NULL (malloc failure). AFAIU, it
>> will make the state as PQTRANS_IDLE only when the statement is
>> executed successfully.
>
> Hrm. OK, you might be right, but still I don't see why having the 30
> second timeout is bad. People don't want a transaction rollback to
> hang for a long period of time waiting for ABORT TRANSACTION to run if
> we could've just dropped the connection to get the same effect. Sure,
> the next transaction will then have to reconnect, but that takes a few
> tens of milliseconds; if you wait for one extra second to avoid that
> outcome, you come out behind even if you do manage to avoid the
> reconnect. Now for a subtransaction you could argue that it's worth
> being more patient, because forcing a toplevel abort sucks. But what
> are the chances that, after failing to respond to a ROLLBACK; RELEASE
> SAVEPOINT within 30 seconds, the remote side is going to wake up again
> and start behaving normally? I'd argue that it's not terribly likely
> and most people aren't going to want to wait for multiple minutes to
> see whether it does, but I suppose we could impose the 30 second
> timeout only at the toplevel and let subtransactions wait however long
> the operating system takes to time out.
>
>>>> Also, does it matter if pgfdw_subxact_callback fails
>>>> during execution of "ROLLBACK TO SAVEPOINT", because anyway user needs
>>>> to execute rollback to savepoint command before proceeding and that
>>>> should be good enough?
>>>
>>> I don't understand this. If pgfdw_subxact_callback doesn't roll the
>>> remote side back to the savepoint, how is it going to get done later?
>>> There's no code in postgres_fdw other than that code to issue the
>>> rollback.
>>>
>> It is done via toplevel transaction ABORT. I think how it works is
>> after "ROLLBACK TO SAVEPOINT" failure, any next execution is going to
>> fail. Let us say if we issue Commit after failure, it will try to
>> execute RELEASE SAVEPOINT which will fail and lead to toplevel
>> transaction ABORT. Now if the toplevel transaction ABORT also fails,
>> it will drop the connection.
>
> Mmmph. I guess that would work, but I think it's better to track it
> explicitly. I tried this (bar is a foreign table):
>
> begin;
> select * from bar;
> savepoint s1;
> select * from bar;
> savepoint s2;
> select * from bar;
> savepoint s3;
> select * from bar;
> commit;
>
> On the remote side, the commit statement produces this:
>
> 2017-05-05 09:28:52.597 EDT [80263] LOG: statement: RELEASE SAVEPOINT s4
> 2017-05-05 09:28:52.597 EDT [80263] LOG: statement: RELEASE SAVEPOINT s3
> 2017-05-05 09:28:52.597 EDT [80263] LOG: statement: RELEASE SAVEPOINT s2
> 2017-05-05 09:28:52.597 EDT [80263] LOG: statement: COMMIT TRANSACTION
>
> But that's obviously silly. Somebody could easily come along and
> optimize that by getting rid of the RELEASE SAVEPOINT commands which
> are completely unnecessary if we're going to commit the toplevel
> transaction anyway, and then the argument that you're making wouldn't
> be true any more. It seems smart to me to explicitly track whether
> the remote side is known to be in a bad state rather than relying on
> the future sequence of commands to be such that we'll get the right
> result anyway.
>

Sure, tracking it explicitly might be a better way to deal with it,
but doing so at the cost of always dropping the connection in case of
an error might not be the best thing to do.
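
Just so we are on the same page, by explicit tracking I understand
something along the lines of the sketch below (the field and flag
names are purely for illustration, not necessarily what your patch
uses): once a cleanup command on the connection fails, remember that
fact in the connection cache entry and drop the connection at
transaction end, instead of relying on the later commands failing as
well. My concern is only that setting such a flag for every failure
forces a drop even in cases where a subsequent ABORT would have
succeeded.

/* Rough sketch, for illustration only; the flag name is made up. */
typedef struct ConnCacheEntry
{
    PGconn     *conn;            /* connection to foreign server, or NULL */
    /* ... existing fields ... */
    bool        cleanup_failed;  /* has a cleanup command on this
                                  * connection already failed? */
} ConnCacheEntry;

/* during abort processing in pgfdw_xact_callback() */
if (!pgfdw_exec_cleanup_query(entry->conn, "ABORT TRANSACTION"))
    entry->cleanup_failed = true;

/* later, when finishing up the transaction */
if (entry->cleanup_failed)
{
    PQfinish(entry->conn);
    entry->conn = NULL;
}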

>> Agreed, in such a situation toplevel transaction abort is the only way
>> out. However, it seems to me that will anyway happen in current code
>> even if we don't try to force it via abort_cleanup_failure. I
>> understand that it might be better to force it as you have done in the
>> patch, but not sure if it is good to drop the connection where
>> previously it could have done without it (for ex. if the first
>> statement Rollback To Savepoint fails, but ABORT succeeds). I feel it
>> is worth considering whether such a behavior difference is okay as you
>> have proposed to backpatch it.
>
> Well, I think the whole point of this patch is that if one command on
> a connection fails, the chances of future commands succeeding are
> pretty poor. It's not impossible, but it's not very likely either.
> To recap, the problem at present is that if network connectivity is
> interrupted and the statement is then killed by statement_timeout or a
> local cancel request, we try to send a cancel request to the remote
> side, which takes a while to time out. Even if that fails, we then
> send an ABORT TRANSACTION, hoping against hope that it will succeed
> despite the failure of the cancel request. The user ends up waiting
> for the cancel request to time out and then for the ABORT TRANSACTION
> to time out, which takes 15-16 minutes no matter how you set the
> keepalive settings and no matter how many times you hit ^C on the
> local side.
>
> Now, if you want to allow for the possibility that some future
> operation might succeed never mind past failures, then you're
> basically saying that trying the ABORT TRANSACTION even though the
> query cancel has failed is the correct behavior. And if you don't
> want the ABORT TRANSACTION to have a 30 second timeout either, then
> you're saying we should wait however long it takes the operating
> system to detect the failure, which again is the current behavior. If
> we rip those things out of the patch, then there's not much left of
> the patch.
>

I am not saying to rip out those changes. Most of your mail stresses
the 30-second timeout which you have added in the patch to detect
timeouts during failures (rollback processing). I have no objection
to that part of the patch, except that there is still a race
condition around PQsendQuery, and you have indicated that it is
better to deal with that as a separate patch, to which I agree. The
point on which I am not in total agreement is the handling of
failures other than the timeout.

+pgfdw_exec_cleanup_query(PGconn *conn, const char *query)
{
..
+ /* Get the result of the query. */
+ if (pgfdw_get_cleanup_result(conn, endtime, &result))
+ return false;
+
+ /* Issue a warning if not successful. */
+ if (PQresultStatus(result) != PGRES_COMMAND_OK)
+ {
+ pgfdw_report_error(WARNING, result, conn, true, query);
+ return false;
+ }
..
}

In the above code, if there is a failure due to a timeout, it will
return from the first statement (pgfdw_get_cleanup_result); however,
if there is any other failure, it will issue a warning and return
false. I am not sure there is a need to return false in the second
case, because even without that it will work fine (and there is a
chance that it won't drop the connection), but maybe your point is
that it is better to be consistent for all kinds of error handling in
this path.
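
To spell out the alternative I was wondering about, something like
the sketch below (illustration only, not a concrete proposal): report
the WARNING for a non-timeout failure but do not treat it as fatal
for the connection, leaving it to the subsequent ABORT TRANSACTION to
decide whether the connection has to be dropped.

/* Sketch of the alternative, for illustration only. */
/* Get the result of the query. */
if (pgfdw_get_cleanup_result(conn, endtime, &result))
    return false;               /* timed out: caller drops the connection */

/* Issue a warning if not successful, but keep the connection usable. */
if (PQresultStatus(result) != PGRES_COMMAND_OK)
{
    pgfdw_report_error(WARNING, result, conn, true, query);
    return true;                /* let the later ABORT TRANSACTION decide */
}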

> The only thing that would left would be making the wait
> for ABORT TRANSACTION interruptible, so that if you hit ^C *again*
> after you've already canceled the query, we then give up on waiting
> for the ABORT TRANSACTION. I guess that would be an improvement over
> the status quo, but it's not really a solution. People don't expect
> to have to send multiple cancel requests to kill the same query in a
> timely fashion, and if the commands are being sent by an application
> rather than by a human being it probably won't happen.
>
> Do you want to propose a competing patch to fix this problem? I'm
> happy to listen to suggestions. But I think if you're determined to
> say "let's rely entirely on the OS timeouts rather than having any of
> our own" and you're also determined to say "let's not assume that a
> failure of one operation means that the connection is dead", then
> you're basically arguing that there's nothing really broken here, and
> I don't agree with that.
>

No, I am not saying either of those things. I hope the previous point
clarifies what I want to say.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
