Re: [HACKERS] Transactions involving multiple postgres foreign servers

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Transactions involving multiple postgres foreign servers
Date: 2017-12-11 10:20:25
Message-ID: CAD21AoBMHf8+opU8ibP34uEEs7zTGf9=g7B6pftpuWq88ufnbw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 28, 2017 at 12:31 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> On Tue, Nov 28, 2017 at 3:04 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> On Fri, Nov 24, 2017 at 10:28 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
>>> Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>>
>>>> On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat
>>>> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>>> > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>>> >>
>>>> >> Because I don't want to break the current user semantics. That is,
>>>> >> currently it's guaranteed that subsequent reads can see the
>>>> >> committed result of previous writes even if the previous transactions
>>>> >> were distributed transactions. And this is ensured by the writer side. If we
>>>> >> can make the reader side ensure it, the backend process doesn't need to
>>>> >> wait for the resolver process.
>>>> >>
>>>> >> The waiting backend processes are released by the resolver process after the
>>>> >> resolver process has tried to resolve the foreign transactions. Even if the
>>>> >> resolver process fails either to connect to the foreign server or to
>>>> >> resolve the foreign transaction, the backend process will be released and
>>>> >> the foreign transactions are left as dangling transactions in that
>>>> >> case, to be processed later. Also, if the resolver process takes a long
>>>> >> time to resolve foreign transactions for whatever reason, the user can
>>>> >> cancel it with Ctrl-C at any time.
>>>> >>
>>>> >
>>>> > So, there's no guarantee that the next command issued from the
>>>> > connection *will* see the committed data, since the foreign
>>>> > transaction might not have committed because of a network glitch
>>>> > (say). If we go this route of making backends wait for resolver to
>>>> > resolve the foreign transaction, we will have to add complexity to make
>>>> > sure that the waiting backends are woken up in problematic events like
>>>> > crash of the resolver process OR if the resolver process hangs in a
>>>> > connection to a foreign server etc. I am not sure that the complexity
>>>> > is worth the half-guarantee.
>>>> >
>>>>
>>>> Hmm, maybe I was wrong. I now think that the waiting backends can be
>>>> woken up only in the following cases:
>>>> - The resolver process succeeded in resolving all foreign transactions.
>>>> - The user cancelled (e.g. Ctrl-C).
>>>> - The resolver process failed to resolve a foreign transaction because
>>>> there is no such prepared transaction on the foreign server.
>>>>
>>>> In other cases the resolver process should not release the waiters.
>>>
>>> I'm not sure I see consensus here. What Ashutosh says seems to be: "Special
>>> effort is needed to ensure that the backend does not keep waiting if the resolver
>>> can't finish its work in the foreseeable future. But this effort is not worth
>>> it because by waking the backend up you might prevent the next transaction from
>>> seeing the changes the previous one tried to make."
>>>
>>> On the other hand, your last comments indicate that you try to be even more
>>> stringent in letting the backend wait. However even this stringent approach
>>> does not guarantee that the next transaction will see the data changes made by
>>> the previous one.
>>>
>>
>> What I'd like to guarantee is that a subsequent read can see the
>> committed result of previous writes if the transaction involving
>> multiple foreign servers is committed without cancellation by the user. In
>> other words, the backend should not be woken up and the resolver
>> should continue trying to resolve at certain intervals even if the resolver
>> fails to connect to the foreign server or fails to resolve the transaction. This is
>> similar to what synchronous replication guarantees today. Keeping these
>> semantics is very important for users. Note that reading a
>> consistent result from concurrent reads is a separate problem.
>
> The question I have is how we would deal with a foreign server that is
> unavailable for a long time due to a crash, a long network outage,
> etc. An example is a foreign server that crashed or got disconnected after
> PREPARE but before COMMIT/ROLLBACK was issued. The backend will remain
> blocked for a long time without the user having any idea of what's
> going on. Maybe we should add some timeout.

After more thought, I agree with adding a timeout. I can imagine
there are users who want one, for example those who cannot accept
even a few seconds of latency. If the timeout occurs, the backend unlocks
the foreign transactions and breaks out of the wait loop. The resolver process
will keep trying to resolve the foreign transactions at a certain interval.
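
As a rough illustration of the wait-with-timeout behaviour (a toy Python
model only, not code from the patch; the class, function and parameter
names are all made up for this example), the backend side could look
something like this:

```python
import threading


class ForeignXactWaiter:
    """Toy stand-in for the state a backend waits on (hypothetical name)."""

    def __init__(self):
        self._resolved = threading.Event()

    def mark_resolved(self):
        # Would be called by the resolver once the foreign transaction
        # has been resolved on the foreign server.
        self._resolved.set()

    def is_resolved(self):
        return self._resolved.is_set()

    def wait(self, timeout_s):
        # Returns True if resolved within timeout_s seconds, else False.
        return self._resolved.wait(timeout_s)


def backend_commit_wait(waiter, timeout_s):
    """Wait for the resolver, but give up after timeout_s seconds."""
    if waiter.wait(timeout_s):
        return "resolved"
    # Timeout: the backend stops waiting. The foreign transaction is
    # not dropped; it stays pending for the resolver's later retries.
    return "timed out"
```

Here a timed-out backend returns to the user while the waiter stays
unresolved, so a later call to mark_resolved() by the resolver still
completes the transaction.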

>>
>> The read result involving foreign servers can be inconsistent if
>> such a transaction is cancelled or the coordinator server crashes during
>> two-phase commit processing. That is, if there is an in-doubt transaction,
>> the read result can be inconsistent, even for subsequent reads. But I
>> think this behaviour can be accepted by users. As for the resolution of
>> in-doubt transactions, the resolver process will try to resolve such
>> transactions after the coordinator server has recovered. On the other
>> hand, to give subsequent reads a consistent result in such a situation,
>> we could, for example, disallow backends from issuing SQL
>> to a foreign server if a foreign transaction on that server
>> remains unresolved.
>
> +1 for the last sentence. If we do that, we don't need the backend to
> be blocked by resolver since a subsequent read accessing that foreign
> server would get an error and not inconsistent data.

Yeah, however the disadvantage of this is that we manage foreign
transactions per foreign server. If a transaction that modified even
one table remains as an in-doubt transaction, we cannot issue any
SQL that touches that foreign server. Could we raise an error at
ExecInitForeignScan()?
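
To make the per-server granularity concrete, here is a toy Python model
(purely illustrative; the registry class and its methods are hypothetical
and do not correspond to anything in the patch or in the executor). The
check is keyed on the server name only, so a scan of a table that the
in-doubt transaction never touched is rejected as well:

```python
class InDoubtRegistry:
    """Tracks unresolved foreign transactions per server (hypothetical)."""

    def __init__(self):
        # server name -> number of unresolved foreign transactions
        self._in_doubt = {}

    def add(self, server):
        self._in_doubt[server] = self._in_doubt.get(server, 0) + 1

    def resolve(self, server):
        remaining = self._in_doubt.get(server, 0) - 1
        if remaining > 0:
            self._in_doubt[server] = remaining
        else:
            self._in_doubt.pop(server, None)

    def check_scan_allowed(self, server, table):
        # Stand-in for a check done when a foreign scan is initialized.
        # The table name is used only for the message; the decision is
        # made per server.
        if server in self._in_doubt:
            raise RuntimeError(
                f"cannot scan {table}: in-doubt foreign transaction "
                f"on server {server}"
            )
```

Once resolve() has removed the last entry for a server, scans on that
server are allowed again.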

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
