From: | Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Muhammad Usama <m(dot)usama(at)gmail(dot)com>, amul sul <sulamul(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ildar Musin <ildar(at)adjust(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Chris Travers <chris(dot)travers(at)adjust(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp> |
Subject: | Re: Transactions involving multiple postgres foreign servers, take 2 |
Date: | 2020-07-15 11:58:13 |
Message-ID: | 412f81780e15cfb6b3d4905db9000785@oss.nttdata.com |
Lists: | pgsql-hackers |
On 2020-07-15 15:06, Masahiko Sawada wrote:
> On Tue, 14 Jul 2020 at 09:08, Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> > I've attached the latest version patches. I've incorporated the review
>> > comments I got so far and improved locking strategy.
>>
>> Thanks for updating the patch!
>> I have three questions about the v23 patches.
>>
>>
>> 1. messages related to user canceling
>>
>> In my understanding, there are two messages
>> which can be output when a user cancels the COMMIT command.
>>
>> A. When prepare fails, the output shows that the transaction
>> committed locally, but an error occurred.
>>
>> ```
>> postgres=*# COMMIT;
>> ^CCancel request sent
>> WARNING: canceling wait for resolving foreign transaction due to user
>> request
>> DETAIL: The transaction has already committed locally, but might not
>> have been committed on the foreign server.
>> ERROR: server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> CONTEXT: remote SQL command: PREPARE TRANSACTION
>> 'fx_1020791818_519_16399_10'
>> ```
>>
>> B. When prepare succeeds,
>> the output shows that the transaction committed locally.
>>
>> ```
>> postgres=*# COMMIT;
>> ^CCancel request sent
>> WARNING: canceling wait for resolving foreign transaction due to user
>> request
>> DETAIL: The transaction has already committed locally, but might not
>> have been committed on the foreign server.
>> COMMIT
>> ```
>>
>> In case A, I think the "committed locally" message can confuse
>> users,
>> because the messages say "committed" although the transaction is
>> actually "ABORTED".
>>
>> I think the "committed" message means that the "ABORT" was committed locally.
>> But isn't there a possibility of misunderstanding?
>
> No, you're right. I'll fix it in the next version patch.
>
> I think synchronous replication also has the same problem. It says
> "the transaction has already committed" but it's not true when
> executing ROLLBACK PREPARED.
Thanks for replying and sharing the synchronous replication problem.
> BTW how did you test the case (A)? It says canceling wait for foreign
> transaction resolution but the remote SQL command is PREPARE
> TRANSACTION.
I think the timing of the failure is important for 2PC testing.
Since I don't have a good way to simulate failures at arbitrary points,
I use the GDB debugger.
The messages in case (A) are output
after performing the following operations.
1. Attach the debugger to a backend process.
2. Set a breakpoint at PreCommit_FdwXact() in CommitTransaction().
// This is before PREPARE.
3. Execute "BEGIN" and insert data into two remote foreign tables.
4. Issue a "COMMIT" command.
5. The backend process stops at the breakpoint.
6. Stop a remote foreign server.
7. Detach the debugger.
// The backend continues and PREPARE fails. TR tries to abort all
remote txs.
// It's unnecessary to resolve remote txs whose PREPARE failed,
isn't it?
8. Send a cancel request.
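
To make the question in step 7 concrete, here is a toy Python model of the two outcomes. It is only a sketch and not the patch's code: `Participant` and `commit` are hypothetical names, and the model assumes that a foreign server which never reached the prepared state needs no ROLLBACK PREPARED.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical stand-in for one foreign server (not part of the patch)."""
    name: str
    reachable: bool = True
    state: str = "active"  # active -> prepared -> committed / aborted

    def prepare(self) -> None:
        # Models PREPARE TRANSACTION failing when the server was stopped.
        if not self.reachable:
            raise ConnectionError(f"{self.name}: server closed the connection")
        self.state = "prepared"

    def rollback_prepared(self) -> None:
        self.state = "aborted"

    def commit_prepared(self) -> None:
        self.state = "committed"


def commit(participants):
    """Return (local outcome, names of participants that needed resolution)."""
    prepared = []
    try:
        for p in participants:
            p.prepare()
            prepared.append(p)
    except ConnectionError:
        # Case (A): a prepare failed, so the local transaction aborts.
        # Only participants that actually prepared are rolled back here.
        for p in prepared:
            p.rollback_prepared()
        return "ABORT", [p.name for p in prepared]
    # Case (B): every prepare succeeded, so the local transaction commits.
    for p in prepared:
        p.commit_prepared()
    return "COMMIT", [p.name for p in prepared]
```

With one reachable and one stopped server, `commit()` returns "ABORT" and lists only the server that had prepared, which is why a "committed locally" message is confusing in case (A).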
BTW, I'm concerned about how to test the 2PC patches.
There are many failure patterns, such as the failure timing,
the failed component (server or network, including unexpected recovery),
and combinations of those...
Although it would be best to test those failure patterns automatically,
I have no idea how to do that for now, so I check some patterns manually.
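
As a first step toward automation, the combinations could at least be enumerated as a checklist. The following Python sketch does only that. The three axes and their values are my illustrative assumptions, not an agreed test plan, and it does not inject any failure by itself.

```python
from itertools import product

# All values below are illustrative assumptions, not an agreed test plan.
FAILURE_TIMINGS = ["before PREPARE", "after PREPARE", "after local commit"]
FAILED_COMPONENTS = ["foreign server", "network"]
RECOVERY = ["stays down", "recovers unexpectedly"]


def failure_patterns():
    """Return every combination of timing, failed component, and recovery."""
    return [
        {"timing": t, "component": c, "recovery": r}
        for t, c, r in product(FAILURE_TIMINGS, FAILED_COMPONENTS, RECOVERY)
    ]


if __name__ == "__main__":
    # Print a numbered checklist for manual (or later automated) testing.
    for i, p in enumerate(failure_patterns(), start=1):
        print(f"{i:2d}. {p['timing']}, {p['component']}, {p['recovery']}")
```

Each entry still has to be reproduced by hand, for example with the debugger steps above.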
> I've incorporated your comments above in the local branch. I'll
> post the latest version patch after incorporating other comments soon.
OK, Thanks.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION