From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] Transactions involving multiple postgres foreign servers |
Date: | 2018-05-11 00:56:42 |
Message-ID: | CAD21AoCUyLrfqx--F7aAHF48ax3M3CY2so1QtWxH9jL47eJT8Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thank you for the comment.
On Fri, May 11, 2018 at 3:57 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Feb 27, 2018 at 2:21 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> I might be missing your point. As for API breaking, this patch doesn't
>> break any existing FDWs. All new APIs I proposed are dedicated to 2PC.
>> In other words, FDWs that work today can continue working after this
>> patch gets committed, but if FDWs want to support atomic commit then
>> they should be updated to use new APIs. The reason why the calling of
>> FdwXactRegisterForeignServer is necessary is that the core code
>> controls the foreign servers that involved with the transaction but
>> the whether each foreign server uses 2PC option (two_phase_commit) is
>> known only on FDW code. We can eliminate the necessity of calling
>> FdwXactRegisterForeignServer by moving 2PC option from fdw-level to
>> server-level in order not to enforce calling the registering function
>> on FDWs. If we did that, the user can use the 2PC option as a
>> server-level option.
>
> Well, FdwXactRegisterForeignServer has a "bool two_phase_commit"
> argument. If it only needs to be called by FDWs that support 2PC,
> then that argument is unnecessary. If it needs to be called by all
> FDWs, then it breaks existing FDWs that don't call it.
>
I understood now. For now since FdwXactRegisterForeignServer needs to
be called by only FDWs that support 2PC, I will remove the argument.
>>>> But this is now responsible by FDW. I should change it to resolver
>>>> side. That is, FDW can raise error in ordinarily way but core code
>>>> should catch and process it.
>>>
>>> I don't understand exactly what you mean here.
>>
>> Hmm I think I got confused. My understanding is that logging a
>> complaint in commit case and not doing that in abort case if prepared
>> transaction doesn't exist is a core code's job. An API contract here
>> is that FDW raise an error with ERRCODE_UNDEFINED_OBJECT error code if
>> there is no such transaction. Since it's an expected case in abort
>> case for the fdwxact manager, the core code can regard the error as
>> not actual problem.
>
> In general, it's not safe to catch an error and continue unless you
> protect the code that throws the error by a sub-transaction. That
> means we shouldn't expect the FDW to throw an error when the prepared
> transaction isn't found and then just have the core code ignore the
> error. Instead the FDW should return normally and, if the core code
> needs to know whether the prepared transaction was found, then the FDW
> should indicate this through a return value, not an ERROR.
>
>> Or do you mean that FDWs should not raise an error if there is the
>> prepared transaction, and then core code doesn't need to check
>> sqlstate in case of error?
>
> Right. As noted above, that's unsafe, so we shouldn't do it.
Thank you. I will think the API contract again based on your suggestion.
>
>>>> You're right. Perhaps we can deal with it by PrescanFdwXacts until
>>>> reach consistent point, and then have vac_update_datfrozenxid check
>>>> local xids of un-resolved fdwxact to determine the new datfrozenxid.
>>>> Since the local xids of un-resolved fdwxacts would not be relevant
>>>> with vacuuming, we don't need to include it to snapshot and
>>>> GetOldestXmin etc. Also we hint to resolve fdwxact when near
>>>> wraparound.
>>>
>>> I agree with you about snapshots, but I'm not sure about GetOldestXmin.
>>
>> Hmm, although I've thought concern in case where we don't consider
>> local xids of un-resolved fdwxact in GetOldestXmin, I could not find
>> problem. Could you share your concern if you have? I'll try to find a
>> possibility based on it.
>
> I don't remember exactly what I was thinking when I wrote that, but I
> think the point is that GetOldestXmin() does a bunch of things other
> than control the threshold for VACUUM, and we'd need to study them all
> and look for problems. For example, it won't do for an XID to get
> reused while it's still associated with an unresolved fdwxact. We
> therefore certainly need such XIDs to hold back the cluster-wide
> threshold for clog truncation in some manner -- and maybe that
> involves GetOldestXmin(). Or maybe not. But anyway the point,
> broadly considered, is that GetOldestXmin() is used in various ways,
> and I don't know if we've thought through all of the consequences in
> regard to this new feature.
Okay, I'll have GetOldestXmin() consider the oldest local xid of
un-resolved fdwxact as well in the next version patch for more safety,
while considering more efficient ways.
>
> Can I ask what your time frame is for updating this patch?
> Considering how much work appears to remain, if you want to get this
> committed to v12, it would be best to get started as early as
> possible.
>
I'll post an updated patch by PGCon at the latest, hopefully in the next week.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2018-05-11 01:05:44 | Re: Global snapshots |
Previous Message | Tom Lane | 2018-05-10 21:42:48 | Re: [HACKERS] Cutting initdb's runtime (Perl question embedded) |