Re: [HACKERS] Transactions involving multiple postgres foreign servers

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Transactions involving multiple postgres foreign servers
Date: 2018-02-20 21:07:13
Message-ID: CA+TgmoasG8SQNDF1A4y3-+p_yOW=FEL=mj6+qQg1bxgMnDGQ=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 13, 2018 at 5:42 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> The fdw-transactions section of the documentation seems to imply that
>> henceforth every FDW must call FdwXactRegisterForeignServer, which I
>> think is an unacceptable API break.
>>
>> It doesn't seem advisable to make this behavior depend on
>> max_prepared_foreign_transactions. I think that it should be an
>> server-level option: use 2PC for this server, or not? FDWs that don't
>> have 2PC default to "no"; but others can be set to "yes" if the user
>> wishes. But we shouldn't force that decision to be on a cluster-wide
>> basis.
>
> Since I've added a new option two_phase_commit to postgres_fdw we need
> to ask FDW whether the foreign server is 2PC-capable server or not in
> order to register the foreign server information. That's why the patch
> asked FDW to call FdwXactRegisterForeignServer. However we can
> register a foreign server information automatically by executor (e.g.
> at BeginDirectModify and at BeginForeignModify) if a foreign server
> itself has that information. We can add two_phase_commit_enabled
> column to pg_foreign_server system catalog and that column is set to
> true if the foriegn server is 2PC-capable (i.g. has enough functions)
> and user want to use it.

I don't see why this would need a new catalog column.

>> But that brings up another issue: why is MyFdwConnections named that
>> way and why does it have those semantics? That is, why do we really
>> need a list of every FDW connection? I think we really only need the
>> ones that are 2PC-capable writes. If a non-2PC-capable foreign server
>> is involved in the transaction, then we don't really to keep it in a
>> list. We just need to flag the transaction as not locally prepareable
>> i.e. clear TwoPhaseReady. I think that this could all be figured out
>> in a much less onerous way: if we ever perform a modification of a
>> foreign table, have nodeModifyTable.c either mark the transaction
>> non-preparable by setting XACT_FLAGS_FDWNOPREPARE if the foreign
>> server is not 2PC capable, or otherwise add the appropriate
>> information to MyFdwConnections, which can then be renamed to indicate
>> that it contains only information about preparable stuff. Then you
>> don't need each individual FDW to be responsible about calling some
>> new function; the core code just does the right thing automatically.
>
> I could not get this comment. Did you mean that the foreign
> transaction on not 2PC-capable foreign server should be end in the
> same way as before (i.g. by XactCallback)?
>
> Currently, because there is not FDW API to end foreign transaction,
> almost FDWs use XactCallbacks to end the transaction. But after
> introduced new FDW APIs, I think it's better to use FDW APIs to end
> transactions rather than using XactCallbacks. Otherwise we end up with
> having FDW APIs for 2PC (prepare and resolve) and XactCallback for
> ending the transaction, which would be hard to understand. So I've
> changed the foreign transaction management so that core code
> explicitly asks FDW to end/prepare a foreign transaction instead of
> ending it by individual FDWs. After introduced new FDW APIs, core code
> can have the information of all foreign servers involved with the
> transaction and call each APIs at appropriate timing.

Well, it's one thing to introduce a new API. It's another thing to
require existing FDWs to be updated to use it. There are a lot of
existing FDWs out there, and I think that it is needlessly unfriendly
to force them all to be updated for v11 (or whenever this gets
committed) even if we think the new API is clearly better. FDWs that
work today should continue working after this patch is committed.

Separately, I think there's a question of whether the new API is in
fact better -- I'm not sure I have a completely well-formed opinion
about that yet.

> In FdwXactResolveForeignTranasction(), resolver concludes the fate of
> transaction by seeing the status of fdwxact entry and the state of
> local transaction in clog. what I need to do is making that function
> log a complaint in commit case if couldn't find the prepared
> transaction, and not do that in abort case.

+1.

> Also, postgres_fdw don't
> raise an error even if we could not find prepared transaction on
> foreign server because it might have been resolved by other process.

+1.

> But this is now responsible by FDW. I should change it to resolver
> side. That is, FDW can raise error in ordinarily way but core code
> should catch and process it.

I don't understand exactly what you mean here.

> You're right. Perhaps we can deal with it by PrescanFdwXacts until
> reach consistent point, and then have vac_update_datfrozenxid check
> local xids of un-resolved fdwxact to determine the new datfrozenxid.
> Since the local xids of un-resolved fdwxacts would not be relevant
> with vacuuming, we don't need to include it to snapshot and
> GetOldestXmin etc. Also we hint to resolve fdwxact when near
> wraparound.

I agree with you about snapshots, but I'm not sure about GetOldestXmin.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2018-02-20 21:23:54 Hash Joins vs. Bloom Filters / take 2
Previous Message David G. Johnston 2018-02-20 20:49:42 Re: pgsql: Allow UNIQUE indexes on partitioned tables