Re: Transactions involving multiple postgres foreign servers

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: vinayak <Pokale_Vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Vinayak Pokale <vinpokale(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Subject: Re: Transactions involving multiple postgres foreign servers
Date: 2016-12-22 16:49:30
Message-ID: CAD21AoCed17GKaVbmojzCEaozfvB2xYeFAk3BL0MMx0BAvo0cA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 9, 2016 at 4:02 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Fri, Dec 9, 2016 at 3:02 PM, vinayak <Pokale_Vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> On 2016/12/05 14:42, Ashutosh Bapat wrote:
>>>
>>> On Mon, Dec 5, 2016 at 11:04 AM, Haribabu Kommi
>>> <kommi(dot)haribabu(at)gmail(dot)com> wrote:
>>>
>>>
>>> On Fri, Nov 11, 2016 at 5:38 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
>>> wrote:
>>>>>
>>>>>
>>>>> 2PC is a basic building block for supporting atomic commit, and there
>>>>> are some optimizations that reduce the disadvantages of 2PC. As
>>>>> you mentioned, it's hard to support a single model that would suit
>>>>> several types of FDWs. But even setting sharding aside, many other
>>>>> databases that can be connected to PostgreSQL via FDW support 2PC,
>>>>> so 2PC for FDW would be useful for more than just sharding.
>>>>> That's why I have been focusing on implementing 2PC for FDW so far.
>>>>
>>>>
>>>> Moved to next CF with "needs review" status.
>>>
>>> I think this should be changed to "returned with feedback". The
>>> design and approach itself needs to be discussed. I think, we should
>>> let authors decide whether they want it to be added to the next
>>> commitfest or not.
>>>
>>> When I first started with this work, Tom had suggested that I try to
>>> make PREPARE and COMMIT/ROLLBACK PREPARED involving foreign servers, or
>>> at least postgres_fdw servers, work. I think most of my work that
>>> Vinayak and Sawada have rebased to the latest master will be required
>>> for getting what Tom suggested done. We wouldn't need a lot of changes
>>> to that design. PREPARE involving foreign servers errors out right
>>> now. If we start supporting prepared transactions involving foreign
>>> servers, that will be a good improvement over the current status quo.
>>> Once we get that done, we can continue working on the larger problem
>>> of supporting ACID transactions involving foreign servers.
>>
>> In the PgConf.Asia developers meeting, Bruce Momjian and other developers
>> discussed FDW-based sharding [1]. The suggestion from other hackers was
>> that we need to discuss the big picture and use cases of sharding. Bruce
>> has listed all the building blocks of built-in sharding on the wiki [2].
>> IIUC, a transaction manager involving foreign servers is one part of
>> sharding.
>
> Yeah, 2PC on FDW is a basic building block for FDW-based sharding,
> and it would be useful not only for FDW sharding but also for other
> purposes. As far as I could tell from surveying some papers, many kinds
> of distributed transaction management architectures use 2PC for atomic
> commit with some optimisations. Using 2PC to provide atomic commit for
> distributed transactions also fits well with the current PostgreSQL
> implementation in some respects.
>
>> As per Bruce's wiki page, there are two use cases for transactions
>> involving multiple foreign servers:
>> 1. Cross-node read-only queries on read/write shards:
>> This will require a global snapshot manager to make sure the shards
>> return consistent data.
>> 2. Cross-node read-write queries:
>> This will require a global snapshot manager and global transaction
>> manager.
>>
>> I agree with you that if we start supporting PREPARE and COMMIT/ROLLBACK
>> PREPARED involving foreign servers, that will be a good improvement.
>>
>> [1] https://wiki.postgresql.org/wiki/PgConf.Asia_2016_Developer_Meeting
>> [2] https://wiki.postgresql.org/wiki/Built-in_Sharding
>>
>
> I also agree to work on implementing atomic commit across foreign
> servers first and then continue to work on the larger problem.
> I think that this will be a large step forward. I'm going to submit the
> updated version of the patch to CF3.

Attached are the latest versions of the patches. The design is almost
the same as in the previous patches; I incorporated some optimisations
and updated the documentation. However, the documentation and regression
tests are still not sufficient.

The 000 patch adds some new FDW APIs to achieve atomic commit involving
foreign servers using two-phase commit. If more than one foreign
server is involved in the transaction, or the transaction changes local
data and involves even one foreign server, the local node executes PREPARE
and COMMIT/ROLLBACK PREPARED on the foreign servers at commit. A large
part of this implementation is inspired by the two-phase commit code, so I
incorporated recent changes to that code, for example the
recovery speed improvement, into this patch.
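
As a rough illustration of the commit rule described above (not code from
the patch), the decision and the resulting PREPARE / COMMIT PREPARED
sequence could be sketched like this. The names needs_two_phase_commit,
ForeignServer, commit_distributed and the "fx_<xid>_<server>" identifier
format are all hypothetical, and the fake server only records the SQL it
would receive.

```python
from dataclasses import dataclass, field
from typing import List


def needs_two_phase_commit(num_foreign_servers: int, local_data_changed: bool) -> bool:
    """Hypothetical helper restating the rule from the 000 patch description."""
    if num_foreign_servers > 1:
        return True
    return local_data_changed and num_foreign_servers >= 1


@dataclass
class ForeignServer:
    """Stand-in for a foreign server; it only logs the SQL it is sent."""
    name: str
    prepare_ok: bool = True
    log: List[str] = field(default_factory=list)

    def execute(self, sql: str) -> bool:
        self.log.append(sql)
        # In this toy model only PREPARE TRANSACTION can fail.
        return self.prepare_ok or not sql.startswith("PREPARE TRANSACTION")


def commit_distributed(servers: List[ForeignServer], local_data_changed: bool, xid: int) -> bool:
    """Commit on all servers; return False if the transaction had to be rolled back."""
    if not needs_two_phase_commit(len(servers), local_data_changed):
        for server in servers:
            server.execute("COMMIT")
        return True

    prepared = []
    for i, server in enumerate(servers):
        gid = f"fx_{xid}_{server.name}"  # assumed identifier format
        if not server.execute(f"PREPARE TRANSACTION '{gid}'"):
            # Phase 1 failed: undo what was prepared, abort the rest.
            for done, done_gid in prepared:
                done.execute(f"ROLLBACK PREPARED '{done_gid}'")
            for rest in servers[i:]:
                rest.execute("ROLLBACK")
            return False
        prepared.append((server, gid))

    # Phase 2: every participant prepared, so commit them all.
    for server, gid in prepared:
        server.execute(f"COMMIT PREPARED '{gid}'")
    return True
```

With one foreign server and no local changes the sketch issues a plain
COMMIT; in every other case involving a foreign server it goes through
both phases.
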
The 001 patch makes postgres_fdw support atomic commit. If
two_phase_commit is set to 'on' for a foreign server, two-phase commit
will be used at commit. The 002 patch adds pg_fdw_resolver, a new contrib
module providing a bgworker process that resolves in-doubt
transactions on foreign servers, if there are any.
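
To make "resolving an in-doubt transaction" concrete, here is a minimal
sketch of what one resolver pass might decide. It is an assumption for
illustration, not the logic in the 002 patch: the function
resolve_in_doubt, the (server, gid, xid) tuples and the outcome strings
are hypothetical. The idea is only that a foreign transaction left
prepared is finished according to the outcome of the local transaction
that created it.

```python
from typing import Dict, List, Tuple


def resolve_in_doubt(
    in_doubt: List[Tuple[str, str, int]],
    local_outcome: Dict[int, str],
) -> List[Tuple[str, str]]:
    """Return (server, SQL) pairs that one resolver pass would issue.

    in_doubt      -- (server name, prepared transaction id, local xid)
    local_outcome -- local xid -> "committed" or "aborted" (assumed encoding)
    """
    actions = []
    for server, gid, xid in in_doubt:
        outcome = local_outcome.get(xid)
        if outcome == "committed":
            actions.append((server, f"COMMIT PREPARED '{gid}'"))
        elif outcome == "aborted":
            actions.append((server, f"ROLLBACK PREPARED '{gid}'"))
        # Unknown or still-running local transactions are left for a later pass.
    return actions
```
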

My replies might be delayed next week, but feedback and review comments
are very welcome.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
000_support_fdw_xact_v3.patch text/x-diff 112.4 KB
001_pgfdw_support_atomic_commit_v3.patch text/x-diff 42.3 KB
002_pg_fdw_resolver_contrib_v3.patch text/x-diff 11.1 KB
