Re: Transactions involving multiple postgres foreign servers

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transactions involving multiple postgres foreign servers
Date: 2015-11-09 10:01:52
Message-ID: 56406F10.9030508@postgrespro.ru
Lists: pgsql-hackers

On 09.11.2015 09:59, Ashutosh Bapat wrote:
>
>
> Since the foreign server (referred to in the slides as secondary
> server) requires to call "create extension pg_dtm" and select
> dtm_join_transaction(xid);, I assume that the foreign server has to be
> a PostgreSQL server and one which has this extension installed and has
> a version that can support this extension. So, we can not use the
> extension for all FDWs and even for postgres_fdw it can be used only
> for a foreign server with above capabilities. The slides mention just
> FDW but I think they mean postgres_fdw and not all FDWs.

The DTM approach is based on sharing XIDs and snapshots between
cluster nodes, so it can be implemented easily only for PostgreSQL.
That is why I have postgres_fdw in mind rather than an abstract FDW.
The timestamp-based approach (tsDTM) is more universal and in principle
can be used for any DBMS where visibility is based on CSNs.
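
As a rough illustration of what "visibility based on CSNs" means, here
is a minimal sketch. It is not the tsDTM code: the names TupleVersion
and is_visible are made up for this example, and a transaction that has
not committed is modelled simply as a missing CSN.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TupleVersion:
    # CSN at which the creating transaction committed (None = not committed).
    created_csn: Optional[int]
    # CSN at which the deleting transaction committed (None = not deleted,
    # or the deleter has not committed).
    deleted_csn: Optional[int] = None


def is_visible(tup: TupleVersion, snapshot_csn: int) -> bool:
    """Return True if the tuple version is visible to the given snapshot."""
    # The creator must have committed at or before the snapshot.
    if tup.created_csn is None or tup.created_csn > snapshot_csn:
        return False
    # A deletion hides the tuple only if it committed at or before the snapshot.
    if tup.deleted_csn is not None and tup.deleted_csn <= snapshot_csn:
        return False
    return True
```

The check compares only commit sequence numbers, so any system that
orders commits this way could apply the same rule, given a snapshot CSN
agreed between the nodes.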

>
> I think that this API is intended to provide not only consistent
> cross-node decisions about whether a particular transaction has
> committed, but also consistent visibility. If the API is sufficient
> for that and if it can be made sufficiently performant, that's a
> strictly stronger guarantee than what this proposal would provide.
>
> On the other hand, I see a couple of problems:
>
> 1. The extensible transaction manager API is meant to be pluggable.
> Depending on which XTM module you choose to load, the SQL that needs
> to be executed by postgres_fdw on the remote node will vary.
> postgres_fdw shouldn't have knowledge of all the possible XTMs out
> there, so it would need some way to know what SQL to send.
>
> 2. If the remote server isn't running the same XTM as the local
> server, or if it is running the same XTM but is not part of the same
> group of cooperating nodes as the local server, then we can't send a
> command to join the distributed transaction at all. In that case, the
> 2PC for FDW approach is still, maybe, useful.
>
>
> Elaborating more on this: Slide 11 shows arbiter protocol to start a
> transaction and next slide shows the same for commit. Slide 15 shows
> the transaction flow diagram for tsDTM. In DTM approach it doesn't
> specify how xids are communicated between nodes, but it's implicit in
> the protocol that xid space is shared by the nodes. Similarly for
> tsDTM it assumes that CSN space is shared by all the nodes (see
> synchronization for max(CSN)). This can not be assumed for FDWs (not
> even postgres_fdw) where foreign servers are independent entities with
> independent xid space.

The proposed DTM architecture includes a "coordinator". The coordinator
is a process responsible for managing the logic of a distributed
transaction. It can be an ordinary client application, or it can be an
intermediate master node (as in the case of pg_shard).
It can also be a PostgreSQL instance (as in the case of postgres_fdw),
but it does not have to be.
We try to put as few restrictions on the coordinator as possible.
It just has to communicate with PostgreSQL backends using any
communication protocol it likes (e.g. libpq) and invoke some special
stored procedures which are part of the particular DTM extension. These
functions also impose a protocol for exchanging data between the
nodes involved in the distributed transaction. In this way we
propagate XIDs/CSNs between nodes which may not even know
about each other.
In the DTM approach nodes only know the location of the "arbiter". In
the tsDTM approach there is not even an arbiter...
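
The coordinator logic described above can be sketched roughly as
follows. This is only an illustration under assumptions:
dtm_join_transaction(xid) is the call mentioned earlier in this thread,
while BEGIN_SQL is an assumed placeholder for whatever call the
extension uses to start a distributed transaction and return its XID.
RecordingConn is a hypothetical stand-in for a real connection (libpq or
any other driver); it only records the SQL it is given, so the sketch
runs without a server.

```python
# Assumed placeholder: the thread does not name the call that starts a
# distributed transaction on the first node.
BEGIN_SQL = "SELECT dtm_begin_transaction()"
# Mentioned in the thread: other nodes join the transaction by XID.
JOIN_SQL = "SELECT dtm_join_transaction(%s)"


class RecordingConn:
    """Hypothetical stand-in for a backend connection; records SQL only."""

    def __init__(self, name, begin_xid=None):
        self.name = name
        self.begin_xid = begin_xid  # value returned for BEGIN_SQL
        self.log = []

    def execute(self, sql, params=()):
        self.log.append((sql, params))
        # Only the "begin distributed transaction" call returns a value.
        return self.begin_xid if sql == BEGIN_SQL else None


def run_distributed(conns, statements):
    """Coordinator: start on the first node, let the others join, commit."""
    first, others = conns[0], conns[1:]
    first.execute("BEGIN")
    xid = first.execute(BEGIN_SQL)
    for conn in others:
        conn.execute("BEGIN")
        # Propagate the XID; the nodes need not know about each other.
        conn.execute(JOIN_SQL, (xid,))
    for conn, sql in statements:
        conn.execute(sql)
    for conn in conns:
        conn.execute("COMMIT")
    return xid
```

Note that the coordinator here is plain client code: the only thing the
nodes have in common is the XID it passes from one connection to the
others.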

>
>
> On the whole, I'm inclined to think that the XTM-based approach is
> probably more useful and more general, if we can work out the problems
> with it. I'm not sure that I'm right, though, nor am I sure how hard
> it will be.
>
>
> 2PC for FDW and XTM are trying to solve different problems with some
> commonality. 2PC for FDW is trying to solve problem of atomic commit
> (I am borrowing from the terminology you used in PGCon 2015) for FDWs
> in general (although limited to FDWs which can support 2 phase commit)
> and XTM tries to solve problems of atomic visibility, atomic commit
> and consistency for postgres_fdw where foreign servers support XTM.
> The only thing common between these two is atomic visibility.
>
> If we accept XTM and discard 2PC for FDW, we will not be able to
> support atomic commit for FDWs in general. That, I think would be
> serious limitation for Postgres FDW, esp. now that DMLs are allowed.
> If we accept only 2PC for FDW and discard XTM, we won't be able to get
> atomic visibility and consistency for postgres_fdw with foreign
> servers supporting XTM. That would be again serious limitation for
> solutions implementing sharding, multi-master clusters etc.
>
> There are approaches like [1] by which a cluster of heterogeneous servers
> (with some level of snapshot isolation) can be constructed. Ideally
> that will enable PostgreSQL users to maximize their utilization of FDWs.
>
> Any distributed transaction management requires 2PC in some or other
> form. So, we should implement 2PC for FDW keeping in mind various
> forms of 2PC used practically. Use that infrastructure to build XTM
> like capabilities for restricted postgres_fdw uses. Previously, I have
> requested the authors of XTM to look at my patch and provide me
> feedback about their requirements for implementing 2PC part of XTM.
> But I have not heard anything from them.
>
> 1.
> https://domino.mpi-inf.mpg.de/intranet/ag5/ag5publ.nsf/1c0a12a383dd2cd8c125613300585c64/7684dd8109a5b3d5c1256de40051686f/$FILE/tdd99.pdf

Sorry, maybe I missed some message, but I have not received a request
from you to provide feedback concerning your patch.

>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
