Re: jdbc xa support

From: Michael Allman <msa(at)allman(dot)ms>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: jdbc xa support
Date: 2005-07-22 23:07:32
Message-ID: 20050722190449.T5605@yvyyl
Lists: pgsql-jdbc

I've uploaded a new version of my patch to

http://www.allman.ms/pgjdbcxa/pgjdbcxa-20050722-2.jar

This version includes some bug fixes and a small number of (working) unit
tests.

It occurred to me recently that start(xid, TMJOIN) is broken. Given that
this implementation doesn't support transaction branch joining, it should
probably just throw an XAException. Of course, since isSameRM() returns
true only for identical PGXAResource instances, a TM should never call
start(xid, TMJOIN). Makes sense?
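
A minimal sketch of that check, for illustration only. The class name
StartFlagsSketch, the choice of XAER_PROTO as the error code, and the
handling of every flag other than TMJOIN are assumptions, not code taken
from the patch:

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical stand-in for the flag handling in an XAResource.start(). */
final class StartFlagsSketch {
    // Branch most recently associated with this resource (may be null).
    private Xid current;
    // True between a successful start() and the matching end().
    private boolean associated;

    void start(Xid xid, int flags) throws XAException {
        if (flags == XAResource.TMJOIN) {
            // Branch joining is not supported, so fail loudly rather than
            // silently treating the call like TMNOFLAGS.
            // Assumption: XAER_PROTO is the error code reported.
            throw new XAException(XAException.XAER_PROTO);
        }
        if (flags != XAResource.TMNOFLAGS) {
            // This sketch handles no other flags (for example TMRESUME).
            throw new XAException(XAException.XAER_INVAL);
        }
        if (associated) {
            // A second start() before end() is rejected as well.
            throw new XAException(XAException.XAER_PROTO);
        }
        current = xid;
        associated = true;
    }

    boolean isAssociated() {
        return associated;
    }

    Xid currentXid() {
        return current;
    }
}
```

A TM that honours isSameRM() should never reach the TMJOIN branch, so the
exception would only surface a TM bug or a misconfiguration.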

Michael

On Fri, 22 Jul 2005, Michael Allman wrote:

> On Fri, 22 Jul 2005, Heikki Linnakangas wrote:
>
>> On Thu, 21 Jul 2005, Michael Allman wrote:
>>
>>> On Thu, 21 Jul 2005, Heikki Linnakangas wrote:
>>>
>>>> 2. Using prepared statements like "PREPARE TRANSACTION ?" won't work. You
>>>> can only use prepared statements for normal SELECT/UPDATE/DELETE
>>>> commands.
>>>
>>> Doesn't the driver support client side prepared statements?
>>
>> No, they're server side. I tried that too at first, but it didn't work.
>
> I will make corrections.
>
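One possible shape for that correction, as a sketch only: build the
command text with the identifier inlined as a string literal and run it
through a plain Statement instead of binding a parameter. The class and
method names are hypothetical, how an Xid is encoded into the identifier
is left out, and rejecting backslashes is an assumption made to avoid
depending on the server's string-escaping settings:

```java
/** Hypothetical helper; not part of the driver or of the patch. */
final class PrepareSqlSketch {
    private PrepareSqlSketch() {
    }

    /** Returns "PREPARE TRANSACTION '<gid>'" with the gid inlined as a literal. */
    static String prepareTransactionSql(String gid) {
        if (gid.indexOf('\\') >= 0) {
            // Assumption: refuse backslashes instead of guessing how the
            // server is configured to treat them inside string literals.
            throw new IllegalArgumentException("backslash in transaction id");
        }
        // Double embedded single quotes so the literal cannot be terminated early.
        return "PREPARE TRANSACTION '" + gid.replace("'", "''") + "'";
    }
}
```

The resulting string would then go to java.sql.Statement.executeUpdate()
rather than through a PreparedStatement with a "?" placeholder.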
>>>> 3. How are you planning to handle transaction interleaving discussed in
>>>> the thread Dave mentioned?
>>>
>>> I'm not. PostgreSQL does not support this behavior, and I see no need to
>>> pretend it does. I think the appropriate thing to do is throw an
>>> exception when the second start is called.
>>
>> I agree. However, I'd like to try it with the popular TMs to make sure they
>> work without it.
>>
>>> I have serious doubts that any SQL database in the world supports this
>>> behavior correctly. If you know of one that does, I'd like to see its
>>> magic.
>>
>> I tested it on some SQL databases, and at least Oracle seems to support it.
>> DB2 fakes it by preparing early. Derby seems to support it, but it only
>> supports XA in embedded mode.
>
> If Oracle supports it, it's likely because they have some server-side stored
> procedures that do something magical. I don't know. I'm not an SQL expert,
> but I don't think SQL by itself supports the association of discrete DML
> statements with arbitrary transactions.
>
> You might want to check out SimpleJTA:
>
> http://www.simplejta.org/
>
> They have some XA driver notes. Among them the following nugget:
>
> <quote>
> JTA specifications allow an XAResource object to be shared amongst multiple
> concurrent transactions with the restriction that the resource can be
> enlisted with a single transaction at a point in time. Resource sharing
> amongst multiple transactions appears to cause a problem in Oracle in a
> multi-threaded environment. Therefore, SimpleJTA is configured to defer the
> reuse of an XAResource object by other transactions until the existing
> transaction is completed, i.e., either committed or rolled back.
> </quote>
>
>>>> 4. recover is broken because it ignores the flags argument. That's going
>>>> to cause an endless loop in the transaction manager when it tries to
>>>> recover. See this discussion:
>>>> http://forum.java.sun.com/thread.jspa?threadID=475468&messageID=2232566
>>>
>>> That is problematic. The API for recovery is stateful, and, IMHO, poorly
>>> designed. If you look at the original DTP XA spec you'll see it makes
>>> much more sense.
>>
>> I agree that it sucks.
>>
>>> I don't know what to do about this yet.
>>
>> The simplest implementation is one that returns all the recovered xids if
>> flags include TMSTARTRSCAN, and an empty array in all other cases. That
>> way, the internal implementation doesn't have to be stateful even though the
>> API is.
>
> I posted a new version last night that does this. I think it works.
>
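A rough sketch of the stateless recover() described above, not the code in
the patch. The class name is hypothetical, and the list passed to the
constructor stands in for whatever the driver reads back from the server
(for example from the pg_prepared_xacts view):

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical sketch of a recover() that keeps no scan cursor. */
final class RecoverSketch {
    // Stand-in for the prepared transactions known to the server.
    private final List<Xid> preparedXids;

    RecoverSketch(List<Xid> preparedXids) {
        this.preparedXids = new ArrayList<>(preparedXids);
    }

    /** Mirrors XAResource.recover(int) for the flag handling only. */
    Xid[] recover(int flag) {
        if ((flag & XAResource.TMSTARTRSCAN) != 0) {
            // Start of a scan: hand back everything in one go.
            return preparedXids.toArray(new Xid[0]);
        }
        // Any other call (TMNOFLAGS, TMENDRSCAN): nothing further to return,
        // so a TM that loops until it sees an empty array terminates.
        return new Xid[0];
    }
}
```

Validation of unknown flag values is omitted here; a real implementation
would probably reject them with an XAException.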
>>>> 6. isSameRM considers two connections to the same database as different
>>>> RMs. I'm not sure what the implications of this are, but I feel that's
>>>> not right. I have the same issue in my implementation as well...
>>>
>>> They're different RM's because you can't join a transaction across two
>>> physical JDBC Connections. Each XAResource instance is associated with
>>> exactly one physical connection instance.
>>
>> I don't think that's the correct definition of an RM. See section 2.2.4 of
>> the XA specification. I think the Postgres database or cluster is one RM.
>> But as I said, I don't know what implications your implementation has. It
>> might work just fine, or not.
>
> It's up to the implementor to define the scope of an "RM" and what isSameRM()
> means --- hence the interface method.
>
> The TM uses this method when it has another XAResource to enlist in the
> transaction and wants to know if it should start another branch for it (with
> start(newBranchXid, TMNOFLAGS)) or can join an existing transaction branch
> (with start(existingBranchXid, TMJOIN)).
>
> The DTP XA spec says a single RM *may* service multiple independent resource
> domains. There are RM's that work like this, e.g. Berkeley DB where
> transactions are represented as first-class Objects which can be passed
> around within the same environment. However, PostgreSQL does not support
> this behavior. Again, you can't join a transaction across physical database
> connections.
>
> One possible alternative we might explore is allowing an XAResource instance,
> say xaRes1, for the same database as another XAResource instance, say xaRes2,
> to adopt the same physical connection instance as xaRes2. So
> xaRes2.isSameRM(xaRes1) would return true if the underlying physical
> connections pointed to the same PostgreSQL database (with the same user
> credentials). Then if a TM tried to join xaRes2 to xaRes1's transaction
> branch, we could implement xaRes1.start(existingBranchXid, TMJOIN) to assign
> xaRes1.physicalConnection = xaRes2.physicalConnection. Then they would share
> the same transaction branch and context. How about that?
>
>>> In light of the implementation, I could probably just define isSameRM() as
>>> return this == otherXAResource . . .
>>
>> Yep.
>>
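To make the consequence of the identity definition concrete, here is a
simplified sketch. ConnectionBoundResource and startFlagsFor() are
hypothetical names, and the class deliberately does not implement the full
XAResource interface (the real isSameRM() takes an XAResource and may
throw XAException):

```java
import javax.transaction.xa.XAResource;

/** Hypothetical, stripped-down resource: one instance per physical connection. */
final class ConnectionBoundResource {

    /** Identity comparison: only the very same instance counts as the same RM. */
    boolean isSameRM(ConnectionBoundResource other) {
        return this == other;
    }

    /**
     * The flag a TM would pass to start() on the candidate, given a resource
     * that is already enlisted in the transaction.
     */
    static int startFlagsFor(ConnectionBoundResource enlisted,
                             ConnectionBoundResource candidate) {
        return enlisted.isSameRM(candidate)
                ? XAResource.TMJOIN      // join the existing branch
                : XAResource.TMNOFLAGS;  // start a new branch
    }
}
```

With this definition two distinct instances always get separate branches,
even when they point at the same database.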
>>>> The XA and JTA specifications are quite complicated. I'd like to see a
>>>> good set of test cases that exercise all possible scenarios and also
>>>> error conditions. We're also going to need testers with access to the
>>>> popular application servers so that we know our implementation works with
>>>> them. AFAIK, the only open source application server that does recovery
>>>> properly is the CVS head version of JOnAS.
>>>
>>> I have some cactus test cases for an XML database that has an XA driver.
>>> I'm not feeling too motivated to port them to PostgreSQL.
>>
>> Can you send them? I'd like to take a look, even if we can't use them
>> directly. Which XML database is that?
>
> It's for Berkeley DB XML. I think they're in CVS:
>
> http://berkeley-dbxml-adapter.dev.java.net/
>
> However, these are high-level tests at the user API level. Perhaps we should
> write tests to XAResource directly? I don't even think they'd need to be
> cactus tests.
>
>>>> Also, if we violate some parts of the specs (like the transaction
>>>> interleaving part), it's important to know exactly what the limitations
>>>> are and why. I started to write down the exact preconditions for each
>>>> method in the javadoc comments, and also separate which preconditions
>>>> come from the specs and which are just implementation-specific
>>>> limitations.
>>>
>>> I think the interleaving business is a non-issue. I can't think of a real
>>> world case where a transaction manager would do this. Can you?
>>
>> Using interleaving, the application server could get away with a smaller
>> connection pool. It could recycle the connections right after the end call,
>> without waiting for the prepare/commit cycle.
>
> But then leave the first transaction hanging? My gut tells me you should
> resolve a transaction, by either rollback or commit, as soon as you can.
> Prepared transactions hold onto their locks, right? Leaving them unresolved
> would lead to poorer concurrency, not better.
>
> I don't really understand the rationale behind this "interleaving" idea.
>
>> I don't know if any application server does that in practice. If it turns
>> out to be a problem, we might get away with some clever locking. We could
>> make the second start call block and wait for the previous transaction to
>> finish.
>>
>>> Besides, like I said, I doubt any other SQL database supports this. I
>>> know Berkeley DB does, but Berkeley DB lets you associate any database
>>> call with any transaction, so it's easy.
>>>
>>> JTA was written with more than just SQL databases in mind, and I don't
>>> think we need to bend over backwards to implement some corner
>>> functionality for a resource which, by its design, doesn't support it.
>>
>> I agree, if the application servers work without it.
>>
>> - Heikki
>>
>
