Re: jdbc xa support

From: Michael Allman <msa(at)allman(dot)ms>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: jdbc xa support
Date: 2005-07-22 19:15:36
Message-ID: 20050722144451.T5605@yvyyl
Lists: pgsql-jdbc

On Fri, 22 Jul 2005, Heikki Linnakangas wrote:

> On Thu, 21 Jul 2005, Michael Allman wrote:
>
>> On Thu, 21 Jul 2005, Heikki Linnakangas wrote:
>>
>>> 2. Using prepared statements like "PREPARE TRANSACTION ?" won't work. You
>>> can only use prepared statements for normal SELECT/UPDATE/DELETE commands.
>>
>> Doesn't the driver support client side prepared statements?
>
> No, they're server side. I tried that too at first, but it didn't work.

I will make corrections.
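
Since the transaction id cannot be bound as a parameter, one workaround is
to build the command text on the client and send it through a plain
Statement. A minimal sketch, assuming the Xid has already been encoded into
a string; the class name, the method name and the restricted character set
are illustrative assumptions, not taken from the driver:

```java
import java.util.regex.Pattern;

final class PrepareTransactionSql {
    // Assumption: ids are limited to characters that need no escaping inside
    // a single-quoted SQL literal, so no quoting logic is required.
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z0-9_.:-]+");

    /** Returns the PREPARE TRANSACTION command with the id inlined as a literal. */
    static String prepareTransaction(String transactionId) {
        // PostgreSQL requires the identifier to be shorter than 200 bytes.
        if (transactionId == null
                || transactionId.length() >= 200
                || !SAFE_ID.matcher(transactionId).matches()) {
            throw new IllegalArgumentException("unsupported transaction id");
        }
        return "PREPARE TRANSACTION '" + transactionId + "'";
    }
}
```

The resulting string would then be passed to Statement.execute() on an
ordinary java.sql.Statement rather than to a PreparedStatement.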

>>> 3. How are you planning to handle transaction interleaving discussed in
>>> the thread Dave mentioned?
>>
>> I'm not. PostgreSQL does not support this behavior, and I see no need to
>> pretend it does. I think the appropriate thing to do is throw an exception
>> when the second start is called.
>
> I agree. However, I'd like to try it with the popular TMs to make sure they
> work without it.
>
>> I have serious doubts that any SQL database in the world supports this
>> behavior correctly. If you know of one that does, I'd like to see its
>> magic.
>
> I tested it on some SQL databases, and at least Oracle seems to support it.
> DB2 fakes it by preparing early. Derby seems to support it, but it only
> supports XA in embedded mode.

If Oracle supports it, it's likely because they have some server-side
stored procedures that do something magical. I don't know. I'm not an
SQL expert, but I don't think SQL by itself supports the association of
discrete DML statements with arbitrary transactions.
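
Because a connection carries at most one open transaction at a time, the
"throw an exception when the second start is called" approach could look
roughly like the sketch below. It models only start, end and prepare; the
class name is hypothetical, and the choice of XAER_PROTO as the error code
is an assumption, not something the spec or the driver dictates:

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.Xid;

/** Tracks the one branch a physical connection can carry at a time. */
final class NoInterleavingGuard {
    // Branch that has been started on this connection and not yet prepared.
    private Xid current;

    void start(Xid xid) throws XAException {
        if (current != null) {
            // A second start before the first branch is prepared would be
            // transaction interleaving, which is rejected here.
            throw new XAException(XAException.XAER_PROTO);
        }
        current = xid;
    }

    void end(Xid xid) throws XAException {
        requireCurrent(xid);
        // The work still lives in the connection's open transaction, so the
        // connection stays reserved for this branch.
    }

    void prepare(Xid xid) throws XAException {
        requireCurrent(xid);
        // After PREPARE TRANSACTION the connection is free for another branch.
        current = null;
    }

    private void requireCurrent(Xid xid) throws XAException {
        // Simplification: identity comparison instead of comparing Xid contents.
        if (current != xid) {
            throw new XAException(XAException.XAER_NOTA);
        }
    }
}
```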

You might want to check out SimpleJTA:

http://www.simplejta.org/

They have some XA driver notes. Among them the following nugget:

<quote>
JTA specifications allow an XAResource object to be shared amongst multiple
concurrent transactions with the restriction that the resource can be
enlisted with a single transaction at a point in time. Resource sharing
amongst multiple transactions appears to cause a problem in Oracle in a
multi-threaded environment. Therefore, SimpleJTA is configured to defer
the reuse of an XAResource object by other transactions until the existing
transaction is completed, i.e., either committed or rolled back.
</quote>

>>> 4. recover is broken because it ignores the flags argument. That's going
>>> to cause an endless loop in the transaction manager when it tries to
>>> recover. See this discussion:
>>> http://forum.java.sun.com/thread.jspa?threadID=475468&messageID=2232566
>>
>> That is problematic. The API for recovery is stateful, and, IMHO, poorly
>> designed. If you look at the original DTP XA spec you'll see it makes much
>> more sense.
>
> I agree that it sucks.
>
>> I don't know what to do about this yet.
>
> The simplest implementation is one that returns all the recovered xids if
> flags include TMSTARTRSCAN, and an empty array in all other cases. That way,
> the internal implementation doesn't have to be stateful even though the API is.

I posted a new version last night that does this. I think it works.
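
For readers following along, the stateless scan described above can be
sketched as follows. This is not the posted patch; the class name is
hypothetical, and the list of prepared Xids is passed in as a parameter on
the assumption that the driver obtains it elsewhere (for example from the
pg_prepared_xacts view):

```java
import java.util.List;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

final class StatelessRecoverScan {
    /**
     * Returns every prepared Xid when the scan is being started, and an
     * empty array for any other flag combination, so no scan cursor has to
     * be kept between calls.
     */
    static Xid[] recover(int flags, List<Xid> preparedXids) {
        if ((flags & XAResource.TMSTARTRSCAN) != 0) {
            return preparedXids.toArray(new Xid[0]);
        }
        // TMNOFLAGS or TMENDRSCAN alone: everything was already returned.
        return new Xid[0];
    }
}
```

A transaction manager that keeps calling recover(TMNOFLAGS) until it gets
an empty array terminates after the first follow-up call.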

>>> 6. isSameRM considers two connections to the same database as different
>>> RMs. I'm not sure what the implications of this are, but I feel that's not
>>> right. I have the same issue in my implementation as well...
>>
>> They're different RMs because you can't join a transaction across two
>> physical JDBC Connections. Each XAResource instance is associated with
>> exactly one physical connection instance.
>
> I don't think that's the correct definition of an RM. See section 2.2.4 of
> the XA specification. I think the Postgres database or cluster is one RM. But
> as I said, I don't know what implications your implementation has. It might
> work just fine, or not.

It's up to the implementor to define the scope of an "RM" and what
isSameRM() means --- hence the interface method.

The TM uses this method when it has another XAResource to enlist in the
transaction and wants to know if it should start another branch for it
(with start(newBranchXid, TMNOFLAGS)) or can join an existing transaction
branch (with start(existingBranchXid, TMJOIN)).
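
The TM-side decision can be sketched as below. This is a simplified
illustration of how a transaction manager might use isSameRM(), not code
from any particular TM; the class name, the method name and the map of
already enlisted resources are assumptions:

```java
import java.util.Map;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

final class EnlistSketch {
    /**
     * Enlists the candidate resource and returns the flag that was passed
     * to start(): TMJOIN if an already enlisted resource reports the same
     * RM, TMNOFLAGS if a new branch had to be started.
     */
    static int enlist(XAResource candidate,
                      Map<XAResource, Xid> enlisted,
                      Xid freshBranchXid) throws XAException {
        for (Map.Entry<XAResource, Xid> entry : enlisted.entrySet()) {
            if (entry.getKey().isSameRM(candidate)) {
                // Same RM: join the branch that resource is already working on.
                candidate.start(entry.getValue(), XAResource.TMJOIN);
                return XAResource.TMJOIN;
            }
        }
        // No matching RM: start a new branch and remember it.
        candidate.start(freshBranchXid, XAResource.TMNOFLAGS);
        enlisted.put(candidate, freshBranchXid);
        return XAResource.TMNOFLAGS;
    }
}
```

With isSameRM() defined as object identity, two connections to the same
database always end up in separate branches.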

The DTP XA spec says a single RM *may* service multiple independent
resource domains. There are RMs that work like this, e.g. Berkeley DB
where transactions are represented as first-class Objects which can be
passed around within the same environment. However, PostgreSQL does not
support this behavior. Again, you can't join a transaction across
physical database connections.

One possible alternative we might explore is allowing an XAResource
instance, say xaRes1, for the same database as another XAResource
instance, say xaRes2, to adopt the same physical connection instance as
xaRes2. So xaRes2.isSameRM(xaRes1) would return true if the underlying
physical connections pointed to the same PostgreSQL database (with the
same user credentials). Then if a TM tried to join xaRes1 to xaRes2's
transaction branch, we could implement xaRes1.start(existingBranchXid,
TMJOIN) to assign xaRes1.physicalConnection = xaRes2.physicalConnection.
Then they would share the same transaction branch and context. How about
that?
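
A rough sketch of that idea, to make it concrete. Everything here is
hypothetical: PhysicalConnection stands in for the real connection class,
branches are keyed by a plain string instead of an Xid, and only isSameRM
and start are modelled. It also ignores what happens to the connection the
joining resource held before, and any locking needed once two resources
share one connection:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

/** Hypothetical stand-in for one physical connection to a backend. */
final class PhysicalConnection {
    final String url;
    final String user;

    PhysicalConnection(String url, String user) {
        this.url = url;
        this.user = user;
    }
}

/** Models only isSameRM and start; a real XAResource has more methods. */
final class AdoptingXaResource {
    // Branch id -> connection that owns the branch (hypothetical registry).
    private static final Map<String, PhysicalConnection> BRANCH_OWNERS =
            new ConcurrentHashMap<>();

    PhysicalConnection physicalConnection;

    AdoptingXaResource(PhysicalConnection connection) {
        this.physicalConnection = connection;
    }

    /** Same RM means same database and same user credentials. */
    boolean isSameRM(AdoptingXaResource other) {
        return physicalConnection.url.equals(other.physicalConnection.url)
                && physicalConnection.user.equals(other.physicalConnection.user);
    }

    void start(String branchId, int flags) throws XAException {
        if (flags == XAResource.TMJOIN) {
            PhysicalConnection owner = BRANCH_OWNERS.get(branchId);
            if (owner == null) {
                throw new XAException(XAException.XAER_NOTA);
            }
            // Adopt the connection on which the branch is already open.
            physicalConnection = owner;
        } else if (flags == XAResource.TMNOFLAGS) {
            if (BRANCH_OWNERS.putIfAbsent(branchId, physicalConnection) != null) {
                throw new XAException(XAException.XAER_DUPID);
            }
        } else {
            throw new XAException(XAException.XAER_INVAL);
        }
    }
}
```

After xaRes2.start(id, TMNOFLAGS) followed by xaRes1.start(id, TMJOIN),
both resources would refer to the same PhysicalConnection instance.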

>> In light of the implementation, I could probably just define isSameRM() as
>> return this == otherXAResource . . .
>
> Yep.
>
>>> The XA and JTA specifications are quite complicated. I'd like to see a
>>> good set of test cases that exercise all possible scenarios and also
>>> error conditions. We're also going to need testers with access to the
>>> popular application servers so that we know our implementation works with
>>> them. AFAIK, the only open source application server that does recovery
>>> properly is the CVS head version of JOnAS.
>>
>> I have some cactus test cases for an XML database that has an XA driver.
>> I'm not feeling too motivated to port them to PostgreSQL.
>
> Can you send them? I'd like to take a look, even if we can't use them
> directly. Which XML database is that?

It's for Berkeley DB XML. I think they're in CVS:

http://berkeley-dbxml-adapter.dev.java.net/

However, these are high-level tests at the user API level. Perhaps we
should write tests to XAResource directly? I don't even think they'd need
to be cactus tests.

>>> Also, if we violate some parts of the specs (like the transaction
>>> interleaving part), it's important to know exactly what the limitations
>>> are and why. I started to write down the exact preconditions for each
>>> method in the javadoc comments, and also separate which preconditions
>>> come from the specs and which are just implementation-specific
>>> limitations.
>>
>> I think the interleaving business is a non-issue. I can't think of a real
>> world case where a transaction manager would do this. Can you?
>
> Using interleaving, the application server could get away with a smaller
> connection pool. It could recycle the connections right after the end call,
> without waiting for the prepare/commit cycle.

But then leave the first transaction hanging? My gut tells me you should
resolve a transaction, by either rollback or commit, as soon as you can.
Prepared transactions hold onto their locks, right? Leaving them
unresolved would lead to poorer concurrency, not better.

I don't really understand the rationale behind this "interleaving" idea.

> I don't know if any application server does that in practice. If it turns out
> to be a problem, we might get away with some clever locking. We could make the
> second start call block and wait for the previous transaction to finish.
>
>> Besides, like I said, I doubt any other SQL database supports this. I know
>> Berkeley DB does, but Berkeley DB lets you associate any database call with
>> any transaction, so it's easy.
>>
>> JTA was written with more than just SQL databases in mind, and I don't
>> think we need to bend over backwards to implement some corner functionality
>> for a resource which, by its design, doesn't support it.
>
> I agree, if the application servers work without it.
>
> - Heikki
>
