Re: PostgreSQL XAResource & GlassFish 3.1.2.2

From: Bryan Varner <bvarner(at)polarislabs(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: PostgreSQL XAResource & GlassFish 3.1.2.2
Date: 2013-02-13 15:20:41
Message-ID: 511BAF49.8040706@polarislabs.com
Lists: pgsql-jdbc

Thanks Heikki for your responses.

>> So, in our testing, this has eliminated one source of error. We do see
>> -some- improvement.
>>
>> However, I'm -very- confused about why the XAResource implementation
>> for postgres has so many condition checks, why it's tracking what xid
>> was being serviced by the resource (these are global). It seems like
>> the XAResource implementation isn't trusting the global transaction
>> manager to actually track xids to resources.
>
> That's one reason. Bugs in transaction managers are not unheard of.
> Getting useful error messages instead of strange undefined behavior
> if you call the methods in a wrong sequence is useful in those
> scenarios. It's also highly useful for debugging purposes, if you're
> developing a transaction manager.

I've been doing a lot (a LOT) of catching up on JTA over the last few
days, and I have some concerns about some of the sanity checks in the
driver.

Last night I was reading a thread from this list dating back to 2006,
where it seemed a lot of XA work was going on at the time.

> Another reason is that because the implementation doesn't support
> transaction interleaving and suspend/resume, it checks that the
> transaction manager doesn't try to do that. If it does, you get a
> meaningful error, "Transaction interleaving not implemented". That's a
> clue to the user to configure the transaction manager to not do that.

Fair enough, if the TM allows for that configuration.

I've been logging every XA call from GlassFish 3.1.2.2 over the last 24
hours (including our heavy load testing where the XAResource refused to
do some things the TM told it to do), and I've been reconciling what the
TM is telling the XAResource to do against the JTA 1.0.1 spec [0], and
what the PG implementation claims it can and cannot do.
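
For anyone who wants to capture the same kind of trace, here is a minimal sketch of one way to do it, assuming you can get hold of the XAResource before the TM does. XaCallLogger and wrap are hypothetical names of mine, not anything in the PG driver or GlassFish; only java.lang.reflect.Proxy and javax.transaction.xa.XAResource are real APIs.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;
import javax.transaction.xa.XAResource;

// Hypothetical helper: not part of the PostgreSQL driver or of GlassFish.
final class XaCallLogger {
    private XaCallLogger() {}

    // Returns an XAResource that appends "methodName[args]" to 'log' for every
    // call and then forwards the call unchanged to 'delegate'.
    static XAResource wrap(XAResource delegate, List<String> log) {
        return (XAResource) Proxy.newProxyInstance(
            XaCallLogger.class.getClassLoader(),
            new Class<?>[] { XAResource.class },
            (proxy, method, args) -> {
                Object[] shown = (args == null) ? new Object[0] : args;
                log.add(method.getName() + Arrays.toString(shown));
                try {
                    return method.invoke(delegate, args);
                } catch (InvocationTargetException e) {
                    // Rethrow the delegate's own exception (e.g. XAException)
                    // so the TM sees exactly what it would have seen anyway.
                    throw e.getCause();
                }
            });
    }
}
```

The wrapper changes no behavior, so the recorded sequence of start/end/prepare/commit calls can be compared against the spec afterwards.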

I hope you don't mind my questions.

>> Is this due to the overly simplistic isSameRM method, where it's not
>> actually comparing whether the resource is the same resource rather
>> than the same rmid (pointer to an XAResource)?
>
> I didn't fully understand that sentence, but no, it's not related to the
> fact that we have one XAResource instance per connection.

My understanding of isSameRM comes from section 3.4.9 of the JTA spec.
Paraphrasing in my own words, the TM expects it to return true if the
XAResource passed as the method parameter is connected to the same
resource manager as the one the method is invoked upon. The TM uses this
to decide whether to invoke start with TMJOIN or to begin a new TX
branch.
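
To illustrate what I mean, this is roughly the decision I picture the TM making. It is only a sketch of my reading of the spec, not GlassFish's actual code; EnlistSketch and startFlagFor are hypothetical names, while isSameRM, TMJOIN and TMNOFLAGS come from javax.transaction.xa.XAResource.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

// Hypothetical sketch of a transaction manager's enlistment decision.
final class EnlistSketch {
    private EnlistSketch() {}

    // Chooses the flag to pass to XAResource.start(): join an existing branch
    // if some already-enlisted resource reports the same resource manager,
    // otherwise start a new branch.
    static int startFlagFor(XAResource candidate, Iterable<XAResource> enlisted)
            throws XAException {
        for (XAResource existing : enlisted) {
            if (existing.isSameRM(candidate)) {
                return XAResource.TMJOIN;
            }
        }
        return XAResource.TMNOFLAGS;
    }
}
```

If isSameRM only ever answers true for the very same XAResource instance, this loop effectively never picks TMJOIN for a second connection.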

Since interleaving isn't implemented, I can see why the current
implementation 'works'.

>> I'm not an XA expert, but I've been doing some comparison /
>> contrasting to other open source implementations, and it seems like
>> other implementations are merely tracking some simple state (are we in
>> a global tx or not?) but none of them are enforcing the restrictions
>> the PG resource is regarding currentxid.
>
> I guess it depends on the underlying DBMS. Many drivers just pass on the
> start/end calls to the backend, and the backend handles tracking the
> state. Also, some drivers are simply not as strict on sanity-checking
> the incoming calls, and will fail silently if the transaction manager
> does something goofy.

You're not going to see me complain about defensive programming. In
general, it's a good practice and habit to get into.

After synchronizing the methods, the 'failures' we're getting with regard
to the TM's invocation of the XAResources all seem to center on
section 3.4.6 of the JTA spec.

Specifically, we're seeing commit() invoked with xids that don't match
the XAResource's currentXid, as well as commit() called on connections
that have no currentXid. This appears to be behavior that's within spec...
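
As a simplified model of the kind of check that rejects these calls (my own sketch, not the driver's source; StrictCommitSketch and its methods are hypothetical, and the XAER_RMERR error code is an arbitrary choice for the example):

```java
import java.util.Arrays;
import javax.transaction.xa.XAException;
import javax.transaction.xa.Xid;

// Hypothetical model of a resource pegged to a single current xid.
final class StrictCommitSketch {
    private Xid currentXid; // branch last started on this connection, or null

    void start(Xid xid) {
        currentXid = xid;
    }

    // Refuses a one-phase commit unless the xid is the one this very
    // connection was last associated with.
    void commitOnePhase(Xid xid) throws XAException {
        if (currentXid == null || !sameXid(currentXid, xid)) {
            throw new XAException(XAException.XAER_RMERR);
        }
        currentXid = null;
    }

    // Compares xids by value, since Xid implementations need not override equals().
    static boolean sameXid(Xid a, Xid b) {
        return a.getFormatId() == b.getFormatId()
            && Arrays.equals(a.getGlobalTransactionId(), b.getGlobalTransactionId())
            && Arrays.equals(a.getBranchQualifier(), b.getBranchQualifier());
    }
}
```

Both cases from our logs fail in this model: a commit carrying some other xid, and a commit arriving on a connection that never saw a start.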

I understand that XA isn't easy to do (or else everyone would), but it's
almost like the PG implementation is missing a layer of indirection
between the physical connections (pegged to currently serviced xids) and
the logical connections in use by the client application. The best way I
can describe it: from what I'm seeing, JTA doesn't expect the XAResource
returned by an XAConnection to be pegged to (or required to represent) a
single physical connection. Instead, the XAResource may operate on one or
more physical connections, dispatching each TM invocation to the
appropriate one.
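
A rough sketch of the indirection I have in mind, purely as an illustration: the class and method names are hypothetical, and physical connections are stood in for by plain strings so the example stays self-contained.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import javax.transaction.xa.XAException;
import javax.transaction.xa.Xid;

// Hypothetical model of a logical XA facade that routes by xid instead of
// being pegged to one physical connection.
final class RoutingXaSketch {
    // Map key with value semantics for an Xid.
    private static final class XidKey {
        private final int formatId;
        private final byte[] gtrid;
        private final byte[] bqual;

        XidKey(Xid xid) {
            this.formatId = xid.getFormatId();
            this.gtrid = xid.getGlobalTransactionId().clone();
            this.bqual = xid.getBranchQualifier().clone();
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof XidKey)) {
                return false;
            }
            XidKey other = (XidKey) o;
            return formatId == other.formatId
                && Arrays.equals(gtrid, other.gtrid)
                && Arrays.equals(bqual, other.bqual);
        }

        @Override
        public int hashCode() {
            return 31 * (31 * formatId + Arrays.hashCode(gtrid)) + Arrays.hashCode(bqual);
        }
    }

    private final Map<XidKey, String> branchToConnection = new HashMap<>();

    // start(): remember which physical connection does the work for this branch.
    void start(Xid xid, String physicalConnection) {
        branchToConnection.put(new XidKey(xid), physicalConnection);
    }

    // commit(): may arrive via any logical handle; find the connection that
    // owns the branch rather than insisting it is the current one.
    String commit(Xid xid) throws XAException {
        String owner = branchToConnection.remove(new XidKey(xid));
        if (owner == null) {
            throw new XAException(XAException.XAER_NOTA); // unknown xid
        }
        return owner;
    }
}
```

In this model the TM can hand a commit to whichever XAResource it likes, and only a genuinely unknown xid is an error.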

Am I completely off the mark here?

Regards,
-Bryan

[0]: http://download.oracle.com/otndocs/jcp/7286-jta-1.0.1-spec-oth-JSpec/
