Re: [HACKERS] JPA + enum == Exception

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tom Dunstan <pgsql(at)tomd(dot)cc>, "Hudson, Derrick" <dhudson(at)redcom(dot)com>, "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] JPA + enum == Exception
Date: 2013-02-13 03:12:09
Message-ID: 511B0489.8000106@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-jdbc

On 02/12/2013 11:50 PM, Tom Lane wrote:
> Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
>
>> It's my strong view that we should accept setString(...) for any
>> string-like type, like xml, json, domains of text, and so on - or at
>> least provide an easy, no-superuser-required way to tell Pg to do so.
>
> The difficulty I've got with that is that there are only two kinds of
> literal in SQL, namely numbers and strings, so that "it looks like a
> string" applies to absolutely every non-numeric type.

I absolutely agree with that part - we don't want to go converting
*everything* from 'text' to whatever we feel like implicitly. The
removal of those implicit casts was by and large a good thing.

I'm focusing specifically on data types that make sense to handle as
strings in client applications - types where there may be no universal,
core, built-in data type for them or where handling them as strings in
the client may be dramatically more efficient - like "json" and "xml".

I'm *not* proposing a blanket implicit conversion. Those conversions
were removed for good reasons. I just think the removal went a little
too far and that a couple of them need to be added back in. I'm *not*
arguing that we should implicitly convert "text" to anything and everything.

Here's the rationale:

Because of the growing trend toward declarative, type-based data
mapping, it is no longer easy for many users to separate low-level
database interaction from the higher level definitions of how the data
model is represented and worked with in the application. Irrespective of
whether these high level data mapping tools are a good thing or not, in
practice the effect is that it's not easy to say "I'll store this JSON
as a string in my app, and tell the DB it's an unknown type literal when
doing database I/O with it so it knows it can cast it to its internal
JSON type". You largely lose access to the JDBC layer, and while most
JPA implementations and other tools offer ways to pierce the abstraction
they can be clumsy, hard to find out about, poorly documented, and
difficult. That's OK if you're doing something weird and special - but I
don't think storing and retrieving json and xml values as strings rather
than rich data types in the client falls into that category.

Yes, I'm saying we should work around client issues where it has a low
cost to us.

We could say "Use a client that doesn't suck". That's fine; we're not
selling database systems, so when the user says "no thanks, I'll use a
database that doesn't suck instead" it doesn't directly hurt us.
However, I'd prefer not to force that choice when we can fix the problem
on our end with no negative impact on ourselves or users.

I *really* don't want users to have to use memory- and cpu-hungry types
from some XML or JSON support library when mapping database entities
into the application model just to work around a type handling quirk
from the interaction of the client library, Pg and PgJDBC. Particularly
when the app (or this layer of it) in question might be just an
intermediary that doesn't really care what's in the textlike field.

We work around less-than-lovely quirks in operating systems, the SQL
standard, other DBs, etc all the time. I'm proposing that we work around
one in widely used clients like Hibernate and EclipseLink since it
doesn't hurt us and it makes users' lives easier.

> If we go down
> this road we'll end up allowing implicit casts from text to (at least)
> every non-numeric type, which will be pretty much fatal for type
> checking purposes.

I see what you're getting at and agree that this would be bad, but it's
not what I'm arguing for.

I specifically think that "json" and "xml" should be implicitly castable
to/from text. Permitting this doesn't require changes to how Pg
interprets literals, nor hacks in the JDBC driver.

Hacking the JDBC driver to send all java.lang.String values as unknown
is exactly what I want to avoid.

I'd just tell people to:

CREATE CAST (text AS xml) WITH FUNCTION xml(text) AS IMPLICIT;
CREATE CAST (text AS json) WITHOUT FUNCTION AS IMPLICIT;

but (a) the "xml" one won't work because a cast already exists; (b) it's
superuser-only; and (c) it relies on implementation details that may change.

In practice you need to, as superuser:

UPDATE pg_catalog.pg_cast
SET castcontext = 'i'
WHERE casttarget = 'xml'::regtype
AND (castsource = 'text'::regtype OR castsource = 'varchar'::regtype);

then query pg_cast to see if a text to json cast already exists, update
it if it does exist, and use the above CREATE CAST to create it if it
doesn't.

In other words, "blech".

> IIRC, there's already a hack to tell the JDBC driver to mark setString
> parameters as "unknown" rather than "text", which seems to me to be
> a much less dangerous way of getting the lenient behavior when you want
> it.

I'll take a look. I never noticed one in the docs, but back when I was
wrestling with this problem I don't think I dug into the PgJDBC sources.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2013-02-13 03:38:21 Re: Identity projection
Previous Message Kyotaro HORIGUCHI 2013-02-13 02:42:17 Re: Identity projection

Browse pgsql-jdbc by date

  From Date Subject
Next Message Andreas Reichel 2013-02-13 05:37:25 Re: Timestamp vs. Java Date/Timestamp
Previous Message dmp 2013-02-13 02:32:18 Re: rounded brackets in prepared statement