Re: SQLJSON

From: Álvaro Hernández Tortosa <aht(at)8Kdata(dot)com>
To: Christopher BROWN <brown(at)reflexe(dot)fr>
Cc: Sehrope Sarkuni <sehrope(at)jackdb(dot)com>, Markus KARG <markus(at)headcrashing(dot)eu>, Dave Cramer <pg(at)fastcrypt(dot)com>, List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: SQLJSON
Date: 2015-06-28 09:41:02
Message-ID: 558FC12E.8010107@8Kdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc


On 28/06/15 09:34, Christopher BROWN wrote:
> Hello,
>
> Quick contribution, I'm not answering in-line because there are
> already too many in-line answers and it's getting unreadable.
>
> * In my own applications, I use Jackson, but it's many up of
> different ".jar" files and has therefore no definitive form (you
> can concoct lots of combinations). It's also quite heavy in terms
> of footprint, and embedding it makes no sense, because you'd have
> to keep updating the driver to keep up to date with Jackson.
> Finally, it doesn't actually implement JSR353 (although it would
> be possible to create a thin wrapper), out-of-the-box (via a
> compatibility API) it can read JSR-353 but it basically rebuilds a
> Jackson representation out of a "standard" representation. I
> might choose Jackson, but I wouldn't want to impose it or require
> that it be bundled with the driver (indeed, that would cause me
> classloader issues as I often update to the latest version of
> Jackson).
>
Although I mentioned Jackson as a good candidate for a default
implementation, you are right it does not implement JSR353 directly
(although wrappers already exist, like
https://github.com/pgelinas/jackson-javax-json). But it does not need to
be the default choice. I did a quick test and wrapping Jackson with
jaackson-javax-json and the set of dependencies to make it work would
add 1102Kb to the Jar file. Not much IMHO, although bigger than current
driver size. I would not be scared however to see a 2Mb jdbc jar file.

However, the Reference Implementation (https://jsonp.java.net/) is
probably good enough and only adds 64Kb to the Jar file. The JSR353 is
just another 32Kb, so in total 96Kb would be added if using the RI
rather than Jackson. I don't consider this offensive.

In summary: why not bundle then the RI? Works out of the box and
does not conflict with Jackson. Want to use Jackson? Go for it. The rest
of the world would have something running out-of-the-box.
>
> * You can compile the driver against the JSONP API without embedding
> either the interfaces or an implementation. It's therefore an
> optional feature for those that require it, and it's not rocket
> science to add the necessary APIs to the classpath.
> * I disagree that bundling interfaces + implementation is "making it
> easy". For some users, perhaps, but for others, you're going to
> cause headaches due to creating classloader conflicts (when it's
> already bundled in their application).
>
Technically, it's possible. But from a user perspective, not
bundling an implementation means:

- Understanding why the code I wrote fails with a "Provider
org.glassfish.json.JsonProviderImpl not found".
- Google that. Find that you need to add the dependency and very likely
create a SPI file (META-INF/services/javax.json.spi.JsonProvider).
- Blame PostgreSQL for not doing that by default and shipping a
half-baked driver that is not finished (won't be my opinion, but might
be user's opinion).
- Google again to see what JSR353-compliant implementations are out there.
- Blame PostgreSQL again for not making this choice for you.
- Struggling to find a compatible implementation. Find Jackson but
realize requires third-party driver. Question the "quality" of that
wrapper and consider whether that would be "supported" with PostgreSQL
driver.
- Luckily enough a search points you to a StackOverflow link that
suggests to use either this Jackson wrapper or the Reference
Implementation (there are not many JSR353 implementations, after all).
- Choose without knowing which JSON parser is best.
- Bundle the dependency, check now it works. Clean your sweat.
- Wonder why not to choose MongoDB next time, it works out of the box.

Not funny.

The alternative is: bundle a default (the RI) and let advanced
users not happy with the default implementation to make another choice
(basically create the META-INF/services/javax.json.spi.JsonProvider and
that will override the bundled RI). You have the best of both worlds.

Pareto (https://en.wikipedia.org/wiki/Pareto_principle): let's make
it work for 80% and let that 20% to edit a file and "suffer" a 64Kb size
increase in the driver, rather than ask everybody to go through the
above process.

> * If as Dave Cramer says, the PG protocol doesn't currently support
> streaming, it still makes sense to add streaming support that
> reads from already fully-read resources... because in that way, if
> the protocol improves in the future, client code using the
> streaming API will benefit (with no changes to client code) in the
> future.
>
JSR353 already has a Streaming API. I don't understand what do we
need to do here, it's already done.

Best regards,

Álvaro

--
Álvaro Hernández Tortosa

-----------
8Kdata

> --
> Christopher
>
>
> On 28 June 2015 at 01:53, Álvaro Hernández Tortosa <aht(at)8kdata(dot)com
> <mailto:aht(at)8kdata(dot)com>> wrote:
>
>
> On 28/06/15 00:55, Sehrope Sarkuni wrote:
>> On Sat, Jun 27, 2015 at 6:25 PM, Álvaro Hernández Tortosa
>> <aht(at)8kdata(dot)com <mailto:aht(at)8kdata(dot)com>> wrote:
>>
>> Hi Sehrope!
>>
>>
>> Hi Álvaro! :D
>>
>> To me, this is the least important question. If based on
>> JSR353's SPI, it's trivial to swap the default, included one,
>> for another one. Just picking a sensible default (Jackson,
>> for instance) is probably good enough.
>>
>>
>> I think I've used Jackson almost every time I've had to deal with
>> JSON in Java. The mapping API is pretty cool in that it lets you
>> directly create an target object type. If we got the route of
>> adding methods to PGResultSet then we could have something like:
>> <T> T getJsonAsType(String, Class<T> clazz)
>
> That might be a nice addition. But I believe that goes beyond
> driver's responsibility: I think it ends when it returns you the
> JSON type you queried (JsonObject in my previous email, but I'm
> correcting now myself: JsonValue)
>
>>
>> I'm not wedded to Jackson though. Honestly if JS353 is the
>> standard then that's what we should be using. We'd still need to
>> figure out how to handle older JVMs or maybe just selectively
>> disable the feature (JDK8+?).
>
> JSR353 is targeted for JavaSE 6 :)
>
>>
>>
>> #2 is driven a lot by #1 as depending on the parser
>> implementation there may be different object types
>> returned. JSON is a bit tricky as "valid JSON" can mean
>> null, a scalar, an object, or an array. Most people thing
>> of it as just an object but "foobar" is valid JSON as
>> well. This leads us to #3...
>>
>>
>> The object type to return has to be IMHO JsonObject:
>> http://docs.oracle.com/javaee/7/api/javax/json/JsonObject.html
>>
>> Not always though. All these are valid JSON too:
>>
>> => SELECT '1'::json AS num, '"test"'::json AS string,
>> '[1,2,3]'::json AS arr, '{"foo":"bar"}'::json AS obj;
>> num | string | arr | obj
>> -----+--------+---------+---------------
>> 1 | "test" | [1,2,3] | {"foo":"bar"}
>> (1 row)
>>
>> We'll need separate getters/setters for the scalar and array
>> types as well. I agree that most people will just be using the
>> object type though (and maybe the array).
>
> You are right here. Please s/JsonObject/JsonValue/g JsonValue
> is a container for any of the above including objects and arrays.
> So it would be enough just with JsonValue getJsonValue(....)
>>
>>
>> #3 doesn't have a straight answer as there is no
>> getJSON(...) methods in the JDBC spec. It'd probably have
>> to be returned via getObject(...).
>>
>> An alternative is to provide PGResultSet and
>> PGPreparedStatement classes similar to PGConnection that
>> provides PG extensions. They could have the get/set
>> methods (ex: getJsonScalar(...) or
>> setJsonObject(Map<String,Object> ...)) to retrieve JSON
>> values as specific object types (i.e. scalar, object,
>> array). It'd be a bit more type safe as presumably most
>> people using json/jsonb types know the top level type of
>> what they're storing.
>>
>>
>> Probably adding methods to PG classes would be better
>> than getObject and force explicit casts. Regarding the
>> methods, if they simply return JsonObject, you already have a
>> full API there to parse and extract and process. So anything
>> that returns a JsonObject from a column (identifier or #)
>> would be enough for me.
>>
>>
>> For most cases I think it'd be fine. I think the custom mapping I
>> mentioned above would cover the rest. Anything beyond that would
>> be a full on transformation and would be very application specific.
>
> Yepp
>
>> For #4 I see two possible wins. First off on the
>> usability side, there's some convenience to natively
>> interfacing with json/jsonb types. It'll only have value
>> though if those types are the same ones that users are
>> using in the rest of their code. If they're just using
>> them as Map<String,Object> everywhere then it'd still be
>> a pain for a user to convert to our "native" PG JSON
>> types to use via JDBC. Having a dedicated API that allows
>> for interaction using native Java types would make this
>> more convenient.
>>
>> The other win I can see for #4 is on performance. Right
>> now JSON is converted to a String. That means everybody
>> using it has to convert it twice. First raw bytes to
>> String, then String to object. A dedicated API could cut
>> one of those out of the way. Given how the wire protocol
>> is implemented in the driver, it wouldn't be a direct
>> reading from the input stream (it'll be buffered in a
>> byte array), but at least it won't be copied twice.
>>
>>
>> As far as I know, most users are using JsonObject, so
>> returning that is a perfect match for pgjdbc. I don't expect
>> however big performance wins as JSON is sent as a String over
>> the wire...
>>
>>
>> The performance gain isn't on the wire, it's from not having to
>> convert bytes => String => JsonObject. It'd be bytes =>
>> JsonObject or bytes => CustomObject. Less work and less GC. The
>> bigger the JSON string, the bigger the savings too.
>
> You are right in that JSR353 allows you to create a parser
> directly out of an InputStream, so you would avoid converting to
> String. That's a win. The rest of the conversions are inevitable
> (having the latter one you pointed out laying in user's realm,
> beyond driver's responsibility).
>
> Regards,
>
> Álvaro
>
> --
> Álvaro Hernández Tortosa
>
>
> -----------
> 8Kdata
>
>

In response to

  • Re: SQLJSON at 2015-06-28 07:34:47 from Christopher BROWN

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Álvaro Hernández Tortosa 2015-06-28 09:44:13 Re: SQLJSON
Previous Message Markus KARG 2015-06-28 09:17:59 Re: SQLJSON