Re: JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Joey Adams <joeyadams3(dot)14159(at)gmail(dot)com>, PavelStehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?
Date: 2012-05-01 20:16:01
Message-ID: 1335903361.3106.247.camel@hvost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2012-05-01 at 12:54 -0400, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > Let me just point out two things. First, we are approaching a beta release.
> > The time for changing this is long since gone, IMNSHO.
>
> This is our last chance to get it right, so that argument doesn't seem
> to me to carry a lot of weight ...
>
> > Second, RFC 4627 is absolutely clear: a valid JSON value can only be an
> > object or an array,

No, according to RFC, a valid "JSON value" can be:

an object, an array, a number, a string, or one of false null true

>From RFC:
-------------------------
1. Introduction

JavaScript Object Notation (JSON) is a text format for the
serialization of structured data. It is derived from the object
literals of JavaScript, as defined in the ECMAScript Programming
Language Standard, Third Edition [ECMA].

JSON can represent four primitive types (strings, numbers, booleans,
and null) and two structured types (objects and arrays).
...

2.1. Values

A JSON value MUST be an object, array, number, or string, or one of
the following three literal names:

false null true

-------------------------

By having our JSON type mean a "JSON value" instead of "JSON
text" (which indeed is required to be array or object) we could make it
easy for all extension types to provide casts to "JSON value" and thus
automatically plug them into postgreSQL's built-in JSON support.

I would very much like this the *_to_array() functions first try a cast
to json when converting values, so that for example after the following
cast it would do the right thing for hstore .

CREATE FUNCTION hstore_to_json(IN hvalue hstore, OUT jvalue json) AS $$
return '{%s}' % hvalue.replace('"=>"','":"')
$$ LANGUAGE plpythonu;

CREATE CAST (hstore AS json)
WITH FUNCTION hstore_to_json(hstore)
AS IMPLICIT
;

hannu=# select *, datadict::json from test_hstore;
id | datadict | datadict
----+---------------------------------+---------------------------------
1 | "baz"=>"whatever", "foo"=>"bar" | {"baz":"whatever", "foo":"bar"}
2 | "bar"=>"the same", "foo"=>"bar" | {"bar":"the same", "foo":"bar"}
(2 rows)

Currently it seems to be hardwired to do datum --> text conversions

hannu=# select row_to_json(test_hstore) from test_hstore;
row_to_json
---------------------------------------------------------------
{"id":1,"datadict":"\"baz\"=>\"whatever\", \"foo\"=>\"bar\""}
{"id":2,"datadict":"\"bar\"=>\"the same\", \"foo\"=>\"bar\""}
(2 rows)

I'd like it to try datum --> json first and yield

hannu=# select row_to_json(test_hstore) from test_hstore;
row_to_json
---------------------------------------------------------------
{"id":1,"datadict":{"baz":"whatever", "foo":"bar"}}
{"id":2,"datadict":{"bar":"the same", "foo":"bar"}}
(2 rows)

This exact case could be made to work even with "JSON text" meaning ob
json type, but some other types may not be so lucky.

FOr example imagine a tri-value booean with textual values "yes", "no",
and "don't know" . Logical mapping to json would be true, false, null,
but we can't easily provide a triboolean --> json cast for this if we
require json value to be "JSON text" and don't accept "JSON values"

> > so this thing about converting arbitrary datum values
> > to JSON is a fantasy.

It should be possible to cast them to "JSON value", but not always "JSON
text" which indeed has to be array or object .

> > If anything, we should adjust the JSON input routines
> > to disallow anything else, rather than start to output what is not valid
> > JSON.

Nah, I'd like us to accept what other JSON parsers usually accept,
especially the original one described in
http://www.json.org/fatfree.html which cited one way to parse json to be

responseData = eval('(' + responseText + ')');

:)

But then I also like their statement when comparing JSON to XML :

JSON has no validator. Being well-formed and valid is not the same as
being correct and relevant. Ultimately, every application is responsible
for validating its inputs. This cannot be delegated.

> ... but this one does.

It does, _if_ we accept that json type is for "JSON text" and not "JSON
value". in which case we might need also a json_value type for
extensible casting to and from json.

> regards, tom lane

--
-------
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2012-05-01 20:22:32 Re: Temporary tables under hot standby
Previous Message Tom Lane 2012-05-01 20:14:08 Re: proposal: additional error fields