Re: JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Joey Adams <joeyadams3(dot)14159(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?
Date: 2012-05-01 23:11:02
Message-ID: 23251.1335913862@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On Tue, May 1, 2012 at 9:56 AM, Joey Adams <joeyadams3(dot)14159(at)gmail(dot)com> wrote:
>> No, the RFC says (emphasis mine):
>> 
>> A JSON *text* is a serialized object or array.
>> 
>> If we let the JSON type correspond to a *value* instead, this
>> restriction does not apply, and the JSON type has a useful recursive
>> definition.

> I think you're playing with words. But in any case, the RFC says this
> regarding generators:
> 5. Generators
>    A JSON generator produces JSON text.  The resulting text MUST
>    strictly conform to the JSON grammar.

I read over the RFC, and I think the only reason why they restricted
JSON texts to represent just a subset of JSON values is this cute
little hack in section 3 (Encoding):

   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.
           00 00 00 xx  UTF-32BE
           00 xx 00 xx  UTF-16BE
           xx 00 00 00  UTF-32LE
           xx 00 xx 00  UTF-16LE
           xx xx xx xx  UTF-8

They need a guaranteed 2 ASCII characters to make that work, and
they won't necessarily get that many with a bare string literal.
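
The null-pattern check quoted above can be sketched in a few lines of
Python. This is only an illustration of the RFC's table, not code from
PostgreSQL; the function name detect_json_encoding is hypothetical, and
the sketch assumes no byte order mark and at least four octets of input.

```python
def detect_json_encoding(octets: bytes) -> str:
    """Guess the encoding of a JSON text from its first four octets.

    Hypothetical helper: it applies the RFC 4627 section 3 table and
    returns "unknown" when the null pattern matches no row.
    """
    if len(octets) < 4:
        raise ValueError("this sketch needs at least four octets")
    # True where the octet is NUL (00), False where it is non-NUL (xx).
    nulls = tuple(b == 0 for b in octets[:4])
    patterns = {
        (True, True, True, False): "UTF-32BE",    # 00 00 00 xx
        (True, False, True, False): "UTF-16BE",   # 00 xx 00 xx
        (False, True, True, True): "UTF-32LE",    # xx 00 00 00
        (False, True, False, True): "UTF-16LE",   # xx 00 xx 00
        (False, False, False, False): "UTF-8",    # xx xx xx xx
    }
    return patterns.get(nulls, "unknown")


# An object starts with two ASCII characters, so the table applies.
print(detect_json_encoding('{"a":1}'.encode("utf-16-le")))
# A bare string literal guarantees only one ASCII character (the quote);
# with a non-ASCII second character the pattern matches no row.
print(detect_json_encoding('"\u4e2d"'.encode("utf-16-be")))
```

The second call shows the failure mode: the opening quote is followed
immediately by a non-ASCII character, so the first four octets are
00 22 4E 2D, which fits none of the five patterns.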

Since for our purposes there is not, and never will be, any need to
figure out whether a JSON input string is encoded in UTF-16 or UTF-32,
I find myself agreeing with the camp that says we might as well consider
that our JSON type corresponds to JSON values, not JSON texts.  I also
notice that json_in() seems to believe that already.
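
The value-versus-text distinction can be made concrete with a small
Python sketch. It uses Python's standard json module purely as an
analogy and says nothing about how json_in() is implemented; the helper
names is_json_value and is_json_text are hypothetical.

```python
import json


def _parse(s: str):
    """Return (True, parsed) if s parses as JSON, else (False, None)."""
    try:
        return True, json.loads(s)
    except ValueError:
        return False, None


def is_json_value(s: str) -> bool:
    """True if s is any JSON value: scalar, array, or object."""
    ok, _ = _parse(s)
    return ok


def is_json_text(s: str) -> bool:
    """True only if s is a JSON text in the RFC 4627 sense,
    that is, a serialized object or array."""
    ok, parsed = _parse(s)
    return ok and isinstance(parsed, (dict, list))


for candidate in ['"hello"', '42', '[1, "a"]', '{"k": null}']:
    print(candidate, is_json_value(candidate), is_json_text(candidate))
```

Under the "value" reading, all four candidates are acceptable; under
the stricter "text" reading, only the array and the object are.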

However, that doesn't mean I'm sold on the idea of getting rid of
array_to_json and row_to_json in favor of a universal "to_json()"
function.  In particular, both of those have optional "pretty_bool"
arguments that don't fit nicely at all in a generic conversion
function.  The meaning of that flag is very closely tied to the
input being an array or record respectively.

I'm inclined to leave these functions as they are, and consider
adding a universal "to_json(anyelement)" (with no options) later.
Because it would not have options, it would not be meant to cover
cases where there's value in formatting or conversion options;
so it wouldn't render the existing functions entirely obsolete,
nor would it mean there would be no need for other specialized
conversion functions.

			regards, tom lane
