Re: to_json(NULL) should to return JSON null instead NULL

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: to_json(NULL) should to return JSON null instead NULL
Date: 2015-08-29 21:42:11
Message-ID: CAKFQuwaVZLfmkE_07Qx0gwVZLuEsdtGdpAOnPyKxMph+gF6C=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Aug 29, 2015 at 4:27 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> writes:
> > On 8/29/15 12:29 PM, Pavel Stehule wrote:
> >> what is correct from JSON perspective? All fields with NULL
>
> > ISTM that the whole purpose of to_json is to properly jsonify something,
> > and the proper json form for "undefined" is 'null', is it not?
>
> What's not entirely clear is what we should do with cases like
>
> regression=# select array_to_json(null::int[]);
> array_to_json
> ---------------
>
> (1 row)
>
> regression=# select row_to_json(null::record);
> row_to_json
> -------------
>
> (1 row)
>
> If we leave those alone (and in the latter case, in particular, there is
> not enough information available to do much else) then it's not so clear
> that changing to_json() is really improving consistency overall.
> For instance, do we really want row_to_json(null::record) and
> to_json(null::record) giving different results? Or if we make them
> both return "null", that breaks the previous invariant that row_to_json
> always yields a JSON object.
>
> An advantage of leaving these things as strict is that the user can easily
> substitute whatever specific behavior she wants for NULLs via coalesce(),
> as was shown upthread. If we put in a different behavior, then the
> only way to override it would be with a CASE, which is tedious and creates
> multiple-evaluation issues.
>

​Wouldn't NULLIF(any, any) be another means of overriding?​

> I'm not necessarily against changing it --- but it doesn't seem entirely
> black-and-white to me, and we do now have a couple of versions worth
> of precedent we'd be breaking with.
>
> If we do vote to change it, I'd want to do so now (ie in 9.5) rather than
> create yet another year's worth of precedent.
>
>
​I don't see COALESCE as being particularly problematic since we are going
to JSON which means that in (nearly?) all situations all of the potential
input values will be homogeneous and a single coalesce can apply the
relevant logic at the point of conversion. Since row_to_json and
record_to_json effectively deal with heterogeneous input types ​the same
opportunity is not available to them and simply incorporating null into the
output for that position is the only reasonable thing to do.

I'm tending to favor removing the strict modifier and having NULL => json
'null' with the advice that the old behavior can be obtained by writing
"NULLIF(to_json(...), json 'null')".

If we go this route we should also consider returning json 'null' for
(record|array)_to_json(NULL) at the same time.

My experience with the json feature is has strictly been the json_to_record
direction...but it seems like a wasted opportunity to be useful in the
default case when JSON provides a null concept that is semantically similar
to the concept in SQL. While I'm hesitant to change this without
representative use cases to work from the theory argument holds enough
weight to strongly consider making the change.

David J.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2015-08-29 22:52:51 Re: buffer README is out of date
Previous Message Alvaro Herrera 2015-08-29 21:36:32 Re: WIP: About CMake v2