Re: data to json enhancements

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: data to json enhancements
Date: 2012-09-27 13:22:40
Message-ID: CA+TgmoZ9DDi8RrkepWzFBBB+b3YRO2=sbhh2XN+PG=LrSNcuMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 26, 2012 at 1:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Also, on reflection I'm not sure about commandeering cast-to-json for
> this --- aren't we really casting to "json member" or something like
> that? The distinction between a container and its contents seems
> important here. With a container type as source, it might be important
> to do something different if we're coercing it to a complete JSON
> value versus something that will be just one member. I'm handwaving
> here because I don't feel like going back to re-read the RFC, but
> it seems like something that should be considered carefully before
> we lock down an assumption that there can never be a difference.

I feel like there are two different behaviors that someone might want
here, and a cast cannot mean both.

1. Please reinterpret the existing value that I have already got as a
JSON object. For example, you might have a text field in which you
have been storing JSON values. Once you upgrade to 9.2, you might
want to reinterpret the existing contents of the field - which are
already valid JSON - as JSON objects.

2. Please convert the value that I have into a JSON object according
to a type-specific rule. For example, you might have a text field in
which you store arbitrary strings. But perhaps you need to store
structured data there, so once you upgrade to 9.2 you might want to
wrap up your strings inside JSON strings.

Now there is some subtle ambiguity here because in some cases the
behavior can be exactly the same in both cases. For example most
numeric values will get the same treatment either way, but NaN cannot.
If you do mynumeric::json, interpretation #1 will fail for NaN but
interpretation #2 will probably produce something like "NaN".
Similarly if the type is boolean, we could likely get away with
producing true and false for either interpretation. If the type is
hstore, then #1 is going to fail, but #2 is going to convert "1"=>"2"
to {"1":"2"}. So in general it might seem that #2 is the better
interpretation, because it gives many casts a sensible interpretation
that is otherwise lacking.

But, what about text? It seems to me that users will count on the
fact that '[1,2,3]'::text::json is going to produce [1,2,3] (a JSON
array containing the first three numbers) and NOT "[1,2,3]" (a JSON
string containing 7 characters). And that is emphatically
interpretation #1. I think it would be an extremely bad idea to
decide that casts should have interpretation #2 for all data types
except things that are kind of like text, which should instead behave
like #1. And if we standardize completely on interpretation #2, then
I think that '[1,2,3]'::json will end up meaning something different
from '[1,2,3]'::text::json, because the former will (IIUC) go through
the type-input function and end up creating a JSON array, whereas the
latter will go through the cast function and end up creating a JSON
string. It would also mean that in more complex queries, you could
get substantially different behavior in simple cases where the parser
maintains literals as unknown vs. more complex cases where it decides
that they must be text for lack of a full type inference system.

Maybe I am being too pedantic about this and there is a way to make it
all work nicely, but it sure feels like using the casting machinery
here is blending together two different concepts that are only
sometimes the same.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] Perhaps something to consider for a future extension of the standard.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-09-27 13:25:14 Re: system_information.triggers & truncate triggers
Previous Message Amit Kapila 2012-09-27 13:08:39 Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation