From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: jsonb, unicode escapes and escaped backslashes |
Date: | 2015-01-27 23:27:17 |
Message-ID: | 10254.1422401237@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Tue, Jan 27, 2015 at 12:40 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> In particular, I would like to suggest that the current representation of
>> \u0000 is fundamentally broken and that we have to change it, not try to
>> band-aid around it. This will mean an on-disk incompatibility for jsonb
>> data containing U+0000, but hopefully there is very little of that out
>> there yet. If we can get a fix into 9.4.1, I think it's reasonable to
>> consider such solutions.
>>
>> The most obvious way to store such data unambiguously is to just go ahead
>> and store U+0000 as a NUL byte (\000). The only problem with that is that
>> then such a string cannot be considered to be a valid value of type TEXT,
>> which would mean that we'd need to throw an error if we were asked to
>> convert a JSON field containing such a character to text.
> Hm, does this include text out operations for display purposes? If
> so, then any query selecting jsonb objects with null bytes would fail.
> How come we have to error out? How about a warning indicating the
> string was truncated?

No, that's not a problem, because jsonb_out would produce an escaped
textual representation, so any null would come out as \u0000 again.
The trouble comes up when you do something that's supposed to produce
a *non escaped* text equivalent of a JSON string value, such as
the ->> operator.

Arguably, ->> is broken already with the current coding, in that
these results are entirely inconsistent:

regression=# select '{"a":"foo\u0040bar"}'::jsonb ->> 'a';
 ?column?
----------
 foo@bar
(1 row)

regression=# select '{"a":"foo\u0000bar"}'::jsonb ->> 'a';
   ?column?
--------------
 foo\u0000bar
(1 row)

regression=# select '{"a":"foo\\u0000bar"}'::jsonb ->> 'a';
   ?column?
--------------
 foo\u0000bar
(1 row)
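
The inconsistency in the three queries above can be modelled outside the server. The following is only a rough Python sketch, not PostgreSQL's implementation: `legacy_extract_text` and `strict_extract_text` are hypothetical names, the first imitating the output shown above (U+0000 left as a six-character escape) and the second imitating the proposal to raise an error when a string containing U+0000 is converted to text. Python's `json` module stands in for the JSON parser.

```python
import json

# The three JSON documents from the psql session above, as raw strings.
DOC_AT = r'{"a":"foo\u0040bar"}'            # escape for '@'
DOC_NUL = r'{"a":"foo\u0000bar"}'           # escape for U+0000
DOC_BACKSLASH = r'{"a":"foo\\u0000bar"}'    # literal backslash followed by "u0000"


def legacy_extract_text(doc: str, key: str) -> str:
    """Hypothetical model of the output shown above: every escape is
    decoded except U+0000, which stays as the six characters \\u0000."""
    return json.loads(doc)[key].replace("\x00", "\\u0000")


def strict_extract_text(doc: str, key: str) -> str:
    """Hypothetical model of the proposal: the string really contains
    U+0000, and converting such a string to text raises an error."""
    value = json.loads(doc)[key]
    if "\x00" in value:
        raise ValueError("U+0000 cannot be represented in a text value")
    return value


def escaped_output(doc: str) -> str:
    """Model of an escaped textual representation: U+0000 is written
    back out as an escape, so nothing is lost on output."""
    return json.dumps(json.loads(doc))


# Two distinct JSON strings become indistinguishable in the legacy model.
same_legacy = legacy_extract_text(DOC_NUL, "a") == legacy_extract_text(DOC_BACKSLASH, "a")
distinct_values = json.loads(DOC_NUL)["a"] != json.loads(DOC_BACKSLASH)["a"]
print(same_legacy, distinct_values)
```

In this model the second and third documents hold different string values, yet the legacy extraction returns the same text for both, which is the ambiguity at issue. The strict variant keeps them distinct at the cost of an error for the U+0000 case, while the escaped output path is unaffected either way.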
regards, tom lane