Re: json/jsonb inconsistence

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: json/jsonb inconsistence
Date: 2014-05-29 12:12:16
Message-ID: 53872420.2000207@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 05/29/2014 07:55 AM, Teodor Sigaev wrote:
> # select '"\uaBcD"'::json;
> json
> ----------
> "\uaBcD"
>
> but
>
> # select '"\uaBcD"'::jsonb;
> ERROR: invalid input syntax for type json
> LINE 1: select '"\uaBcD"'::jsonb;
> ^
> DETAIL: Unicode escape values cannot be used for code point values
> above 007F when the server encoding is not UTF8.
> CONTEXT: JSON data, line 1: ...
>
> and
>
> # select '"\uaBcD"'::json -> 0;
> ERROR: invalid input syntax for type json
> DETAIL: Unicode escape values cannot be used for code point values
> above 007F when the server encoding is not UTF8.
> CONTEXT: JSON data, line 1: ...
> Time: 0,696 ms
>
> More than, json looks strange:
>
> # select '["\uaBcD"]'::json;
> json
> ------------
> ["\uaBcD"]
>
> but
>
> # select '["\uaBcD"]'::json->0;
> ERROR: invalid input syntax for type json
> DETAIL: Unicode escape values cannot be used for code point values
> above 007F when the server encoding is not UTF8.
> CONTEXT: JSON data, line 1: [...
>
> Looks like random parse rules.
>

It is documented that for json we don't check the validity of unicode
escapes until we try to use them. That's because the original json
parser didn't check them, and if we started doing so now users who
pg_upgraded would find themselves with invalid data in the database. The
rules for jsonb are more strict, because we actually resolve the unicode
escapes when constructing the jsonb object. There is nothing at all
random about it, although I agree it's slightly inconsistent. It's been
the subject of some discussion on -hackers previously, IIRC. I actually
referred to this difference in my talk at pgCon last Friday.

Frankly, if you want to use json/jsonb, you are going to be best served
by using a UTF8-encoded database, or not using non-ascii unicode escapes
in json at all.

cheers

andrew

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2014-05-29 12:14:00 Re: [PATCH] Replacement for OSSP-UUID for Linux and BSD
Previous Message Greg Stark 2014-05-29 12:12:14 backup_label revisited