Re: jsonb, unicode escapes and escaped backslashes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb, unicode escapes and escaped backslashes
Date: 2015-01-28 17:36:58
Message-ID: 3373.1422466618@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> It's not clear to me how we should represent a unicode null. i.e. given
> a json of '["foo\u0000bar"]', I get that we'd store the element as
> 'foo\x00bar', but what is the result of

> (jsonb '["foo\u0000bar"]')->>0

> It's defined to be text so we can't just shove a binary null in the
> middle of it. Do we throw an error?

Yes, that is what I was proposing upthread. Obviously, this needs some
thought to ensure that there's *something* useful you can do with a field
containing a nul, but we'd have little choice but to throw an error if
the user asks us to convert such a field to unescaped text.
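
To make the failure mode concrete, here is a minimal Python sketch; it is
not PostgreSQL code. The helper name element_as_text is hypothetical and
only mimics the proposed behaviour of ->>: the decoded element can be held
internally with its nul, but converting it to text is refused.

```python
import json

def element_as_text(doc: str, index: int) -> str:
    """Hypothetical stand-in for (jsonb doc)->>index."""
    value = json.loads(doc)[index]
    if isinstance(value, str) and "\x00" in value:
        # A text value cannot hold an embedded nul, so refuse the conversion.
        raise ValueError("cannot convert a string containing \\u0000 to text")
    return value if isinstance(value, str) else json.dumps(value)

doc = r'["foo\u0000bar", "plain"]'
stored = json.loads(doc)[0]  # the decoded form keeps the nul between foo and bar
```

Under these assumptions, element_as_text(doc, 1) succeeds, while
element_as_text(doc, 0) raises instead of silently truncating.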

I'd be a bit inclined to reject nuls in object field names even if we
allow them in field values, since just about everything you can usefully
do with a field name involves regarding it as text.

Another interesting implementation problem is what indexing does with
such values --- ISTR there's an implicit conversion to C strings in there
too, at least in GIN indexes.
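
The generic hazard with C strings can be shown with Python's ctypes. This
only illustrates what happens to any nul-terminated string; it makes no
claim about how the GIN code actually handles its keys.

```python
import ctypes

value = b"foo\x00bar"                     # 7 bytes with a nul in the middle
buf = ctypes.create_string_buffer(value)  # copies every byte, appends a terminator
as_c_string = buf.value                   # read back as C would: stops at the first nul
all_bytes = buf.raw                       # the whole buffer, terminator included
```

Anything that reads the value as a C string sees only the part before the
nul, so two distinct values with the same prefix would look identical.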

Anyway, there is a significant amount of work involved here, and there's
no way we're getting it done for 9.4.1, or probably 9.4.anything. I think
our only realistic choice right now is to throw an error for \u0000 so that
we can preserve our options for doing something useful with it later.
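
As a rough sketch of rejecting at input time, assuming a hypothetical
input function parse_rejecting_nul (a real implementation would more
likely do this check in the lexer), the Python below refuses any document
whose decoded strings contain a nul. Note that an escaped backslash
followed by u0000 is ordinary text and has to pass.

```python
import json

def contains_nul(node) -> bool:
    """True if any string (object key or value) in the decoded JSON holds a nul."""
    if isinstance(node, str):
        return "\x00" in node
    if isinstance(node, list):
        return any(contains_nul(item) for item in node)
    if isinstance(node, dict):
        return any(contains_nul(k) or contains_nul(v) for k, v in node.items())
    return False

def parse_rejecting_nul(doc: str):
    """Hypothetical input function: parse, then refuse documents with \\u0000."""
    parsed = json.loads(doc)
    if contains_nul(parsed):
        raise ValueError("\\u0000 is not supported")
    return parsed
```

With this, r'["foo\\u0000bar"]' is accepted as a string containing a
literal backslash, while r'["foo\u0000bar"]' and r'{"a\u0000b": 1}' are
both rejected.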

regards, tom lane
