Re: jsonb, unicode escapes and escaped backslashes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb, unicode escapes and escaped backslashes
Date: 2015-01-29 21:59:46
Message-ID: CA+Tgmoa9ZG+Mp04oVeabyCeDiY-n=2BTRqT5ZiZGs8j414JukQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 29, 2015 at 4:33 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> On 01/29/2015 12:10 PM, Robert Haas wrote:
>> On Wed, Jan 28, 2015 at 12:48 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> The cause of this bug is commit 0ad1a816320a2b539a51628e2a0b1e83ff096b1d,
>>> which I'm inclined to think we need to simply revert, not render even
>>> more squirrely.
>> Yes, that commit looks broken. If you convert from text to JSON you
>> should get a JSON string equivalent to the text you started out with.
>> That commit departs from that in favor of allowing \uXXXX sequences in
>> the text being converted to turn into a single character (or perhaps
>> an encoding error) after a text->json->text roundtrip. Maybe I
>> haven't had enough caffeine today, but I can't see how that can
>> possibly be a good idea.
>
> Possibly.
>
> I'm coming down more and more on the side of Tom's suggestion just to ban
> \u0000 in jsonb. I think that would let us have some fairly simple and
> consistent rules. I'm not too worried that we'll be disallowing input that
> we've previously allowed. We've done that often in the past, although less
> often in point releases. I certainly don't want to wait a full release cycle
> to fix this if possible.

I have yet to understand what we fix by banning \u0000. How is 0000
different from any other four-digit hexadecimal number that's not a
valid character in the current encoding? What does banning that one
particular value do?

In any case, whatever we do about that issue, the idea that the text
-> json string transformation can *change the input string into some
other string* seems like an independent problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2015-01-29 22:02:33 Re: Proposal: two new role attributes and/or capabilities?
Previous Message Roger Pack 2015-01-29 21:58:14 Re: 4B row limit for CLOB tables