Re: Initial Review: JSON contrib modul was: Re: Another swing at JSON

From: Joey Adams <joeyadams3(dot)14159(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bernd Helmle <mailings(at)oopsware(dot)de>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Initial Review: JSON contrib modul was: Re: Another swing at JSON
Date: 2011-07-20 17:01:09
Message-ID: CAARyMpAcC7O99Nk4BGHQqMgLVi7vNTGD9Ff4tHbPVQA2MrHpqQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 20, 2011 at 6:49 AM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> Hm, I agree that we need to handle \uXXXX escapes in JSON input.
> We won't ever produce them during output though, right?

We could, to prevent transcoding errors when the client encoding
differs from the server encoding (and neither is SQL_ASCII, nor is
the client encoding UTF8). For example, if the database encoding is
UTF8 and the client encoding is WIN1252, I'd think it would be a good
idea to escape non-ASCII characters as \uXXXX on output.
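
A minimal sketch of that output escaping, written in Python for brevity
rather than in the module's C. The function name escape_non_ascii is
hypothetical and is not part of the patch. It relies on the fact that in
valid JSON, non-ASCII characters can only occur inside string literals, so
replacing every such character in the serialized text with a \uXXXX escape
(a surrogate pair for code points above U+FFFF) leaves the value unchanged
while making the text pure ASCII:

```python
def escape_non_ascii(json_text: str) -> str:
    """Return json_text with every non-ASCII character written as \\uXXXX.

    Hypothetical helper for illustration only. Assumes json_text is
    already valid JSON, so non-ASCII characters appear only inside
    string literals, where \\uXXXX escapes are legal.
    """
    out = []
    for ch in json_text:
        cp = ord(ch)
        if cp < 0x80:
            # ASCII passes through untouched.
            out.append(ch)
        elif cp < 0x10000:
            # Basic Multilingual Plane: a single \uXXXX escape.
            out.append("\\u%04X" % cp)
        else:
            # Above the BMP: JSON requires a UTF-16 surrogate pair.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out.append("\\u%04X\\u%04X" % (high, low))
    return "".join(out)


# The escaped text is pure ASCII, so it can be converted to WIN1252
# (Python codec name "cp1252") without a transcoding error.
escaped = escape_non_ascii('{"note": "♫"}')
escaped.encode("cp1252")
```

Whether the server should do this unconditionally or only when the client
encoding cannot represent a given character is a separate design question;
the sketch takes the simpler route and escapes everything outside ASCII.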

> How does that XML type handle the situation? It seems that it'd have
> the same problem with unicode entity references "&#XXXX;".

From the looks of it, XML operations convert the text to UTF-8 before
passing it to libxml. The XML type does not normalize the input;
SELECT '&#9835;♫'::xml; simply yields &#9835;♫. Escapes of any
character appear to be allowed in any encoding.
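
To illustrate why character references sidestep the encoding problem, here
is a small Python sketch. It uses Python's xml.etree.ElementTree as a
stand-in parser (an assumption; PostgreSQL itself uses libxml), and the
helper names parsed_text and to_ascii_xml are hypothetical. It shows that
the reference &#9835; and the literal ♫ denote the same character once
parsed, and that a document can be made ASCII-only by writing every
non-ASCII character as a numeric reference:

```python
import xml.etree.ElementTree as ET


def parsed_text(xml_doc: str) -> str:
    """Parse xml_doc and return the root element's text content."""
    return ET.fromstring(xml_doc).text


def to_ascii_xml(xml_doc: str) -> bytes:
    """Encode xml_doc as ASCII, turning non-ASCII characters into &#N; references.

    Uses Python's built-in "xmlcharrefreplace" error handler. This is only
    safe for characters in text or attribute values, not in tag names.
    """
    return xml_doc.encode("ascii", "xmlcharrefreplace")


doc = "<note>&#9835;♫</note>"

# Both spellings resolve to U+266B after parsing.
same_char_twice = parsed_text(doc)

# The ASCII-only form survives any ASCII-compatible client encoding.
ascii_doc = to_ascii_xml(doc)
```

This mirrors the JSON case: an escape mechanism in the format itself lets a
value round-trip through an encoding that lacks the character.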

- Joey
