Re: JSON for PG 9.2

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Joey Adams <joeyadams3(dot)14159(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Claes Jakobsson <claes(at)surfar(dot)nu>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Jan Urbański <wulczer(at)wulczer(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, Jan Wieck <janwieck(at)yahoo(dot)com>
Subject: Re: JSON for PG 9.2
Date: 2012-01-22 16:43:34
Message-ID: 4F1C3CB6.6090104@dunslane.net
Lists: pgsql-hackers

On 01/22/2012 04:28 AM, Andrew Dunstan wrote:
>
>
> On 01/21/2012 11:40 PM, Jeff Janes wrote:
>> On Sun, Jan 15, 2012 at 8:08 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>>
>>> On 01/14/2012 03:06 PM, Andrew Dunstan wrote:
>>>>
>>>>
>>>>
>>>> OK, here's a patch that does both query_to_json and array_to_json,
>>>> along with docs and regression tests. It includes Robert's original
>>>> patch, although I can produce a differential patch if required. It
>>>> can also be pulled from <https://bitbucket.org/adunstan/pgdevel>
>>>>
>>>>
>>>
>>> Here's an update that adds row_to_json, plus a bit more cleanup.
>> This is bit-rotted such that initdb fails
>>
>> creating template1 database in
>> /tmp/bar/src/test/regress/./tmp_check/data/base/1 ... FATAL: could
>> not create unique index "pg_proc_oid_index"
>> DETAIL: Key (oid)=(3145) is duplicated.
>>
>> I bumped up those oids in the patch, and it passes make check once I
>> figured out how to get the test to run under UTF-8. Is it supposed to
>> pass under other encodings? I can't tell from the rest of the thread
>> whether it is supposed to pass in other encodings or not.
>>
>
> Yeah, regression tests generally are supposed to run in all encodings.
> Either we could knock out the offending test, or we could supply an
> alternative result file. If we do the latter, maybe we should modify
> the query slightly, so it reads
>
> SELECT getdatabaseencoding() = 'UTF8' as is_utf8, '"\uaBcD"'::json;
>
>

Actually, given recent discussion I think that test should just be
removed from json.c. We don't actually have any test that the code point
is valid (e.g. that it doesn't refer to an unallocated code point). We
don't do that elsewhere either - the unicode_to_utf8() function the
scanner uses to turn \unnnn escapes into UTF-8 doesn't look for
unallocated code points. I'm not sure how much other validation we
should do - for example on correct use of surrogate pairs. I'd rather
get this as right as possible now - every time we tighten encoding rules
to make sure incorrectly encoded data doesn't get into the database it
causes someone real pain.
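
To make the surrogate-pair question concrete, here is a minimal sketch of what such a check might look like. It is written in Python for illustration only and is not what json.c or unicode_to_utf8() does; the function name surrogate_problems and the problem messages are hypothetical. It only checks that \uXXXX escapes are well formed and that surrogates come in high/low pairs. It makes no attempt to detect unallocated code points.

```python
# Illustrative sketch only: names and messages are assumptions,
# not taken from the PostgreSQL source.
HEX_DIGITS = set("0123456789abcdefABCDEF")
HIGH = range(0xD800, 0xDC00)  # high (leading) surrogates
LOW = range(0xDC00, 0xE000)   # low (trailing) surrogates


def surrogate_problems(body):
    # body is the text between the quotes of a JSON string, escapes intact.
    # Returns a list of (offset, message) tuples; empty means no problems.
    problems = []
    pending = None  # offset of a high surrogate still waiting for its low half

    def flush():
        # A high surrogate was not immediately followed by a low one.
        nonlocal pending
        if pending is not None:
            problems.append((pending, "unpaired high surrogate"))
            pending = None

    i = 0
    while i < len(body):
        if body[i] == "\\" and body[i + 1:i + 2] == "u":
            digits = body[i + 2:i + 6]
            if len(digits) != 4 or any(c not in HEX_DIGITS for c in digits):
                flush()
                problems.append((i, "malformed \\u escape"))
                i += 2
                continue
            cp = int(digits, 16)
            if cp in HIGH:
                flush()      # two high surrogates in a row
                pending = i
            elif cp in LOW:
                if pending is None:
                    problems.append((i, "unpaired low surrogate"))
                pending = None  # pair complete (or error already recorded)
            else:
                flush()
            i += 6
        else:
            flush()
            # Skip other two-character escapes such as \\ or \n as a unit,
            # so an escaped backslash followed by 'u' is not misread.
            i += 2 if body[i] == "\\" else 1
    flush()
    return problems
```

With this sketch the escape from the regression test, \uaBcD, yields no problems, since it is not in the surrogate range, while a lone \ud83d is reported as an unpaired high surrogate.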

cheers

andrew
