From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [PATCH] Generalized JSON output functions
Date: 2015-05-20 13:16:20
Message-ID: CACACo5TjBAHYRJJc-U788DnfXbbPasX_nGUHFd+=cehA6SRs-w@mail.gmail.com
Lists: pgsql-hackers

Hi, Hackers!

Attached is a patch against master to generalize the JSON-producing
functions in utils/adt/json.c and to provide a set of callbacks which can
be overridden, much like what is already provided for *parsing* JSON.

The motivation behind this is to be able to produce specially-crafted JSON
in a logical replication output plugin, such that numeric (and bigint)
values are quoted. This requirement, in turn, arises from the fact that
the JavaScript specification, whose implementations are quite natural to
expect as consumers of this JSON data, allows significant digits to be
silently dropped when converting from a string to a number object.

I believe this is a well-known problem and I'm aware of a number of tricks
that might be used to avoid it, but none of them seems to be optimal from
my standpoint.

I can also imagine this being used to convert date/time values to strings
differently, to add indentation depending on the depth in the object
hierarchy, etc.

Apart from providing callbacks, this patch abstracts most of the code for
producing the correct JSON structure, which was previously scattered and
repeated in a number of functions with slight differences. The current
code has five styles of producing a JSON object string, differing in
whitespace only:

a) no spaces

select to_json(row(1,2));
to_json
-----------------
{"f1":1,"f2":2}

b) some spaces (hstore_to_json)

select hstore(row(1,2))::json;
hstore
------------------------
{"f1": "1", "f2": "2"}

c) spaces around colon

select json_build_object('f1',1,'f2',2);
json_build_object
----------------------
{"f1" : 1, "f2" : 2}

d) spaces around colon *and* curly braces

select json_object_agg(x,x) from unnest('{1,2}'::int[]) x;
json_object_agg
----------------------
{ "1" : 1, "2" : 2 }

e) line feeds (row_to_json_pretty)

select row_to_json(row(1,2), true) as row;
row
----------
{"f1":1,+
"f2":2}

Personally, I think we should stick to (b); however, that would break a lot
of test cases that already depend on (a). I've tried hard to minimize the
amount of changes in expected/json.out, but it quickly becomes cumbersome
trying to support all of the above formats. So I've altered (c) and (d) to
look like (b); naturally, only whitespace was affected.
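With the patch applied, (c) would therefore render in style (b), for
example:

select json_build_object('f1',1,'f2',2);
  json_build_object
----------------------
 {"f1": 1, "f2": 2}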

There's one corner case I don't see a sensible way to support:

select json_agg(x) from generate_series(1,5) x;
json_agg
-----------------
[1, 2, 3, 4, 5]

With the patch applied it puts line feeds between the array elements
instead of spaces.

What also bothers me is that I've hard-coded the output function OIDs for
cstring_out and textout on the assumption that they never change, but I
would like to know that for sure.

Feedback is very much welcome!
--
Alex

PS: using a different email address this time, same Alex Shulgin. ;-)

PPS: sample code for mentioned use case with quoting numeric and bigint:

static void
out_value_quote_numerics(JsonOutContext *out, Datum val,
						 JsonTypeCategory tcategory, Oid typoid,
						 Oid outfuncoid, bool key_scalar)
{
	char	   *outputstr;

	if (typoid == INT8OID || typoid == NUMERICOID)
	{
		/* Emit int8 and numeric values as quoted JSON strings. */
		out->before_value(out);

		outputstr = OidOutputFunctionCall(outfuncoid, val);
		escape_json(&out->result, outputstr);
		pfree(outputstr);

		out->after_value(out, key_scalar);
	}
	else
		json_out_value(out, val, tcategory, typoid, outfuncoid, key_scalar);
}
...
json_out_init_context(out, JSON_OUT_USE_SPACES);
out->value = out_value_quote_numerics;

Attachment Content-Type Size
json-output-generalized-v1.patch text/x-patch 51.1 KB
