Re: patch: Add JSON datatype to PostgreSQL (GSoC, WIP)

From: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Terry Laurenzo <tj(at)laurenzo(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch: Add JSON datatype to PostgreSQL (GSoC, WIP)
Date: 2010-10-18 10:56:03
Message-ID: AANLkTi=vrmC=Z33qmsyDycFdCiUVM490VntJbNRZ6F5e@mail.gmail.com
Lists: pgsql-hackers

On Sun, Oct 17, 2010 at 5:18 AM, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> wrote:
> Reading your proposal, I'm now +1 for BSON-like style. Especially JS
> engine's capabilities to map external data to the language
> representation are good news.

Hmm, we could store postgres' data types as-is with their type oids.
I'm not sure whether it is efficient or worth doing, though.

> I like as simple design as we can accept. ISTM format, I/O interface,
> simple get/set, mapping tuple from/to object, and indexing are minimum
> requirement.

+1 to starting small, but even simple get/set is already debatable.
For example, text/json conversion:
A. SELECT '<json>'::json;
B. SELECT '<text>'::text::json;

In the git repo, A calls parse_json_to_bson_as_vardata(), so the input
must be valid JSON. OTOH, B calls pgjson_json_from_text(), so the
input can be any text. That difference is surprising. I think we have
no other choice but to define the text-to-json cast as parsing. The same
can be said for json-to-text -- the type-output function vs. extracting
a text value from json.
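
To make the two possible meanings concrete, here is a rough sketch in
Python. It is not the code in the git repo; cast_as_parse() and
cast_as_wrap() are names made up for illustration, standing in for the
"parse" and "set a text value" behaviors described above.

```python
import json

def cast_as_parse(text):
    """Parse semantics: the input must already be JSON text.

    Raises ValueError if the text is not valid JSON.
    """
    return json.dumps(json.loads(text))

def cast_as_wrap(text):
    """Wrap semantics: any text is accepted and becomes a JSON string value."""
    return json.dumps(text)

sample = '{"a": 1}'
parsed = cast_as_parse(sample)   # a JSON object with one key
wrapped = cast_as_wrap(sample)   # a JSON string that merely contains braces
```

The same input yields an object under one meaning and a quoted string
under the other, which is why the two cast paths should not disagree.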

I think casting text to/from json should behave in the same way as type
input/output. The xml type works in the same manner. If so, for
consistency we might provide no casts to/from json at all, even though
casts for non-text types would cause no problems.

Here are the issues we need to settle before starting on json types, even
for the simplest cases:
----
1. where to implement json core: external library vs. inner postgres
2. internal format: text vs. binary (*)
3. encoding: always UTF-8 vs. database encoding (*)
4. meaning of casts text to/from json: parse/stringify vs. get/set
5. parser implementation: flex/bison vs. hand-coded.
----
(*) Note that we will want to compare two json values in the future. So,
we might need to normalize the internal format even for a text
representation.
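
A small sketch of why normalization matters for comparison, again in
Python and only as an illustration. Sorting keys and stripping
whitespace is just one possible canonical form that I am assuming here;
normalize() is a hypothetical helper, not something the patch defines.

```python
import json

def normalize(text):
    """Return one canonical text form: sorted keys, no extra whitespace."""
    return json.dumps(
        json.loads(text),
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )

a = '{"b": 1, "a": [1, 2]}'
b = '{ "a":[1,2], "b":1 }'

same_as_text = (a == b)                          # raw text comparison
same_normalized = (normalize(a) == normalize(b)) # comparison after normalizing
```

The raw strings differ even though they describe the same object, so a
text-based internal format would need some such step before comparing.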

The most interesting parts of json types, including indexing and jsonpath,
would be built on top of the json core. We need conclusions on those
issues first.

--
Itagaki Takahiro
