Re: Using JSONB directly from application

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Christian Ohler <ohler(at)shift(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Using JSONB directly from application
Date: 2018-06-24 16:42:55
Message-ID: f490fe7b-c0bd-62fb-7499-e0bd30e379fa@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06/22/2018 11:52 PM, Christian Ohler wrote:
> (continuing an old thread from
> https://www.postgresql.org/message-id/CAMsr%2BYEtamQYZ5EocsuthQCvyvmRnQrucDP6GZynPtf0gsMbuw%40mail.gmail.com
> )
>
> Craig Ringer <craig(at)2ndquadrant(dot)com>, 2018-02-26:
>
>> On 26 February 2018 at 04:05, Anthony Communier
> <anthony(dot)communier(at)gmail(dot)com> wrote:
>> 
>> > It would be nice if application connected to a Postrgesql database could
>> > send and receive JSONB in binary. It could save some useless text
>> > conversion. All works could be done on application side which are often
>> > more scalable than database itself.
>>
>> To support this, you'd need to extract PostgreSQL's jsonb support into a C
>> library that could be used independently of backend server infrastructure
>> like 'palloc' and memory contexts, ereport(), etc. Or write a parallel
>> implementation.
>
> At Shift, we also have use cases that would likely be sped up quite a
> bit if we could avoid the conversion from JSONB to JSON, and instead
> pass binary JSONB to the application side and parse it there (in Go).  I
> doubt we'd want to reuse any of Postgres's C code, and would instead go
> with your "parallel implementation" idea; I can't imagine it being
> particularly difficult to implement a JSONB parser from scratch.
>
> All we need, I think, is a Postgres function raw_jsonb(jsonb) that
> returns bytea but is the identity function at the byte level.  (Or allow
> a cast from jsonb to bytea.)
>
> Our Go code would then send queries like SELECT col1, col2,
> raw_jsonb(col3) FROM table1 WHERE ...; I haven't thought in depth about
> how we'd parse the JSONB in Go, but perhaps we can synthesize a stream
> of JSON tokens from the binary JSONB (one token at a time, to avoid
> copies and allocations) and adapt the streaming parser
> https://github.com/json-iterator/go to turn it into Go values.
>
> Sending raw JSONB to Postgres might also be interesting, but I'd start
> with receiving.
>
> Would implementing raw_jsonb be as trivial as it sounds?  What about
> usages like SELECT raw_jsonb(col3->'foo'); does the subobject returned
> by '->' share structure with the containing object, making the
> conversion to a self-contained JSONB value less direct?
>
> Can these conversions be implemented without copying the bytes?
>

I don't think you need the function, actually. PostgreSQL protocol
supports both text and binary mode - in the text mode the server formats
everything as text before sending it to the client. I guess this is what
you mean by "convert to json".

But with the extended protocol you (or rather the connection library
you're using) can specify that the output should be handed in binary,
i.e. as exact copy of the data. This happens at "bind" phase, see the
"Bind" message docs here:

https://www.postgresql.org/docs/current/static/protocol-message-formats.html

>
> An open question about the API contract would be how raw_jsonb would be
> affected if Postgres introduces a version 2 of JSONB encoding.  My
> intuition is to punt that problem to the application, and define that
> raw_jsonb returns whatever version of JSONB is most convenient for
> Postgres for that particular datum; this minimizes conversion work on
> the Postgres side, which is the purpose of the mechanism.  Applications
> that want a stable format can use the conventional textual JSON format. 
> But I could see a case for making the function raw_jsonb(int, jsonb) and
> allowing the caller to specify what (maximum?) version of JSONB they want.
>

I doubt we'll introduce a new JSONB any time soon, so I wouldn't be
particularly worried about this. If it eventually happens, you'll have
to adapt your parser to that, I think.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Cramer 2018-06-24 17:49:43 Re: Supporting tls-server-end-point as SCRAM channel binding for OpenSSL 1.0.0 and 1.0.1
Previous Message Chris Travers 2018-06-24 14:27:06 Re: utilities to rebuild commit logs from wal