Re: Using JSONB directly from application

From: Christian Ohler <ohler(at)shift(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Using JSONB directly from application
Date: 2018-06-22 21:52:36
Message-ID: CAOsiKEL7+KkV0C_hAJWxqwTg+PYVfiGPQ0yjFww7ECtqwBjb+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(continuing an old thread from
https://www.postgresql.org/message-id/CAMsr%2BYEtamQYZ5EocsuthQCvyvmRnQrucDP6GZynPtf0gsMbuw%40mail.gmail.com
)

Craig Ringer <craig(at)2ndquadrant(dot)com>, 2018-02-26:

> On 26 February 2018 at 04:05, Anthony Communier
<anthony(dot)communier(at)gmail(dot)com> wrote:
>
> > It would be nice if application connected to a Postrgesql database could
> > send and receive JSONB in binary. It could save some useless text
> > conversion. All works could be done on application side which are often
> > more scalable than database itself.
>
> To support this, you'd need to extract PostgreSQL's jsonb support into a C
> library that could be used independently of backend server infrastructure
> like 'palloc' and memory contexts, ereport(), etc. Or write a parallel
> implementation.

At Shift, we also have use cases that would likely be sped up quite a bit
if we could avoid the conversion from JSONB to JSON, and instead pass
binary JSONB to the application side and parse it there (in Go). I doubt
we'd want to reuse any of Postgres's C code, and would instead go with your
"parallel implementation" idea; I can't imagine it being particularly
difficult to implement a JSONB parser from scratch.

All we need, I think, is a Postgres function raw_jsonb(jsonb) that returns
bytea but is the identity function at the byte level. (Or allow a cast
from jsonb to bytea.)

Our Go code would then send queries like SELECT col1, col2, raw_jsonb(col3)
FROM table1 WHERE ...; I haven't thought in depth about how we'd parse the
JSONB in Go, but perhaps we can synthesize a stream of JSON tokens from the
binary JSONB (one token at a time, to avoid copies and allocations) and
adapt the streaming parser https://github.com/json-iterator/go to turn it
into Go values.

Sending raw JSONB to Postgres might also be interesting, but I'd start with
receiving.

Would implementing raw_jsonb be as trivial as it sounds? What about usages
like SELECT raw_jsonb(col3->'foo'); does the subobject returned by '->'
share structure with the containing object, making the conversion to a
self-contained JSONB value less direct?

Can these conversions be implemented without copying the bytes?

An open question about the API contract would be how raw_jsonb would be
affected if Postgres introduces a version 2 of JSONB encoding. My
intuition is to punt that problem to the application, and define that
raw_jsonb returns whatever version of JSONB is most convenient for Postgres
for that particular datum; this minimizes conversion work on the Postgres
side, which is the purpose of the mechanism. Applications that want a
stable format can use the conventional textual JSON format. But I could
see a case for making the function raw_jsonb(int, jsonb) and allowing the
caller to specify what (maximum?) version of JSONB they want.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-06-22 22:18:55 Re: Speeding up INSERTs and UPDATEs to partitioned tables
Previous Message Alvaro Herrera 2018-06-22 21:51:59 Re: bug with expression index on partition