Re: WIP Patch: Add a function that returns binary JSONB as a bytea

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, kevinvan(at)shift(dot)com, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP Patch: Add a function that returns binary JSONB as a bytea
Date: 2018-11-02 19:34:50
Message-ID: 20181102193450.GS4184@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Merlin Moncure (mmoncure(at)gmail(dot)com) wrote:
> On Fri, Nov 2, 2018 at 11:15 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > * Merlin Moncure (mmoncure(at)gmail(dot)com) wrote:
> > > I'll still stand by the other point I made, though; I'd
> > > really want to see some benchmarks demonstrating benefit over
> > > competing approaches that work over the current formats. That should
> > > frame the argument as to whether this is a good idea.
> >
> > What are the 'competing approaches' you're alluding to here? Sending
> > text-format json across as we do today?
>
> Yep -- exactly. For example, write a C client program that recursed
> the structure and dumped it to stdout or assigned to dummy variables
> (being mindful of compiler optimizations). I'd be contrasting this to
> a C parsed json that did essentially the same thing, and rigging a
> high scale test on the back of that. The assumption here is that the
> ultimate consumer is not, say, a browser, but some client app that can
> actually exploit the performance advantages (else, why bother?).

If transferring the data in binary doesn't show a performance
improvement then I could agree that it wouldn't be sensible to do, but I
also find that very unlikely to be the case.

As for what language it's written in, I don't think that matters much.
I'd very much expect binary to be more performant if you're working in
C, of course, but there's no point comparing C-parsed json into some C
structure vs. psycopg2 ingesting binary data and building a Python json
object. What matters is whether it'd be faster for psycopg2 to pull in
binary-json data and put it into a Python json object, or to parse the
text-json data and put the result into the same Python object. In my
view, there's something clearly quite wrong if the text-json data
format is faster at that.

> In my experience with arrays and composites, you can see significant
> performance reduction and throughput increase in certain classes of
> queries. However, some of the types that were the worst offenders
> (like timestamps) have been subsequently optimized and/or are
> irrelevant to json since they'd be passed as text anyways.

I've had very good success transferring timestamps as binary, so I'm not
quite sure what you're getting at here.

Thanks!

Stephen
