Re: WIP Patch: Add a function that returns binary JSONB as a bytea

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christian Ohler <ohler(at)shift(dot)com>, Kevin Van <kevinvan(at)shift(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP Patch: Add a function that returns binary JSONB as a bytea
Date: 2018-11-02 11:08:35
Message-ID: 20181102110835.GQ4184@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Christian Ohler <ohler(at)shift(dot)com> writes:
> > On Wed, Oct 31, 2018 at 7:22 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> If we're going to expose the
> >> internal format, let's just change the definition of the type's binary
> >> I/O format, thereby getting a win for purposes like COPY BINARY as well.
>
> > How would this work from the driver's and application's perspective? What
> > does the driver do when receiving JSONB data?
>
> Well, upthread it was posited that applications that read binary JSONB
> data would be willing to track changes in that format (or else have no
> need to, because they don't do anything with it except feed it back to the
> server). If that isn't the case, then this entire thread is a waste of
> time. I certainly don't buy that exposing the internal format via some
> other mechanism than binary I/O would be a sufficient excuse for not
> worrying about cross-version compatibility.

Yes, I agree that the applications/libraries would need to expect and
be able to handle changes to the format, though, of course, we'd only
change the format across major versions; we wouldn't do it in a point
release. There might be some argument for supporting multiple versions
as well, though unfortunately we don't really have support for anything
like that today.

> > The idea behind the proposal is to improve efficiency by avoiding
> > conversions, and the most straightforward way to do that is for every layer
> > to pass through the raw bytes.
>
> This argument is, frankly, as bogus as it could possibly be. In the first
> place, you're essentially saying that ignoring version compatibility
> considerations entirely is the way to avoid future version compatibility
> problems. I don't buy it. In the second place, you already admitted
> that format conversion *is* necessary; what PG finds best internally is
> unlikely to be exactly what some other piece of software will want.
> So we'd be better off agreeing on some common interchange format.

There will definitely need to be *some* kind of conversion happening.
The point of doing this would be to give us benefits similar to those we
have when passing binary data with the PG protocol today. Sure, the data
as the application gets it back probably isn't exactly in the format the
application would like, but at least it's typically just moving bytes
around (network byte order to host byte order, putting things into some
application-side structure, etc). Today, with jsonb, we have the data
stored as binary, but we have to go through the
binary->text->binary conversion for every binary value and that's far
from free.
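
To make the "just moving bytes around" point concrete, here is a rough
client-side sketch. It is not taken from the patch or from any driver.
The function names are hypothetical, and the only assumption is the usual
binary wire layout for int4 and float8 (fixed width, network byte order).
It contrasts a fixed-width unpack with parsing the text representation of
the same value:

```python
import struct

# Hypothetical helper names, for illustration only.

def decode_int4_binary(payload: bytes) -> int:
    """Unpack a 4-byte signed integer sent in network (big-endian) order."""
    (value,) = struct.unpack("!i", payload)
    return value

def decode_float8_binary(payload: bytes) -> float:
    """Unpack an 8-byte IEEE 754 double sent in network (big-endian) order."""
    (value,) = struct.unpack("!d", payload)
    return value

def decode_int4_text(payload: bytes) -> int:
    """Parse the same integer from its decimal text form.

    This is the step a client repeats for every number embedded in a
    jsonb value that arrives as JSON text.
    """
    return int(payload.decode("ascii"))

# The same value as it might arrive in each format.
wire_binary = struct.pack("!i", 123456)
wire_text = b"123456"

same_value = decode_int4_binary(wire_binary) == decode_int4_text(wire_text)
```

The binary path is a byte swap into a native integer, while the text path
has to scan and convert digits, and the server had to produce those
digits from its binary form in the first place.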

> I'm still bemused by the proposition that that common interchange format
> shouldn't be, um, JSON. We've already seen BSON, BJSON, etc die
> well-deserved deaths. Why would jsonb internal format, which was never
> for one second intended to be seen anywhere outside PG, be a better
> interchange-format design than those were?

I'd suggest that it's because our jsonb format doesn't have the
limitations that the others had. That said, I also agree that we
wouldn't stream out the *exact* format that's on disk, but the point, at
least to me, is to avoid bouncing back and forth between binary and text
representation of things like integers, floats, timestamps, etc, where
we already support binary-format results being sent out of PG to an
application.

Thanks!

Stephen
