Re: jsonb concatenate operator's semantics seem questionable

From: Ryan Pedela <rpedela(at)datalanche(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Ilya Ashchepkov <koctep(at)gmail(dot)com>
Subject: Re: jsonb concatenate operator's semantics seem questionable
Date: 2015-05-18 20:04:56
Message-ID: CACu89FSmZiPDXAEbnKAXmL5xABkHhmO63AshRcxAYt+64EG4Mw@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 18, 2015 at 12:24 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
> On 05/18/2015 08:57 AM, Ryan Pedela wrote:
> > If not, deep concatenation would solve this problem, but I can also see
> > another solution. Use + for shallow concatenation since it really means
> > "add element to top-level path" as Peter suggests. Then add another
> > function: jsonb_add( target jsonb, path text[], new jsonb ) to add
> > element at any arbitrary path. Then leave || for deep concatenation in
> > 9.6 or whenever.
>
> Since swapping the operator seems still on the table, is there any
> particular reason why you think "+" is more suited to shallow
> concatenation? Both you and Peter have said this, but as a heavy user
> of JSON/JSONB, to me it seems the other way around. That is, "+" says
> "add to arbitrary nested node" to me more than "||" does.
>

Let me back up a little. I always like to think about the ideal interface
first and worry about implementation second, because the implementation
can always be changed but the interface can't. I think the current
concat/merge interface is the ideal. It should be || because that means
concat/merge everywhere else in the PG interface that I am aware of. Since
JSON is a hierarchical data structure, || should be implemented as a deep
merge, which by definition satisfies a shallow merge. This is what I would
expect as a user, and I would think there was a bug if it didn't perform a
deep merge. I expect this because I can easily implement a shallow merge
myself using JavaScript, Python, etc., but a deep merge is non-trivial.
Therefore I would expect a special JSON concat/merge library function to do
a deep merge. I would rather the interface stay the same and it be
documented that the current implementation is a shallow merge and may
become a deep merge in the future.
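
To make the two behaviors concrete, here is a minimal Python sketch of
shallow versus deep merge on plain dicts. The function names and sample
documents are made up for illustration, and replacing arrays and scalars
wholesale in the deep case is an assumption of this sketch, not something
settled in this thread or a description of what PostgreSQL does.

```python
def shallow_merge(a, b):
    """Top-level merge: each key in b replaces the same key in a wholesale."""
    merged = dict(a)
    merged.update(b)
    return merged


def deep_merge(a, b):
    """Recursive merge: nested objects are merged key by key.

    Assumption of this sketch: anything that is not an object on both
    sides (arrays, scalars, mixed types) is simply replaced by b's value.
    """
    merged = dict(a)
    for key, value in b.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


doc = {"name": "x", "address": {"city": "Denver", "zip": "80202"}}
patch = {"address": {"zip": "80203"}}

shallow = shallow_merge(doc, patch)  # "address" is replaced as a whole
deep = deep_merge(doc, patch)        # only "address.zip" changes
```

With these inputs the shallow result loses "city" because the whole
"address" object is replaced, while the deep result keeps it.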

In the context of splitting shallow and deep merge into two operators, I
think + is better for shallow and || better for deep. The reason for + is
that many programming languages have this behavior. If I see the below
code in a language I have never used before:

objC = objA + objB

My default assumption is that + performs a shallow merge. Like I said, I
would rather there just be one operator.

> > If jsonb_replace() satisfies #4 then I think everything is fine. Without
> > #4 however, jsonb would remain an incomplete document database solution
> > in my opinion.
>
> Oh, no question, we're still incomplete. Aside from nested append, we
> kinda lack easy sharded scale-out, which is a rather more major feature,
> no?

I think which one is more important depends on your point of view. If you
have a massive dataset, then obviously sharding is more important. But my
own take on why NoSQL became so popular has only a little to do with
sharding. MongoDB pitched to tech entrepreneurs "use our database and
implement your MVP 10x faster/easier and we have sharding when you become
the next Google". And it worked brilliantly. Many tech entrepreneurs are
worried about time constraints and dream of becoming the next Google
(myself included). But the reality is that most fail, and the majority who
don't fail achieve moderate success; only a handful reach Google-level
success. Therefore the vast majority end up never needing sharding, but
they all experience that advertised 10x development speed improvement. I
doubt it really is 10x, but JSON maps very well to programming language
data structures (no impedance mismatch), so it is usually faster to build
prototypes with MongoDB.

If jsonb supported nested append, then I think that would be enough for
people who care most about development speed, which I think is a larger
group than the group with massive datasets. In addition, sharding seems
like a server-level or database-level issue rather than a data type issue.
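
For reference, this is roughly what I mean by nested append, sketched in
Python. The function name append_at_path is hypothetical, and it only
illustrates the semantics on plain Python data; it is not the proposed
jsonb_add() or any existing PostgreSQL function. One assumption to note:
array positions in the path are Python ints here, whereas a text[] path
would carry them as strings.

```python
import copy


def append_at_path(target, path, new):
    """Return a copy of target with `new` appended to the array at `path`.

    `path` is a list of object keys (str) and array positions (int).
    Raises KeyError/IndexError if the path does not exist and TypeError
    if it does not end at an array.
    """
    result = copy.deepcopy(target)  # leave the caller's document untouched
    node = result
    for step in path:
        node = node[step]
    if not isinstance(node, list):
        raise TypeError("path does not point to an array")
    node.append(new)
    return result


doc = {"user": {"name": "x", "tags": ["a"]}}
updated = append_at_path(doc, ["user", "tags"], "b")
```

Here the new element lands in user.tags without the caller having to read
the document, modify it, and write the whole thing back by hand.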
