Re: jsonb_set() strictness considered harmful to data

From: "Mark Felder" <feld(at)FreeBSD(dot)org>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: jsonb_set() strictness considered harmful to data
Date: 2019-10-18 19:10:51
Message-ID: d7a4e2b2-e009-4d79-aa5b-82bcf96aad2f@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Fri, Oct 18, 2019, at 12:37, Ariadne Conill wrote:
> Hello,
>
> I am one of the primary maintainers of Pleroma, a federated social
> networking application written in Elixir, which uses PostgreSQL in
> ways that may be considered outside the typical usage scenarios for
> PostgreSQL.
>
> Namely, we leverage JSONB heavily as a backing store for JSON-LD
> documents[1]. We also use JSONB in combination with Ecto's "embedded
> structs" to store things like user preferences.
>
> The fact that we can use JSONB to achieve our design goals is a
> testament to the flexibility PostgreSQL has.
>
> However, in the process of doing so, we have discovered a serious flaw
> in the way jsonb_set() functions, but upon reading through this
> mailing list, we have discovered that this flaw appears to be an
> intentional design.[2]
>
> A few times now, we have written migrations that do things like copy
> keys in a JSONB object to a new key, to rename them. These migrations
> look like so:
>
> update users set info=jsonb_set(info, '{bar}', info->'foo');
>
> Typically, this works nicely, except for cases where evaluating
> info->'foo' results in an SQL null being returned. When that happens,
> jsonb_set() returns an SQL null, which then results in data loss.[3]
>
> This is not acceptable. PostgreSQL is a database that is renowned for
> data integrity, but here it is wiping out data when it encounters a
> failure case. The way jsonb_set() should fail in this case is to
> simply return the original input: it should NEVER return SQL null.
>
> But hey, we've been burned by this so many times now that we'd like to
> donate a useful function to the commons, consider it a mollyguard for
> the real jsonb_set() function.
>
> create or replace function safe_jsonb_set(target jsonb, path
> text[], new_value jsonb, create_missing boolean default true) returns
> jsonb as $$
> declare
> result jsonb;
> begin
> result := jsonb_set(target, path, coalesce(new_value,
> 'null'::jsonb), create_missing);
> if result is NULL then
> return target;
> else
> return result;
> end if;
> end;
> $$ language plpgsql;
>
> This safe_jsonb_set() wrapper should not be necessary. PostgreSQL's
> own jsonb_set() should have this safety feature built in. Without it,
> using jsonb_set() is like playing russian roulette with your data,
> which is not a reasonable expectation for a database renowned for its
> commitment to data integrity.
>
> Please fix this bug so that we do not have to hack around this bug.
> It has probably ruined countless people's days so far. I don't want
> to hear about how the function is strict, I'm aware it is strict, and
> that strictness is harmful. Please fix the function so that it is
> actually safe to use.
>
> [1]: JSON-LD stands for JSON Linked Data. Pleroma has an "internal
> representation" that shares similar qualities to JSON-LD, so I use
> JSON-LD here as a simplification.
>
> [2]: https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1(at)blaine(dot)gmane(dot)org
>
> [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an
> example of data loss induced by this issue.
>
> Ariadne
>

This should be directed towards the hackers list, too.

What will it take to change the semantics of jsonb_set()? MySQL implements safe behavior here. It's a real shame Postgres does not. I'll offer a $200 bounty to whoever fixes it. I'm sure it's destroyed more than $200 worth of data and people's time by now, but it's something.

Kind regards,

--
Mark Felder
ports-secteam & portmgr alumni
feld(at)FreeBSD(dot)org

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Christoph Moench-Tegeder 2019-10-18 21:50:18 Re: jsonb_set() strictness considered harmful to data
Previous Message Ariadne Conill 2019-10-18 17:37:24 jsonb_set() strictness considered harmful to data

Browse pgsql-hackers by date

  From Date Subject
Next Message Christoph Moench-Tegeder 2019-10-18 21:50:18 Re: jsonb_set() strictness considered harmful to data
Previous Message Justin Pryzby 2019-10-18 18:26:27 Re: v12.0: reindex CONCURRENTLY: lock ShareUpdateExclusiveLock on object 14185/39327/0 is already held