Re: [HACKERS] [PATCH] Generic type subscripting

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Dian M Fay <dian(dot)m(dot)fay(at)gmail(dot)com>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, David Steele <david(at)pgmasters(dot)net>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Oleksandr Shulgin <oleksandr(dot)shulgin(at)zalando(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] Generic type subscripting
Date: 2021-01-12 19:02:59
Message-ID: CAFj8pRBcsLTFi0hT6Uz2K7zkgNianV4Zk8xqranEm3wcvEv+Mg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

ne 10. 1. 2021 v 19:52 odesílatel Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
napsal:

> Hi
>
>
>> I'm thinking of the update path as a kind of implicit schema. JSON is
>> intentionally not bound to any schema on creation, so I don't see a
>> failure to enforce another schema at runtime (and outside the WHERE
>> clause, at that) as an error exactly.
>>
>
> This concept is not consistent with other implemented behaviour.
>
> 1. The schema is dynamically enhanced - so although the path doesn't
> exists, it is created and data are changed
>
> postgres=# create table foo(a jsonb);
> CREATE TABLE
> postgres=# insert into foo values('{}');
> INSERT 0 1
> postgres=# update foo set a['a']['a'][10] = '0';
> UPDATE 1
> postgres=# select * from foo;
>
> ┌───────────────────────────────────────────────────────────────────────────────┐
> │ a
> │
>
> ╞═══════════════════════════════════════════════════════════════════════════════╡
> │ {"a": {"a": [null, null, null, null, null, null, null, null, null, null,
> 0]}} │
>
> └───────────────────────────────────────────────────────────────────────────────┘
> (1 row)
>
> So although the path [a,a,10] was not exists, it was created.
>
> 2. this update fails (and it is correct)
>
> postgres=# update foo set a['a']['a']['b'] = '0';
> ERROR: path element at position 3 is not an integer: "b"
>
> although the path [a,a,b] doesn't exists, and it is not ignored.
>
> This implementation doesn't do only UPDATE (and then analogy with WHERE
> clause isn't fully adequate). It does MERGE. This is necessary, because
> without it, the behaviour will be pretty unfriendly - because there is not
> any external schema. I think so this is important - and it can be little
> bit messy. I am not sure if I use correct technical terms - we try to use
> LAX update in first step, and if it is not successful, then we try to do
> LAX insert. This is maybe correct from JSON semantic - but for developer it
> is unfriendly, because he hasn't possibility to detect if insert was or was
> not successful. In special JSON functions I can control behave and can
> specify LAX or STRICT how it is necessity. But in this interface
> (subscripting) this possibility is missing.
>
> I think so there should be final check (semantically) if value was
> updated, and if the value was changed. If not, then error should be raised.
> It should be very similar like RLS update. I know and I understand so there
> should be more than one possible implementations, but safe is only one -
> after successful update I would to see new value inside, and when it is not
> possible, then I expect exception. I think so it is more practical too. I
> can control filtering with WHERE clause. But I cannot to control MERGE
> process. Manual recheck after every update can be terrible slow.
>

I tested behaviour and I didn't find anything other than the mentioned
issue.

Now I can check this feature from plpgsql, and it is working. Because there
is no special support in plpgsql runtime, the update of jsonb is
significantly slower than in update of arrays, and looks so update of jsonb
has O(N2) cost. I don't think it is important at this moment - more
important is fact, so I didn't find any memory problems.

postgres=# do $$
declare v int[] = array_fill(0, ARRAY[10,10,10,20]);
begin
for i1 in 1..10 loop
for i2 in 1..10 loop
for i3 in 1..10 loop
for i4 in 1..20 loop
v[i1][i2][i3][i4] = 10;
raise notice '% % % %', i1, i2, i3, i4;
end loop;
end loop;
end loop;
end loop;
end;
$$;

postgres=# do $$
declare v jsonb;
begin
for i1 in 1..10 loop
for i2 in 1..10 loop
for i3 in 1..10 loop
for i4 in 1..20 loop
v[i1][i2][i3][i4] = '10'::jsonb;
raise notice '% % % %', i1, i2, i3, i4;
end loop;
end loop;
end loop;
end loop;
end;
$$;

There are some unwanted white spaces

+ Jsonb *jsonbSource = DatumGetJsonbP(*op->resvalue);
+ sbsrefstate->prevvalue = jsonb_get_element(jsonbSource,
+ sbsrefstate->upperindex,
+ sbsrefstate->numupper,
+ &sbsrefstate->prevnull,
+ false);

+ workspace = palloc0(MAXALIGN(sizeof(JsonbSubWorkspace)) +
+ nupper * (sizeof(Datum) + sizeof(Oid)));
+ workspace->expectArray = false;
+ ptr = ((char *) workspace) + MAXALIGN(sizeof(JsonbSubWorkspace));

Regards

Pavel

> Regards
>
> Pavel
>
>
>
>> But I looked into the bulk case a little further, and "outside the
>> WHERE clause" cuts both ways. The server reports an update whether or
>> not the JSON could have been modified, which suggests triggers will
>> fire for no-op updates. That's more clearly a problem.
>>
>> insert into j (val) values
>> ('{"a": 100}'),
>> ('{"a": "200"}'),
>> ('{"b": "300"}'),
>> ('{"c": {"d": 400}}'),
>> ('{"a": {"z": 500}}');
>>
>> INSERT 0 5
>> update j set val['a']['z'] = '600' returning *;
>> val
>> ────────────────────────────────────
>> {"a": 100}
>> {"a": "200"}
>> {"a": {"z": 600}, "b": "300"}
>> {"a": {"z": 600}, "c": {"d": 400}}
>> {"a": {"z": 600}}
>> (5 rows)
>>
>> *UPDATE 5*
>>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-01-12 19:30:35 Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits
Previous Message Stephen Frost 2021-01-12 18:57:11 Re: Key management with tests