| From: | Florents Tselai <florents(dot)tselai(at)gmail(dot)com> |
|---|---|
| To: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
| Cc: | pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: More jsonpath methods: translate, split, join |
| Date: | 2026-06-23 15:30:47 |
| Message-ID: | CA+v5N41w29NfC5cjZn5nmy7M0EHFysED52tT8JLaYYOL1H+hvg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Jun 23, 2026 at 6:16 PM Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
wrote:
> On Fri, Jun 19, 2026 at 3:19 AM Florents Tselai <florents(dot)tselai(at)gmail(dot)com>
> wrote:
>
>>
>> On Tue, Jun 16, 2026 at 10:53 PM Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
>> wrote:
>>
>>> On Mon, Apr 13, 2026 at 5:57 AM Florents Tselai <
>>> florents(dot)tselai(at)gmail(dot)com> wrote:
>>>
>>>> Hello hackers,
>>>>
>>>> This is a follow-up to the work recently merged in bd4f879.
>>>> In hindsight, I regret not pushing these through for the previous cycle,
>>>> as they represent the "missing pieces" for users trying to perform data
>>>> cleaning entirely within the JSONPath engine.
>>>> With these we can significantly reduce the need for users to "drop out"
>>>> of JSONPath
>>>> into standard SQL for basic string-to-string-or-array-and-back
>>>> workflows.
>>>>
>>>>
>>> Seems this one got overlooked, so I took a look. The first patch
>>> applies, but the second needs a rebase.
>>>
>>
>> Thanks for having a look Corey,
>>
>> here's a consolidated v2 .
>>
>
> My familiarity with the the internal json stuff is limited to my
> experience in reworking the I/O format of pg_ndistinct and pg_dependencies,
> plus some other misadventures in statistics export, so I apologize if some
> of these questions are naive.
>
> jsonpath.h:
>
> jpiStrJoin is added in the middle of the JsonPathItemType, but jpiStrSplit
> and jpiStrTranslate were added to the end of the enum. The comment
> specifically states that "the order must not be changed and new values must
> be added to the end" - is there some exception to this rule? If there is an
> exception should we modify the comment and have TAP test to show that it
> doesn't break pg_upgrade?
>
>
> jsonpath.c:
>
> In jspInitBuffer(), you add 3 new cases, but they aren't in a row, aren't
> organized alphabetically, or any other guiding organization that I can
> derive.
>
> The order of new types in the Assert() in jspGetLeftArg() is different
> than the order in jspGetRightArg(). Is there a reason they're different?
>
> jsonpath_exec.c:
>
> Everything here checks out. There were some patterns that initially
> concerned me, but they all have precedent elsewhere in the file.
>
> jsonpath_gram.y and jsonpath_scan.l:
> LGTM
>
> jsonb_jsonpath.out/sql
>
> +-- Empty string delimiter (splits into individual characters)
> +select jsonb_path_query('"abc"', '$.split("")');
> + jsonb_path_query
> +------------------
> + ["abc"]
>
> Like Zsolt, I am confused by this test. The description does not match the
> result. Based on the description, I would expect ["a", "b", "c"], but the
> output here looks like it was from a '$.split(",")' or any delimiter that
> isn't one of the letters in the string.
>
I have this planned, that comment is wrong and should read ;
+-- Empty string delimiter (returns the original string as a single
element)
>
> the underlying function text_to_array is accessible via SQL
> string_to_array, and there we get:
>
> corey=# select string_to_array('foo', '');
> string_to_array
> -----------------
> {foo}
> (1 row)
>
> corey=# select string_to_array('foo', NULL);
> string_to_array
> -----------------
> {f,o,o}
> (1 row)
>
> Assuming we don't find that behavior to be a bug, we should change the
> description of the test, or change the second parameter from '' to NULL.
>
> Moving along, the first "silent => true" query could use a comment. In
> general a comment above every test is nice for telling future developers
> what specifically is being tested.
>
> func-json.sgml:
>
> The entry for split() makes no mention of the proper behavior when the
> delimiter is empty
>
> Summary:
>
> The things I'm noticing seem like very random coding choices, which makes
> me think that they weren't choices, they're just randomness introduced by
> squashing and rebasing on top of bd4f879a9cdd. Could you check into that?
>
Thanks for your thoroughness.
From a quick review I think most of your comments are spot on, and the enum
ordering especially should be considered a bit critical as it could break
backwards compatibility and pg_upgrade.
I'll incorporate these in an upcoming v4.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Corey Huinker | 2026-06-23 15:32:48 | Re: use of SPI by postgresImportForeignStatistics |
| Previous Message | Corey Huinker | 2026-06-23 15:16:30 | Re: More jsonpath methods: translate, split, join |