| From: | Florents Tselai <florents(dot)tselai(at)gmail(dot)com> |
|---|---|
| To: | pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | More jsonpath methods: translate, split, join |
| Date: | 2026-04-13 09:56:56 |
| Message-ID: | CA+v5N41e1PgC3yHPQN9Zbo46Pq5a0vVf04P3MXtC_cpWoJT6HA@mail.gmail.com |
| Lists: | pgsql-hackers |
Hello hackers,
This is a follow-up to the work recently merged in bd4f879.
In hindsight, I regret not pushing these through for the previous cycle,
as they represent the "missing pieces" for users trying to perform data
cleaning entirely within the JSONPath engine.
With these we can significantly reduce the need for users to "drop out" of
JSONPath into standard SQL for basic string-to-string and
string-to-array (and back) workflows.
select jsonb_path_query('" A,b,C "',
  '$.btrim().lower().split(",").join("-").replace("a","x").upper() starts with "X-B"');
 jsonb_path_query
------------------
 true
(1 row)
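For comparison, here is a rough sketch of the same cleanup written with today's SQL functions, i.e. the "drop out" of JSONPath mentioned above. It assumes the new methods behave like the SQL functions they wrap; the `#>> '{}'` step is only there to extract the JSON string as text.

```sql
-- Illustration only: plain SQL equivalent of the jsonpath chain above.
-- Extract the JSON string as text, then apply ordinary string/array functions.
SELECT starts_with(
         upper(
           replace(
             array_to_string(
               string_to_array(lower(btrim('" A,b,C "'::jsonb #>> '{}')), ','),
               '-'),
             'a', 'x')),
         'X-B') AS result;   -- true (shown as "t" in psql)
```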
This patch series adds three new methods to the jsonpath engine:
$.translate(from, to)
A straightforward wrapper around the standard translate() function.
It handles character-by-character mapping and is a natural companion to the
recently merged .replace().
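As a reminder of the semantics being wrapped, the existing SQL translate() behaves as sketched below. Whether the jsonpath method matches it in every edge case is an assumption here, not something verified against the patch.

```sql
-- Existing SQL function; each character in "from" maps to the character
-- at the same position in "to".
SELECT translate('12345', '143', 'ax');      -- a2x5

-- Characters in "from" with no counterpart in "to" are deleted:
-- '-' becomes ' ', '_' is dropped.
SELECT translate('a-b_c', '-_', ' ');        -- a bc

-- The same operation applied to a JSON string today, outside jsonpath.
SELECT to_jsonb(translate('"12345"'::jsonb #>> '{}', '143', 'ax'));   -- "a2x5"
```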
$.split(delimiter [, null_string])
A wrapper around string_to_array().
While we already have .split_part(), that only returns a single token.
.split() allows for a full "explosion" of a string into a JSON array.
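The underlying SQL functions show the difference between the two. This is a sketch using existing functions only; the to_jsonb() call is just an illustration of the JSON array shape one would expect from .split(), not output taken from the patch.

```sql
-- Single token, as with split_part(): only the 2nd field comes back.
SELECT split_part('a,b,c', ',', 2);                  -- b

-- Full split into an array.
SELECT string_to_array('xx~~yy~~zz', '~~');          -- {xx,yy,zz}

-- With null_string: fields equal to 'yy' become NULL.
SELECT string_to_array('xx~~yy~~zz', '~~', 'yy');    -- {xx,NULL,zz}

-- The same array rendered as JSON.
SELECT to_jsonb(string_to_array('a,b,c', ','));      -- ["a", "b", "c"]
```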
$.join(delimiter [, null_string])
The inverse of .split(), wrapping array_to_string().
The input must be an array of strings or nulls.
No implicit casting of numbers or booleans is attempted. This is consistent
with how other jsonpath string methods handle type mismatches, and can always
be relaxed in a follow-up if there's appetite for it.
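For reference, this is how the wrapped SQL function treats NULLs, sketched with existing functions only. Note that array_to_string() itself accepts any array type, so the string-only restriction described above is specific to .join().

```sql
-- Plain join.
SELECT array_to_string(ARRAY['a', 'b', 'c'], '-');         -- a-b-c

-- Without null_string, NULL elements are skipped.
SELECT array_to_string(ARRAY['a', NULL, 'c'], '-');        -- a-c

-- With null_string, NULL elements are replaced by it.
SELECT array_to_string(ARRAY['a', NULL, 'c'], '-', '*');   -- a-*-c

-- Joining a JSON array of strings today requires leaving jsonpath.
SELECT string_agg(elem, '-' ORDER BY ord)
FROM jsonb_array_elements_text('["a", "b", "c"]'::jsonb)
     WITH ORDINALITY AS t(elem, ord);                      -- a-b-c
```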
A Note on Lax vs. Strict Semantics:
In this implementation, I have kept .join() behavior consistent between lax
and strict modes regarding type mismatches (i.e., both currently error
on non-string elements).
While lax mode traditionally handles auto-unwrapping of arrays, .join()
is unique in that it operates on the array as a collective unit rather than
iterating through it to produce multiple results.
I've left the behavior as "strict-equivalent" for now to remain conservative,
but I am open to discussion on whether lax should instead skip non-string
elements or attempt to "auto-wrap" scalars into single-element arrays.
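To make the two lax behaviors under discussion concrete, the sketch below uses only existing methods (no patch needed). It is offered as a point of comparison, not as a statement of how .join() should behave.

```sql
-- Lax mode unwraps the array, so .floor() runs once per element:
-- two result rows, 1 and 2.
SELECT jsonb_path_query('[1.5, 2.5]', 'lax $.floor()');

-- .size() treats the array as one unit and returns a single row: 2.
SELECT jsonb_path_query('[1.5, 2.5]', 'lax $.size()');

-- Lax mode auto-wraps a scalar for .size(), giving 1.
SELECT jsonb_path_query('"a"', 'lax $.size()');

-- Strict mode raises an error for the same scalar input:
-- SELECT jsonb_path_query('"a"', 'strict $.size()');
```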
The .split() and .join() methods introduce a shift in how we handle item
methods. Historically, most string methods in our engine are scalar-to-scalar,
with .keyvalue() being the only exception so far.
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Add-.translate-from-to-jsonpath-method.patch | application/octet-stream | 15.0 KB |
| v1-0002-Add-.split-delimiter-null_string-jsonpath-method.patch | application/octet-stream | 17.9 KB |
| v1-0003-Add-jsonpath-.join-sep-method.patch | application/octet-stream | 19.9 KB |