| From: | Florents Tselai <florents(dot)tselai(at)gmail(dot)com> |
|---|---|
| To: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
| Cc: | pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Patch: Add tsmatch JSONPath operator for granular Full Text Search |
| Date: | 2026-02-27 05:59:38 |
| Message-ID: | CA+v5N43E0eC82QiLKrUBN7FDO8OmBDYN6Jrs1mGfh=t4Zf5pdg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, Feb 26, 2026 at 8:48 AM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
>
> > On Feb 1, 2026, at 19:02, Florents Tselai <florents(dot)tselai(at)gmail(dot)com>
> wrote:
> >
> >
> >
> >
> > On Mon, Jan 26, 2026 at 7:22 PM Florents Tselai <
> florents(dot)tselai(at)gmail(dot)com> wrote:
> > Hi,
> >
> > in real-life I work a lot with json & fts search, here's a feature I've
> always wished I had,
> > but never tackle it. Until yesterday that is.
> >
> > SELECT jsonb_path_query(doc, '$.comments[*] ? (@.user == "Alice" &&
> @.body tsmatch "performance")');
> >
> > This patch introduces a tsmatch boolean operator to the JSONPath engine.
> > By integrating FTS natively into path expressions,
> > this operator allows for high-precision filtering of nested JSONB
> structures—
> > solving issues with structural ambiguity and query complexity.
> >
> > Currently, users must choose between two suboptimal paths for FTS-ing
> nested JSON:
> > - Imprecise Global Indexing
> > jsonb_to_tsvector aggregates text into a flat vector.
> > This ignores JSON boundaries, leading to false positives when the same
> key (e.g., "body")
> > appears in different contexts (e.g., a "Product Description" vs. a
> "Customer Review").
> >
> > - Complex SQL Workarounds
> > Achieving 100% precision requires unnesting the document via
> jsonb_array_elements and LATERAL joins.
> > This leads to verbose SQL and high memory overhead from generating
> intermediate heap tuples.
> >
> > One of the most significant advantages of tsmatch is its ability to
> participate in multi-condition predicates
> > within the same JSON object - something jsonb_to_tsvector cannot do.
> >
> > SELECT jsonb_path_query(doc, '$.comments[*] ? (@.user == "Alice" &&
> @.body tsmatch "performance")');
> >
> > In a flat vector, the association between "Alice" and "performance" is
> lost.
> > tsmatch preserves this link by evaluating the FTS predicate in-place
> during path traversal.
> >
> > While the SQL/JSON standard (ISO/IEC 9075-2) does not explicitly define
> an FTS operator,
> > tsmatch is architecturally modeled after the standard-defined like_regex.
> >
> > The implementation follows the like_regex precedent:
> > it is a non-indexable predicate that relies on GIN path-matching for
> pruning and heap re-checks for precision.
> > Caching is scoped to the JsonPathExecContext,
> > ensuring 'compile-once' efficiency per execution without violating the
> stability requirements of prepared statements.
> >
> > This initial implementation uses plainto_tsquery.
> > However, the grammar is designed to support a "mode" flag (similar to
> like_regex flags)
> > in future iterations to toggle between to_tsquery, websearch_to_tsquery,
> and phraseto_tsquery.
> >
> > Here's a v2, that implements the tsqparser clause
> >
> > So this should now work too
> >
> > select jsonb_path_query_array('["fast car", "slow car", "fast and
> furious"]', '$[*] ? (@ tsmatch "fast car" tsqparser "w")
> <v2-0001-Add-tsmatch-JSONPath-operator-for-granular-Full-T.patch>
>
> Hi Florents,
>
> Grant pinged me about this. I can review it in coming days. Can you please
> rebase it? I failed to apply to current master. Also, the CF reported a
> failure test case, please take a look.
Hi Evan,
thanks for having a look. The conflict was due to the intro of
pg_fallthrough. Not related to this patch .
I noticed the failure too, but I'm having a hard time reproducing it tbh.
This fails for Debian Trixie with Meson. The same with Autoconf passes...
https://github.com/Florents-Tselai/postgres/runs/65098077968
| Attachment | Content-Type | Size |
|---|---|---|
| v3-0001-Add-tsmatch-JSONPath-operator-for-granular-Full-T.patch | application/octet-stream | 31.0 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Amit Kapila | 2026-02-27 06:01:14 | Re: [PATCH] Support automatic sequence replication |
| Previous Message | Chao Li | 2026-02-27 05:58:36 | Re: Show comments in \dRp+, \dRs+, and \dX+ psql meta-commands |