Re: Extract numeric filed in JSONB more effectively

From: Haibo Yan <tristan(dot)yim(at)gmail(dot)com>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>, Chapman Flack <chap(at)anastigmatix(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Extract numeric filed in JSONB more effectively
Date: 2026-04-07 08:31:16
Message-ID: CABXr29GhnWxrGXD2YijYSDxwObGLqOsf8gvfzrFu=UcYU71XYA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 6, 2026 at 11:14 PM Andy Fan <zhihuifan1213(at)163(dot)com> wrote:

>
> Hi Haibo,
>
> > I’d like to continue pushing this patch forward.
> > Based on the earlier discussion, I reworked the patch into a smaller
> > stage-1 version with a narrower scope and a simpler rewrite
> > strategy. ..
> > and does not yet try to cover array/path extraction or integer/float
> > typed extractors.
>
> Thanks for working on this. I did a quick comparison between this
> version and my last patch v18 [1], here is the difference.
>
> My previous v18[1]: 542 insertions(+), 59 deletions(-)
> Your patch: 543 insertions(+), 5 deletions(-)
>
> However as what you have realized, your current patch still lack of many
> optimizations, Not only the integer/float stuff, but also many
> functions, e.g. jsonb_array_element, jsonb_extract_path,
> jsonb_path_query and jsonb_path_query_first. After considering this,
> what patch will look like in your approach? I guess you can see many
> duplications. Less code doesn't always mean better, but I think this
> still be a key consideration to address/check.
>

Yes, I agree the current patch is still missing a large part of the full
matrix from v18 — not only int4/int8/float8, but also other extractor
families such as jsonb_array_element, jsonb_extract_path, jsonb_path_query,
and jsonb_path_query_first.

My intent with this version was not to cover the whole space at once, but
to first reduce it to a smaller stage-1 subset that is easier to review and
easier to reason about. In particular, I wanted to validate the simpler
rewrite shape first: keep the support-function-based approach, but rewrite
directly to explicit typed extractor functions, without the previous
start/finish/internal pipeline.

I agree that if this approach is extended to the full matrix naively,
duplication will become a real issue. So I am not claiming that the current
patch shape should simply be copied mechanically across every
extractor/type combination. My thought was to first see whether this
narrower subset is acceptable in principle, and then decide how best to
extend it without reintroducing too much abstraction or too much
duplication.

So for now I would prefer to keep the current scope intentionally narrow:

-

jsonb_object_field / -> / equivalent subscripting form
-

casts to numeric and bool

and treat the rest as follow-up work, rather than trying to solve the
entire matrix in one patch.

>
> > I also ran a small microbenchmark to isolate the cast-over-object-field
> path. On my setup, the current patch shows the following gains:
> > Query Before After
> Speedup
> >
> --------------------------------------------------------------------------------
> > SELECT sum((j->'n')::numeric) FROM t 118.028 ms 56.082 ms
> 2.10x
> > SELECT count(*) FROM t WHERE (j->'b')::bool 115.665 ms 51.945 ms
> 2.23x
> >
> --------------------------------------------------------------------------------
>
> Thanks for running the test.
>
> [1] https://www.postgresql.org/message-id/87ttk0lgcx.fsf%40163.com
>
> --
> Best Regards
> Andy Fan
>
Thanks,
Haibo

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2026-04-07 08:40:25 Re: Adding REPACK [concurrently]
Previous Message Ilya Cherdakov 2026-04-07 08:31:03 Re: Environment variable to disable diffs file output