| From: | Haibo Yan <tristan(dot)yim(at)gmail(dot)com> |
|---|---|
| To: | Andy Fan <zhihuifan1213(at)163(dot)com> |
| Cc: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>, Chapman Flack <chap(at)anastigmatix(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Extract numeric filed in JSONB more effectively |
| Date: | 2026-04-08 19:50:50 |
| Message-ID: | CABXr29ETPyi1Tt+X29oUa6Q9AmT3j7RZSdsp12H_6EzGw7RqnQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Apr 7, 2026 at 6:21 PM Haibo Yan <tristan(dot)yim(at)gmail(dot)com> wrote:
> On Tue, Apr 7, 2026 at 5:00 PM Andy Fan <zhihuifan1213(at)163(dot)com> wrote:
>
>> Haibo Yan <tristan(dot)yim(at)gmail(dot)com> writes:
>>
>> Hi Haibo,
>>
>> > I agree that if this approach is extended to the full matrix naively,
>> > duplication will become a real issue.
>>
>> Could you summary how it would be? I think it would be helpful for
>> others to review. Otherwise every reviewer needs to count them many
>> times.
>>
>> --
>> Best Regards
>> Andy Fan
>>
> Hi Andy,
> Sure.
>
> My current thought is to extend it in stages, rather than trying to solve
> the full matrix in a single patch.
>
> A rough plan would be:
>
> 1. Keep the current stage-1 patch small and validate the basic approach
> first
>
>
> -
>
> jsonb_object_field / -> / equivalent subscripting form
> -
>
> casts to numeric and bool
> -
>
> support-function rewrite directly to explicit typed extractor functions
>
> 2. Extend target types before extending extractor families
>
>
> -
>
> add int4 / int8 / float8 for the same object-field family first
> -
>
> keep the SQL-visible rewrite targets explicit, e.g.
>
> -
>
> jsonb_object_field_int4
> -
>
> jsonb_object_field_int8
> -
>
> jsonb_object_field_float8
>
> -
>
> avoid the previous numeric-intermediate rewrite shape
>
> 3. Then extend to other extractor families with the same overall pattern
>
>
> -
>
> likely starting with jsonb_array_element and jsonb_extract_path
> -
>
> and possibly jsonb_path_query_first later
> -
>
> each family would still rewrite to explicit typed extractor entry
> points, e.g.
>
> -
>
> jsonb_array_element_numeric
> -
>
> jsonb_extract_path_bool
> -
>
> jsonb_path_query_first_int4
>
>
> 4. Keep duplication manageable by sharing the implementation underneath
>
>
> -
>
> keep the SQL/catalog-level rewrite targets explicit for readability
> and reviewability
> -
>
> but factor the C implementation into:
>
> -
>
> extractor-family lookup helpers
> -
>
> target-type conversion helpers
> -
>
> thin wrappers, possibly generated with small macros
>
> So the idea would be: explicit rewrite targets at the SQL/catalog level,
> but shared lookup/conversion code underneath, instead of going back to the
> earlier start/finish/internal pipeline.
>
> I agree that if this is extended naively across the full matrix,
> duplication will become a real issue. My reason for keeping the current
> patch narrow is that I wanted to first validate this simpler rewrite shape
> on a small subset before deciding how best to scale it further.
>
> Regards,
>
> Haibo
>
> Hi all,
Following up on our previous discussion, I want to clarify the current
patch plan together with the updated first patch.
Earlier I described this work roughly as a 4-patch line. After iterating on
the implementation and trying to keep each step reviewable, I now think the
cleaner split is a 5-patch series:
1.
object-field casts to scalar types
2.
array-element casts to scalar types
3.
extract-path casts to scalar types
4.
multi-subscript casts via extract-path lowering
5.
jsonpath-first casts to scalar types (jsonb_path_query_first and _tz)
The overall design is unchanged: use the cast function’s support hook to
recognize cast(extract(...)) over scalar-returning jsonb extraction
functions, and rewrite that directly to explicit typed extractor calls.
Supported target types remain:
- numeric
- bool
- int4
- int8
- float8
One point I also want to make explicit is that I do not plan to include
jsonb_path_query in this series.
After looking at it more carefully, I do not think it fits the same model
as the rest of the series. The patches here are all about scalar-returning
extraction functions, where the cast prosupport hook can see and rewrite a
scalar expression pair. jsonb_path_query is set-returning, so optimizing
casts over it would likely need a different mechanism, probably at planner
or executor level, rather than one more patch in this prosupport-based
series.
Attached here is the updated first patch in the current plan.
This patch covers object-field extraction only:
- jsonb_object_field
- -> with text key
- key subscripting
and rewrites casts to:
- numeric
- bool
- int4
- int8
- float8
to direct typed extractor calls.
Thanks again for the earlier comments. I plan to send the remaining patches
in follow-up emails in the order above.
Regards,
Haibo
| Attachment | Content-Type | Size |
|---|---|---|
| v4-0001-jsonb-optimize-object-field-casts-to-scalar-types.patch | application/octet-stream | 35.5 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Daniel Gustafsson | 2026-04-08 20:03:28 | Re: DOCS: Update data_checksums documentation to reflect enum change |
| Previous Message | Andres Freund | 2026-04-08 19:44:02 | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |