Re: Extract numeric filed in JSONB more effectively

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Andy Fan <zhihuifan1213(at)163(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Chapman Flack <chap(at)anastigmatix(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Extract numeric filed in JSONB more effectively
Date: 2024-03-04 12:33:12
Message-ID: 8102ff5b-b156-409e-a48f-e53e63a39b36@eisentraut.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09.02.24 10:05, Andy Fan wrote:
> 2. Where is the current feature blocked for the past few months?
>
> It's error message compatible issue! Continue with above setup:
>
> master:
>
> select * from tb where (a->'b')::numeric > 3::numeric;
> ERROR: cannot cast jsonb string to type numeric
>
> select * from tb where (a->'b')::int4 > 3::numeric;
> ERROR: cannot cast jsonb string to type integer
>
> You can see the error message is different (numeric vs integer).
>
>
> Patched:
>
> We still can get the same error message as master BUT the code
> looks odd.
>
> select * from tb where (a->'b')::int4 > 3;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------
> Seq Scan on public.tb
> Output: a
> Filter: ((jsonb_finish_numeric(jsonb_object_field_start((tb.a)::internal, 'b'::text), '23'::oid))::integer > 3)
> (3 rows)
>
> You can see "jsonb_finish_numeric(.., '23::oid)" the '23::oid' is just
> for the *"integer"* output in error message:
>
> "cannot cast jsonb string to type*integer*"
>
> Now the sistuation is either we use the odd argument (23::oid) in
> jsonb_finish_numeric, or we use a incompatible error message with the
> previous version. I'm not sure which way is better, but this is the
> place the current feature is blocked.

I'm not bothered by that. It also happens on occasion in the backend C
code that we pass around extra information to be able to construct
better error messages. The functions here are not backend C code, but
they are internal functions, so similar considerations can apply.

But I have a different question about this patch set. This has some
overlap with the JSON_VALUE function that is being discussed at [0][1].
For example, if I apply the patch
v39-0001-Add-SQL-JSON-query-functions.patch from that thread, I can run

select count(*) from tb where json_value(a, '$.a' returning numeric) = 2;

and I get a noticeable performance boost over

select count(*) from tb where cast (a->'a' as numeric) = 2;

So some questions to think about:

1. Compare performance of base case vs. this patch vs. json_value.

2. Can json_value be optimized further?

3. Is this patch still needed?

3a. If yes, should the internal rewriting make use of json_value or
share code with it?

[0]:
https://www.postgresql.org/message-id/flat/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg(at)mail(dot)gmail(dot)com
[1]: https://commitfest.postgresql.org/47/4377/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2024-03-04 12:34:20 Re: Dump-restore loosing 'attnotnull' bit for DEFERRABLE PRIMARY KEY column(s).
Previous Message Aleksander Alekseev 2024-03-04 12:13:36 Re: PostgreSQL Contributors Updates