Re: Extract numeric filed in JSONB more effectively

From: zhihuifan1213(at)163(dot)com
To: Chapman Flack <chap(at)anastigmatix(dot)net>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Extract numeric filed in JSONB more effectively
Date: 2023-11-06 03:26:28
Message-ID: 87a5rry0bz.fsf@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Chapman Flack <chap(at)anastigmatix(dot)net> writes:

(This is Andy Fan and I just switch to my new email address).

Hi Chap,

Thanks for alway keep an eye on this!

> Adding this comment via the CF app so it isn't lost, while an
> improperly-interpreted-DKIM-headers issue is still preventing me from
> mailing directly to -hackers.
>
> It was my view that the patch was getting close by the end of the last
> commitfest, but still contained a bit of a logic wart made necessary by
> a questionable choice of error message wording, such that in my view it
> would be better to determine whether a different error message would
> better conform to ISO SQL in the first place, and obviate the need for
> the logic wart.
>
> There seemed to be some progress possible on that when petere had time
> to weigh in on the standard shortly after the last CF ended.
>
> So, it would not have been my choice to assign RfC status before
> getting to a resolution on that.

I agree with this.

>
> Also, it is possible for a JsonbValue to hold a timestamp (as a result
> of a jsonpath evaluation, I don't think that can happen any other
> way),

I believe this is where our disagreement lies.

CREATE TABLE employees (
id serial PRIMARY KEY,
data jsonb
);

INSERT INTO employees (data) VALUES (
'{
"employees":[
{
"firstName":"John",
"lastName":"Doe",
"hireDate":"2022-01-01T09:00:00Z",
"age": 30
},
{
"firstName":"Jane",
"lastName":"Smith",
"hireDate":"2022-02-01T10:00:00Z",
"age": 25
}
]
}'
);

select
jsonb_path_query_tz(data, '$.employees[*] ? (@.hireDate >=
"2022-02-01T00:00:00Z" && @.hireDate < "2022-03-01T00:00:00Z")')
from employees;

select jsonb_path_query_tz(data, '$.employees[*].hireDate ? (@ >=
"2022-02-01T00:00:00Z" && @ < "2022-03-01T00:00:00Z")') from employees;
select pg_typeof(jsonb_path_query_tz(data, '$.employees[*].hireDate ? (@
>= "2022-02-01T00:00:00Z" && @ < "2022-03-01T00:00:00Z")')) from
employees;

select jsonb_path_query_tz(data, '$.employees[*].hireDate ? (@
>= "2022-02-01T00:00:00Z" && @ < "2022-03-01T00:00:00Z")')::timestamp
from employees;
select jsonb_path_query_tz(data, '$.employees[*].hireDate ? (@
>= "2022-02-01T00:00:00Z" && @ < "2022-03-01T00:00:00Z")')::timestamptz
from employees;

I tried all of the above queires and can't find a place where this
optimization would apply. am I miss something?

> and if such a jsonpath evaluation were to be the source expression of a
> cast to SQL timestamp, that situation seems exactly analogous to the
> other situations being optimized here and would require only a few more
> lines in the exact pattern here introduced.

Could you provide an example of this?

> While that could be called
> out of scope when this patch's title refers to "numeric field"
> specifically, it might be worth considering for completeness. The patch
> does, after all, handle boolean already, as well as numeric.

I'd never arugment for this, at this point at least.

v15 is provides without any fundamental changes. Just rebase to the
lastest code and prepared a better commit message.

Attachment Content-Type Size
v15-0001-Improve-the-performance-of-Jsonb-extraction.patch text/x-diff 38.6 KB
unknown_filename text/plain 27 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-11-06 03:38:53 Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)
Previous Message Hayato Kuroda (Fujitsu) 2023-11-06 03:23:52 RE: Is this a problem in GenericXLogFinish()?