Re: Extract numeric filed in JSONB more effectively

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: Chapman Flack <chap(at)anastigmatix(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Extract numeric filed in JSONB more effectively
Date: 2024-01-08 00:00:00
Message-ID: CACJufxH-6DkAGr0GRFSS7BAwKG7uVFceXtxHLM7Ub7oGagQGKg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jan 7, 2024 at 3:26 PM Andy Fan <zhihuifan1213(at)163(dot)com> wrote:
>
>
> Hi,
>
> > hi.
> > you don't need to change src/include/catalog/catversion.h
> > as mentioned in https://wiki.postgresql.org/wiki/Committing_checklist
> > Otherwise, cfbot will fail many times.
>
> Thanks for the wiki.
>
> I checked the wiki and search "catversion", the only message I got is:
>
> "Consider the need for a catversion bump."
>
> How could this be explained as "no need to change ../catversion.h"?

that means catversion.h changes is the committer's responsibility, IMHO.

IMHO, main reason is every time the catversion.h change, cfbot
http://cfbot.cputube.org will fail.
one patch took very long time to be committable.
you don't need update your patch for the every catversion.h changes.

> >
> > +typedef enum JsonbValueTarget
> > +{
> > + JsonbValue_AsJsonbValue,
> > + JsonbValue_AsJsonb,
> > + JsonbValue_AsText
> > +} JsonbValueTarget;
> >
> > change to
> >
> > +typedef enum JsonbValueTarget
> > +{
> > + JsonbValue_AsJsonbValue,
> > + JsonbValue_AsJsonb,
> > + JsonbValue_AsText,
> > +} JsonbValueTarget;
> >

reason: https://git.postgresql.org/cgit/postgresql.git/commit/?id=611806cd726fc92989ac918eac48fd8d684869c7

> > currently cannot do `git apply`.
>
> OK, I guess it's something about whitespaces, my git-commit hook has
> been configured to capture this during commit. After we reach an
> agreement about the 'catversion.h' stuff, the next version of patch
> should fix this issue.

Anyway, I made the following change:
remove catversion.h changes.
refactored the tests. Some of the explain(costs off, verbose) output
is very very long.
it's unreadable on the web browser. so I cut them into small pieces.
resolve duplicate OID issues.
slight refactored jsonbvalue_covert function, for the switch
statement, add a default branch.
see file v16-0001-Improve-the-performance-of-Jsonb-extraction.patch

you made a lot of changes, that might not be easy to get committed, i think.
Maybe we can split the patch into several pieces.
The first part is the original idea that: pattern: (jsonb(object) ->
'key')::numerica_data_type can be optimized.
The second part: is other cases where cast jsonb to scalar data type
can also be optimized.

So, I refactor your patch. only have optimized casts for:
(jsonb(object) -> 'key')::numerica_data_type.
We can optimize more cast cases, but IMHO,
make it as minimal as possible, easier to review, easier to understand.
If people think this performance gain is good, then later we can add
more on top of it.

summary: 2 files attached.
v16-0001-Improve-the-performance-of-Jsonb-extraction.patch
refactored of your patch, that covers all the cast optimization cases,
this file will run the CI test.

v1-0001-Improve-performance-of-Jsonb-extract-via-key-and-c.no-cfbot
this one also based on your patch. but as a minimum patch to optimize
(jsonb(object) -> 'key')::numerica_data_type case only. (this one will
not run CI test).

Attachment Content-Type Size
v16-0001-Improve-the-performance-of-Jsonb-extraction.patch text/x-patch 42.5 KB
v1-0001-Improve-performance-of-Jsonb-extract-via-key-and-c.no-cfbot application/octet-stream 20.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-01-08 00:16:19 Re: Add a perl function in Cluster.pm to generate WAL
Previous Message Alexander Cheshev 2024-01-07 23:49:11 Re: Multidimensional Histograms