Re: Is it possible to index "deep" into a JSONB column?

From: Shaheed Haque <shaheedhaque(at)gmail(dot)com>
To: Imre Samu <pella(dot)samu(at)gmail(dot)com>
Cc: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Is it possible to index "deep" into a JSONB column?
Date: 2022-05-30 18:59:54
Message-ID: CAHAc2jdnrMe0N6hTjM2tzWR9C2m4MiqpdqJ0P36Bjm-6WCAxtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Imre,

I'm gradually working my way into the combination of SQL, JSONB and
jsonpath that this involves even without the indexing, but this looks
very helpful/promising, especially on the indexing. Thanks a lot for
the input,

Shaheed

On Mon, 30 May 2022 at 11:50, Imre Samu <pella(dot)samu(at)gmail(dot)com> wrote:
>
> > In principle, I believe this allows index-assisted access to keys and
> > values nested in arrays and inner objects but in practice, it seems
> > the planner "often" decides to ignore the index in favour of a table scan.
>
> part II. index usage ; see the "Bitmap Index Scan on jpqarr_idx"
>
> SET enable_seqscan = OFF;
> select * from jsonb_table
> where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @> to_jsonb('1971-02-02'::TEXT);
> ;
> +----+---------------------------------------------------------------------------------------------------------------+
> | id | jsonb_col |
> +----+---------------------------------------------------------------------------------------------------------------+
> | 2 | {"stuff": {}, "employee": {"7011": {"date_of_birth": "1970-01-01"}, "7122": {"date_of_birth": "1971-02-02"}}} |
> +----+---------------------------------------------------------------------------------------------------------------+
> (1 row)
>
>
> EXPLAIN ANALYZE select * from jsonb_table
> where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @> to_jsonb('1971-02-02'::TEXT);
> ;
> +---------------------------------------------------------------------------------------------------------------------------------------------------------+
> | QUERY PLAN |
> +---------------------------------------------------------------------------------------------------------------------------------------------------------+
> | Bitmap Heap Scan on jsonb_table (cost=3.00..4.52 rows=1 width=36) (actual time=0.056..0.059 rows=1 loops=1) |
> | Recheck Cond: (jsonb_path_query_array(jsonb_col, '$."employee".*."date_of_birth"'::jsonpath, '{}'::jsonb, false) @> to_jsonb('1971-02-02'::text)) |
> | Heap Blocks: exact=1 |
> | -> Bitmap Index Scan on jpqarr_idx (cost=0.00..3.00 rows=1 width=0) (actual time=0.026..0.027 rows=1 loops=1) |
> | Index Cond: (jsonb_path_query_array(jsonb_col, '$."employee".*."date_of_birth"'::jsonpath, '{}'::jsonb, false) @> to_jsonb('1971-02-02'::text)) |
> | Planning Time: 0.255 ms |
> | Execution Time: 0.122 ms |
> +---------------------------------------------------------------------------------------------------------------------------------------------------------+
> (7 rows)
>
> regards,
> Imre
>
>
> Imre Samu <pella(dot)samu(at)gmail(dot)com> ezt írta (időpont: 2022. máj. 30., H, 12:30):
>>
>> Hi Shaheed,
>>
>> > WHAT GOES HERE
>>
>> imho check the: jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' )
>>
>> may example:
>>
>> CREATE TABLE jsonb_table (
>> id serial primary key,
>> jsonb_col JSONB
>> );
>>
>> INSERT INTO jsonb_table(jsonb_col)
>> VALUES
>> ('{"stuff": {},"employee": {"8011": {"date_of_birth": "1980-01-01"},"8222": {"date_of_birth": "1982-02-02"}}}'),
>> ('{"stuff": {},"employee": {"7011": {"date_of_birth": "1970-01-01"},"7122": {"date_of_birth": "1971-02-02"}}}'),
>> ('{"stuff": {},"employee": {"a12": {"date_of_birth": "2000-01-01"},"b56": {"date_of_birth": "2000-02-02"}}}')
>> ;
>>
>> select jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' ) from jsonb_table;
>>
>> -- create index
>> create index jpqarr_idx
>> on jsonb_table
>> using gin ( jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' ) );
>> -- tests:
>> select id from jsonb_table
>> where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @> to_jsonb('2000-01-01'::TEXT);
>> ;
>> DROP TABLE
>> CREATE TABLE
>> INSERT 0 3
>> +------------------------------+
>> | jsonb_path_query_array |
>> +------------------------------+
>> | ["1980-01-01", "1982-02-02"] |
>> | ["1970-01-01", "1971-02-02"] |
>> | ["2000-01-01", "2000-02-02"] |
>> +------------------------------+
>> (3 rows)
>>
>> CREATE INDEX
>> +----+
>> | id |
>> +----+
>> | 3 |
>> +----+
>> (1 row)
>>
>>
>> Regards,
>> Imre
>>
>> Shaheed Haque <shaheedhaque(at)gmail(dot)com> ezt írta (időpont: 2022. máj. 29., V, 22:53):
>>>
>>> Suppose I have a JSONB field called "snapshot". I can create a GIN
>>> index on it like this:
>>>
>>> create index idx1 on mytable using gin (snapshot);
>>>
>>> In principle, I believe this allows index-assisted access to keys and
>>> values nested in arrays and inner objects but in practice, it seems
>>> the planner "often" decides to ignore the index in favour of a table
>>> scan. (As discussed elsewhere, this is influenced by the number of
>>> rows, and possibly other criteria too).
>>>
>>> Now, I know it is possible to index inner objects, so that is snapshot
>>> looks like this:
>>>
>>> {
>>> "stuff": {},
>>> "more other stuff": {},
>>> "employee": {
>>> "1234": {"date_of_birth": "1970-01-01"},
>>> "56B789": {"date_of_birth": "1971-02-02"},
>>> }
>>> }
>>>
>>> I can say:
>>>
>>> create index idx2 on mytable using gin ((snapshot -> 'employee'));
>>>
>>> But what is the syntax to index only on date_of_birth? I assume a
>>> btree would work since it is a primitive value, but WHAT GOES HERE in
>>> this:
>>>
>>> create index idx3 on mytable using btree ((snapshot ->'employee' ->
>>> WHAT GOES HERE -> 'date_of_birth'));
>>>
>>> I believe an asterisk "*" would work if 'employee' was an array, but
>>> here it is nested object with keys. If it helps, the keys are
>>> invariably numbers (in quoted string form, as per JSON).
>>>
>>> Thanks, Shaheed
>>>
>>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jeff Janes 2022-05-30 19:29:31 Re: JSONB index not in use, but is TOAST the real cause of slow query?
Previous Message Fabien COELHO 2022-05-30 17:53:33 Re: psql 15beta1 does not print notices on the console until transaction completes