From: | Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | index on jsonb col with 2D array inside the json |
Date: | 2019-02-19 15:59:01 |
Message-ID: | CA+t6e1nFWy5n7B5RP3Ri=UFoZuV5tRiD20ozubMaDf_3PCbtuA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi,
I have a table with json col : R(object int, data jsonb).
Example for content :
object | data
----------------+---------------------------------------
50 | {"ranges": [[1, 1]]}
51 | {"ranges": [[5, 700],[1,5],[9,10}
52 | {"ranges": [[4, 200],[2,4],[3,4]]}
53 | {"ranges": [[2, 2]]}
54 | {"ranges": [[5, 10]]}
Now I tried to query for all the objects that contains a specific range,
for example [2,2] :
explain analyze SELECT *
FROM R d
WHERE EXISTS (
SELECT FROM jsonb_array_elements(R.data -> 'ranges') rng
WHERE (rng->>0)::bigint <= 2 and (rng->>1)::bigint >= 2
);
I saw that the gin index isnt suitable for this type of comparison.
However, I saw that the gist index is suitable to handle ranges. Any idea
of I can implement a gist index here ?
In addition, I saved the same data in relational table
R2(object,range_first,range_last).
The previous data in this format :
object range_first range_last
50 1 1
51 5 700
51 1 5
51 9 10
i compared the first query with :
explain analyze select * from R2 where range_first <=2 and range_last
>= 2; (I have an index on range_first,range_last that is used)
The query on the jsonb column was 100x slower (700 m/s vs 7m/s). The
question is, Am I missing an index or the jsonb datatype isnt suitable for
this structure of data. The R2 table contains 500K records while the R
table contains about 200K records.
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Lewis | 2019-02-19 16:28:25 | Re: index on jsonb col with 2D array inside the json |
Previous Message | Jung, Jinho | 2019-02-18 21:08:52 | Re: Performance regressions found using sqlfuzz |