| From: | Shaheed Haque <shaheedhaque(at)gmail(dot)com> | 
|---|---|
| To: | pgsql-general(at)lists(dot)postgresql(dot)org | 
| Subject: | JSONB index not in use, but is TOAST the real cause of slow query? | 
| Date: | 2022-05-28 17:53:48 | 
| Message-ID: | CAHAc2jf6Yp2uddfKQyGp=byvOuUzYpsHoPjyKXnUhS5+=Acuww@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general | 
Hi,
I have a database table with a modest number of rows (<1000) but where
one column in the table is a JSONB "snapshot" which can be a few MB in
size. Generally, this is a great fit for the read-write access
patterns involved, but there is a read-query which is VERY slow.
I've searched via Google and in this mailing list archive and found
lots of material, but nothing that explains what is going on...or how
to fix it. The table columns look like this:
    ...a primary key...
    ...various other keys including a couple of FKs...
    snapshot JSONB
and before I did anything, the indices looked like this:
   "paiyroll_payrun_pkey" PRIMARY KEY, btree (process_ptr_id)
   "paiyroll_payrun_company_id_ce341888" btree (company_id)
   "paiyroll_payrun_schedule_id_1593f55f" btree (schedule_id)
The data in one row's "snapshot" looks a bit like this:
{
    "stuff": {},
    "more other stuff": {},
    "employee": {
         "1234": {},
         "56789": {},
    }
}
The query that is slow can be approximated like this:
   SELECT snapshot ->'stuff'->'item' FROM paiyroll_payrun WHERE
snapshot ->'employee' ? '2209';
When I add this index:
CREATE INDEX idx1 ON paiyroll_payrun USING gin ((snapshot ->'employee'));
the analyser says this:
======================
foo =# explain analyse SELECT snapshot ->'company'->'legal_name' FROM
paiyroll_payrun WHERE snapshot ->'employee' ? '2209';
                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Seq Scan on paiyroll_payrun  (cost=0.00..29.13 rows=9 width=32)
(actual time=50.185..2520.983 rows=104 loops=1)
  Filter: ((snapshot -> 'employee'::text) ? '2209'::text)
  Rows Removed by Filter: 835
Planning Time: 0.075 ms
Execution Time: 2521.004 ms
(5 rows)
======================
So, over 2.5 seconds to sequentially scan ~850 rows. Am I right to
presume the INDEX is not used because of the number of rows? Is there
a way to verify that? And how can I understand the dreadful amount of
time (of course, this is just on my dev machine, but still...)? Is
there a way to see/tweak what TOAST costs or indeed to confirm if it
is even in use?
Any help appreciated.
Thanks, Shaheed
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2022-05-28 18:41:56 | Re: JSONB index not in use, but is TOAST the real cause of slow query? | 
| Previous Message | 浩辰 何 | 2022-05-28 12:11:57 | Improve configurability for IO related behavoir |