Re: querying with index on jsonb slower than standard column. Why?

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tim Dudgeon <tdudgeon(dot)ml(at)gmail(dot)com>, pgsql-sql(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: querying with index on jsonb slower than standard column. Why?
Date: 2014-12-08 21:14:48
Message-ID: 548614C8.90203@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance pgsql-sql

On 12/08/2014 12:53 PM, Tom Lane wrote:
> I wrote:
>> Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> writes:
>>> Seems work_mem is the key:
>
>> Fascinating. So there's some bad behavior in the lossy-bitmap stuff
>> that's exposed by one case but not the other.
>
> Meh. I was overthinking it. A bit of investigation with oprofile exposed
> the true cause of the problem: whenever the bitmap goes lossy, we have to
> execute the "recheck" condition for each tuple in the page(s) that the
> bitmap has a lossy reference to. So in the fast case we are talking about
>
> Recheck Cond: ((assay1_ic50 > 90::double precision) AND (assay2_ic50 < 10::double precision))
>
> which involves little except pulling the float8 values out of the tuple
> and executing float8gt and float8lt. In the slow case we have got
>
> Recheck Cond: ((((data ->> 'assay1_ic50'::text))::double precision > 90::double precision) AND (((data ->> 'assay2_ic50'::text))::double precision < 10::double precision))
>
> which means we have to pull the JSONB value out of the tuple, search
> it to find the 'assay1_ic50' key, convert the associated value to text
> (which is not exactly cheap because *the value is stored as a numeric*),
> then reparse that text string into a float8, after which we can use
> float8gt. And then probably do an equivalent amount of work on the way
> to making the other comparison.
>
> So this says nothing much about the lossy-bitmap code, and a lot about
> how the JSONB code isn't very well optimized yet. In particular, the
> decision not to provide an operator that could extract a numeric field
> without conversion to text is looking pretty bad here.
>

I think I understand the above.

I redid the test on my 32-bit machine, setting work_mem=16MB, and I got comparable results
to what I saw on the 64-bit machine. So, what I am still am puzzled by is why work_mem seems
to make the two paths equivalent in time?:

Fast case, assay1_ic50 > 90 and assay2_ic50 < 10:
1183.997 ms

Slow case, (data->>'assay1_ic50')::float > 90 and (data->>'assay2_ic50')::float < 10;:
1190.187 ms

>
> regards, tom lane
>
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2014-12-08 21:22:56 Re: [SQL] querying with index on jsonb slower than standard column. Why?
Previous Message Tom Lane 2014-12-08 20:53:09 Re: [SQL] querying with index on jsonb slower than standard column. Why?

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2014-12-08 21:22:56 Re: [SQL] querying with index on jsonb slower than standard column. Why?
Previous Message Tom Lane 2014-12-08 20:53:09 Re: [SQL] querying with index on jsonb slower than standard column. Why?