Re: index on ILIKE/LIKE - PostgreSQL 9.2

From: Melvin Davidson <melvin6925(at)gmail(dot)com>
To: Lucas Possamai <drum(dot)lucas(at)gmail(dot)com>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: index on ILIKE/LIKE - PostgreSQL 9.2
Date: 2016-05-12 01:30:18
Message-ID: CANu8FizFEiaF=W_0NrO1u3LFJadaL+_1MZ3XFbqsMZiCoooccA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, May 11, 2016 at 8:30 PM, Lucas Possamai <drum(dot)lucas(at)gmail(dot)com>
wrote:

> Hi there!
>
> I've got a simple but slow query:
>
> SELECT DISTINCT title
>> FROM ja_jobs WHERE title ILIKE '%RYAN WER%'
>> and clientid = 31239 AND time_job > 1457826264
>> order BY title
>> limit 10
>
>
> Explain analyze:
>
> Limit (cost=5946.40..5946.41 rows=1 width=19) (actual
>> time=2746.759..2746.772 rows=1 loops=1)
>> -> Unique (cost=5946.40..5946.41 rows=1 width=19) (actual
>> time=2746.753..2746.763 rows=1 loops=1)
>> -> Sort (cost=5946.40..5946.41 rows=1 width=19) (actual
>> time=2746.750..2746.754 rows=4 loops=1)
>> Sort Key: "title"
>> Sort Method: quicksort Memory: 25kB
>> -> Bitmap Heap Scan on "ja_jobs" (cost=49.02..5946.39
>> rows=1 width=19) (actual time=576.275..2746.609 rows=4 loops=1)
>> Recheck Cond: (("clientid" = 31239) AND ("time_job" >
>> 1457826264))
>> Filter: (("title")::"text" ~~* '%RYAN WER%'::"text")
>> Rows Removed by Filter: 791
>> -> Bitmap Index Scan on "ix_jobs_client_times"
>> (cost=0.00..49.02 rows=1546 width=0) (actual time=100.870..100.870
>> rows=795 loops=1)
>> Index Cond: (("clientid" = 31239) AND
>> ("time_job" > 1457826264))
>> Total runtime: 2746.879 ms
>
>
> Then, I created a trgm index:
>
> CREATE INDEX ix_ja_jobs_trgm_gin ON public.ja_jobs USING gin (title
> gin_trgm_ops);
>
> Explain analyze after the index: (Yes, I ran the analyze)
>
> Limit (cost=389.91..389.91 rows=1 width=20) (actual
>> time=3720.511..3720.511 rows=0 loops=1)
>> -> Unique (cost=389.91..389.91 rows=1 width=20) (actual
>> time=3720.507..3720.507 rows=0 loops=1)
>> -> Sort (cost=389.91..389.91 rows=1 width=20) (actual
>> time=3720.505..3720.505 rows=0 loops=1)
>> Sort Key: "title"
>> Sort Method: quicksort Memory: 25kB
>> -> Bitmap Heap Scan on "ja_jobs" (cost=385.88..389.90
>> rows=1 width=20) (actual time=3720.497..3720.497 rows=0 loops=1)
>> Recheck Cond: (("clientid" = 31239) AND ("time_job" >
>> 1457826264) AND (("title")::"text" ~~ '%RYAN WER%'::"text"))
>> Rows Removed by Index Recheck: 4
>> -> BitmapAnd (cost=385.88..385.88 rows=1 width=0)
>> (actual time=3720.469..3720.469 rows=0 loops=1)
>> -> Bitmap Index Scan on "ix_jobs_client_times"
>> (cost=0.00..50.00 rows=1644 width=0) (actual time=0.142..0.142 rows=795
>> loops=1)
>> Index Cond: (("clientid" = 31239) AND
>> ("time_job" > 1457826264))
>> -> Bitmap Index Scan on "ix_ja_jobs_trgm_gin"
>> (cost=0.00..335.63 rows=484 width=0) (actual time=3720.213..3720.213
>> rows=32 loops=1)
>> Index Cond: (("title")::"text" ~~ '%RYAN
>> WER%'::"text")
>> Total runtime: 3720.653 ms
>
>
>
> so.. the query is still slow..
> Do you guys know what can be done ? related to the ILIKE?
>
> cheers
> Lucas
>

The main problem is WHERE title ILIKE '%RYAN WER%'
When you put a % on the left of the text, there is no way to optimize that,
so yes, it will be slow.

If you can eliminate the leading percent and just have trailing, it will be
much faster.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Melvin Davidson 2016-05-12 01:33:32 Re: Scaling Database for heavy load
Previous Message Scott Marlowe 2016-05-12 00:52:16 Re: Scaling Database for heavy load