Re: Help with Query Tuning

From: Reid Thompson <Reid(dot)Thompson(at)ateb(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Help with Query Tuning
Date: 2011-03-16 17:24:30
Message-ID: 4D80F24E.1090807@ateb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 03/16/2011 05:13 AM, Adarsh Sharma wrote:
> Dear all,
>
> I am facing a problem while creating the index to make the below query run faster. My table size is near about 1065 MB and 428467 rows.
>
> explain analyze select count(*) from page_content where publishing_date like '%2010%' and content_language='en' and content is not
> null and isprocessable = 1 and (content like '%Militant%'
> OR content like '%jihad%' OR content like '%Mujahid%' OR
> content like '%fedayeen%' OR content like '%insurgent%' OR content like '%terrorist%' OR
> content like '%cadre%' OR content like '%civilians%' OR content like '%police%' OR content like '%defence%' OR content like '%cops%'
> OR content like '%crpf%' OR content like '%dsf%' OR content like '%ssb%') AND (content like '%kill%' or content like '%injure%');
>
> *Output:
>
> * Aggregate (cost=107557.78..107557.79 rows=1 width=0) (actual time=18564.631..18564.631 rows=1 loops=1)
> -> Seq Scan on page_content (cost=0.00..107466.82 rows=36381 width=0) (actual time=0.146..18529.371 rows=59918 loops=1)
> Filter: ((content IS NOT NULL) AND (publishing_date ~~ '%2010%'::text) AND (content_language = 'en'::bpchar) AND (isprocessable = 1)
> AND (((content)
> ::text ~~ '%kill%'::text) OR ((content)::text ~~ '%injure%'::text)) AND (((content)::text ~~ '%Militant%'::text) OR ((content)::text
> ~~ '%jihad%'::text) OR (
> (content)::text ~~ '%Mujahid%'::text) OR ((content)::text ~~ '%fedayeen%'::text) OR ((content)::text ~~ '%insurgent%'::text) OR
> ((content)::text ~~ '%terrori
> st%'::text) OR ((content)::text ~~ '%cadre%'::text) OR ((content)::text ~~ '%civilians%'::text) OR ((content)::text ~~
> '%police%'::text) OR ((content)::text
> ~~ '%defence%'::text) OR ((content)::text ~~ '%cops%'::text) OR ((content)::text ~~ '%crpf%'::text) OR ((content)::text ~~
> '%dsf%'::text) OR ((content)::text
> ~~ '%ssb%'::text)))
> Total runtime: 18564.673 ms
>

You should read the documentation regarding indices and pattern matching as well as fts.

http://www.postgresql.org/docs/8.3/static/indexes-types.html

The optimizer can also use a B-tree index for queries involving the pattern matching operators LIKE and ~ if the pattern is a
constant and is anchored to the beginning of the string — for example, col LIKE 'foo%' or col ~ '^foo', but not col LIKE '%bar'.
However, if your server does not use the C locale you will need to create the index with a special operator class to support
indexing of pattern-matching queries. See Section 11.9 below. It is also possible to use B-tree indexes for ILIKE and ~*, but only
if the pattern starts with non-alphabetic characters, i.e. characters that are not affected by upper/lower case conversion.

I believe that your query as written using '%pattern%' will always be forced to use sequential scans.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Euler Taveira de Oliveira 2011-03-16 17:27:36 Re: pg_xlog size
Previous Message Shaun Thomas 2011-03-16 17:05:06 Re: Adding additional index causes 20,000x slowdown for certain select queries - postgres 9.0.3