Re: Help with Query Tuning

From: Adarsh Sharma <adarsh(dot)sharma(at)orkash(dot)com>
To: Kenneth Marshall <ktm(at)rice(dot)edu>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Help with Query Tuning
Date: 2011-03-17 04:48:36
Message-ID: 4D8192A4.6080101@orkash.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks Marshall, would I need to change the data type of *content
*column to tsvector and create a Gist Index on it.

Best Regards,
Adarsh

Kenneth Marshall wrote:
> On Wed, Mar 16, 2011 at 02:43:38PM +0530, Adarsh Sharma wrote:
>
>> Dear all,
>>
>> I am facing a problem while creating the index to make the below query run
>> faster. My table size is near about 1065 MB and 428467 rows.
>>
>> explain analyze select count(*) from page_content where publishing_date
>> like '%2010%' and content_language='en' and content is not null and
>> isprocessable = 1 and (content like '%Militant%'
>> OR content like '%jihad%' OR content like '%Mujahid%' OR
>> content like '%fedayeen%' OR content like '%insurgent%' OR content like
>> '%terrorist%' OR
>> content like '%cadre%' OR content like '%civilians%' OR content like
>> '%police%' OR content like '%defence%' OR content like '%cops%' OR content
>> like '%crpf%' OR content like '%dsf%' OR content like '%ssb%') AND (content
>> like '%kill%' or content like '%injure%');
>>
>> *Output:
>>
>> * Aggregate (cost=107557.78..107557.79 rows=1 width=0) (actual
>> time=18564.631..18564.631 rows=1 loops=1)
>> -> Seq Scan on page_content (cost=0.00..107466.82 rows=36381 width=0)
>> (actual time=0.146..18529.371 rows=59918 loops=1)
>> Filter: ((content IS NOT NULL) AND (publishing_date ~~
>> '%2010%'::text) AND (content_language = 'en'::bpchar) AND (isprocessable =
>> 1) AND (((content)
>> ::text ~~ '%kill%'::text) OR ((content)::text ~~ '%injure%'::text)) AND
>> (((content)::text ~~ '%Militant%'::text) OR ((content)::text ~~
>> '%jihad%'::text) OR (
>> (content)::text ~~ '%Mujahid%'::text) OR ((content)::text ~~
>> '%fedayeen%'::text) OR ((content)::text ~~ '%insurgent%'::text) OR
>> ((content)::text ~~ '%terrori
>> st%'::text) OR ((content)::text ~~ '%cadre%'::text) OR ((content)::text ~~
>> '%civilians%'::text) OR ((content)::text ~~ '%police%'::text) OR
>> ((content)::text
>> ~~ '%defence%'::text) OR ((content)::text ~~ '%cops%'::text) OR
>> ((content)::text ~~ '%crpf%'::text) OR ((content)::text ~~ '%dsf%'::text)
>> OR ((content)::text
>> ~~ '%ssb%'::text)))
>> Total runtime: 18564.673 ms
>>
>>
>> *Index on that Table :
>>
>> *CREATE INDEX idx_page_id
>> ON page_content
>> USING btree
>> (crawled_page_id);
>>
>> *Index I create :*
>> CREATE INDEX idx_page_id_content
>> ON page_content
>> USING btree
>> (crawled_page_id,content_language,publishing_date,isprocessable);
>>
>> *Index that fail to create:
>>
>> *CREATE INDEX idx_page_id_content1
>> ON page_content
>> USING btree
>> (crawled_page_id,content);
>>
>> Error :-ERROR: index row requires 13240 bytes, maximum size is 8191
>> ********** Error **********
>>
>> ERROR: index row requires 13240 bytes, maximum size is 8191
>> SQL state: 54000
>>
>> How to resolve this error
>> Please give any suggestion to tune the query.
>>
>> Thanks & best Regards,
>>
>> Adarsh Sharma
>>
>>
>
> You should probably be looking at using full-text indexing:
>
> http://www.postgresql.org/docs/9.0/static/textsearch.html
>
> or limit the size of content for the index.
>
> Cheers,
> Ken
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Adarsh Sharma 2011-03-17 06:25:21 Re: Help with Query Tuning
Previous Message Fujii Masao 2011-03-17 03:09:20 Re: pg_xlog size