Re: Help with Query Tuning

From: Adarsh Sharma <adarsh(dot)sharma(at)orkash(dot)com>
To: tv(at)fuzzy(dot)cz
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Help with Query Tuning
Date: 2011-03-18 04:17:38
Message-ID: 4D82DCE2.5020902@orkash.com
Lists: pgsql-performance

Thanks, it works now. :-)

Here is the output:

pdc_uima=# SELECT count(*) from page_content WHERE publishing_date like
'%2010%' and
pdc_uima-# content_language='en' and content is not null and
isprocessable = 1 and
pdc_uima-# to_tsvector('english',content) @@
to_tsquery('english','Mujahid' || ' | '
pdc_uima(# || 'jihad' || ' | ' || 'Militant' || ' | ' || 'fedayeen' || ' | '
pdc_uima(# || 'insurgent' || ' | ' || 'terrORist' || ' | ' || 'cadre' ||
' | '
pdc_uima(# || 'civilians' || ' | ' || 'police' || ' | ' || 'cops' ||
'crpf' || ' | '
pdc_uima(# || 'defence' || ' | ' || 'dsf' || ' | ' || 'ssb' );

count
--------
137193
(1 row)

Time: 195441.894 ms

But my original query also uses an AND condition, i.e.:

select count(*) from page_content where publishing_date like '%2010%'
and content_language='en' and content is not null and isprocessable = 1
and (content like '%Militant%'
OR content like '%jihad%' OR content like '%Mujahid%' OR
content like '%fedayeen%' OR content like '%insurgent%' OR content
like '%terrORist%' OR
content like '%cadre%' OR content like '%civilians%' OR content like
'%police%' OR content like '%defence%' OR content like '%cops%' OR
content like '%crpf%' OR content like '%dsf%' OR content like '%ssb%')
AND (content like '%kill%' OR content like '%injure%');

count
-------
57061
(1 row)

Time: 19423.087 ms

Now I also have to add the AND condition (AND (content like '%kill%' OR
content like '%injure%')) to the full-text query.
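
One way to express this with full-text search is to AND two parenthesised
OR groups inside a single tsquery, since to_tsquery accepts '&', '|' and
parentheses. The sketch below is only an illustration: it uses Python rather
than the PHP mentioned in this thread, and the helper name build_tsquery is
hypothetical. It only builds the query text, and it assumes the words are
plain tokens with no tsquery operator characters in them.

```python
def build_tsquery(groups):
    """Join the words of each group with ' | ' and the groups with ' & '.

    Hypothetical helper: 'groups' is a list of word lists. Empty words and
    empty groups are skipped so that no dangling operator is produced.
    """
    parts = []
    for words in groups:
        cleaned = [w.strip() for w in words if w.strip()]
        if cleaned:
            parts.append("(" + " | ".join(cleaned) + ")")
    return " & ".join(parts)


# Word lists taken from the original LIKE query.
actors = [
    "Mujahid", "jihad", "Militant", "fedayeen", "insurgent", "terrORist",
    "cadre", "civilians", "police", "defence", "cops", "crpf", "dsf", "ssb",
]
actions = ["kill", "injure"]

tsquery_text = build_tsquery([actors, actions])
print(tsquery_text)
```

The resulting text would replace the long concatenation as the second
argument of to_tsquery('english', ...), ideally passed as a bound parameter.
Counts may differ from the LIKE version, because full-text search matches
stemmed words rather than arbitrary substrings.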

Thanks & Regards,
Adarsh Sharma

tv(at)fuzzy(dot)cz wrote:
>> tv(at)fuzzy(dot)cz wrote:
>>
>>>> Yes, I think we caught the problem, but it results in the error below:
>>>>
>>>> SELECT count(*) from page_content
>>>> WHERE publishing_date like '%2010%' and content_language='en' and
>>>> content is not null and isprocessable = 1 and
>>>> to_tsvector('english',content) @@ to_tsquery('english','Mujahid ' ||
>>>> 'jihad ' || 'Militant ' || 'fedayeen ' || 'insurgent ' || 'terrORist '
>>>> || 'cadre ' || 'civilians ' || 'police ' || 'defence ' || 'cops ' ||
>>>> 'crpf ' || 'dsf ' || 'ssb');
>>>>
>>>> ERROR: syntax error in tsquery: "Mujahid jihad Militant fedayeen
>>>> insurgent terrORist cadre civilians police defence cops crpf dsf ssb"
>>>>
>>>>
>>> The text passed to to_tsquery has to be a proper query, i.e. single
>>> tokens separated by boolean operators. In your case, you should put
>>> there '|' (which means OR) to get something like this
>>>
>>> 'Mujahid | jihad | Militant | ...'
>>>
>>> or you can use plainto_tsquery() as that accepts simple text, but it
>>> puts '&' (AND) between the tokens and I guess that's not what you want.
>>>
>>> Tomas
>>>
>>>
>>>
>> What should I do to make it satisfy the OR condition, i.e. match any of
>> the to_tsquery values, as we got it right with like '%Mujahid%' or .....
>> or ....
>>
>
> You can't force plainto_tsquery to use OR instead of AND.
> You need to modify the piece of code that produces the search text so that
> it puts '|' characters between the words. So do something like this:
>
> SELECT count(*) from page_content WHERE publishing_date like '%2010%' and
> content_language='en' and content is not null and isprocessable = 1 and
> to_tsvector('english',content) @@ to_tsquery('english','Mujahid' || ' | '
> || 'jihad' || ' | ' || 'Militant' || ' | ' || 'fedayeen');
>
> Not sure where this text comes from, but you can do this in a higher
> level language, e.g. in PHP. Something like this:
>
> $words = implode(' | ', explode(' ',$text));
>
> and then pass the $words into the query. Or something like that.
>
> Tomas
>
>
