Re: Full Text index is not using during OR operation

From: AI Rumman <rummandba(at)gmail(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Full Text index is not using during OR operation
Date: 2010-11-29 14:03:24
Message-ID: AANLkTikDbHyix8NZzzueA3S3bLaiUnXNyCSpk38Lkvn8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Oh! Actualy it is:
select replace('Dhaka University of Bangladesh:*', ' ',':* & ');
No space at start.

On Mon, Nov 29, 2010 at 6:37 PM, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:

> What does replace(' Dhaka University of Bangladesh:*', ' ',':* & ') means ?
> I see it produces something wrong for to_tsquery:
>
> test=# select replace(' Dhaka University of Bangladesh:*', ' ',':* & ');
>
> replace
> ---------------------------------------------------
> :* & Dhaka:* & University:* & of:* & Bangladesh:*
> (1 row)
>
> Oleg
>
>
> On Mon, 29 Nov 2010, AI Rumman wrote:
>
> explain
>> SELECT crmentity.crmid, crmentity.setype, crmentity.modifiedtime,
>> activity.subject,case when ( users.user_name not like '') then
>> users.user_name else groups.groupname end as user_name,
>> activity.date_start
>> FROM crmentity INNER JOIN activity ON crmentity.crmid =
>> activity.activityid
>> and crmentity.deleted = 0
>> LEFT JOIN activitygrouprelation ON activitygrouprelation.activityid =
>> crmentity.crmid
>> LEFT JOIN groups ON groups.groupname = activitygrouprelation.groupname
>> LEFT join users ON crmentity.smownerid= users.id
>> WHERE
>> to_tsvector(' en', for_fts( activity.subject)) @@ to_tsquery(' en',
>> replace(' Dhaka University of Bangladesh:*', ' ',':* & '))
>> or
>> to_tsvector(' en', for_fts( crmentity.description)) @@ to_tsquery(' en',
>> replace(' Dhaka University of Bangladesh:*', ' ',':* & '))
>> ORDER BY crmentity.modifiedtime DESC LIMIT 100
>>
>> QUERY PLAN
>>
>>
>> ------------------------------------------------------------------------------------------------------------------------------------------
>> Limit (cost=112724.54..112724.54 rows=1 width=99)
>> -> Sort (cost=112724.54..112724.54 rows=1 width=99)
>> Sort Key: crmentity.modifiedtime
>> -> Nested Loop Left Join (cost=0.00..112724.53 rows=1 width=99)
>> -> Nested Loop Left Join (cost=0.00..112724.24 rows=1 width=82)
>> -> Nested Loop Left Join (cost=0.00..112723.96 rows=1 width=79)
>> -> Nested Loop (cost=0.00..112723.68 rows=1 width=56)
>> Join Filter: ((to_tsvector('en'::regconfig,
>> regexp_replace((activity.subject)::text,
>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ''::text, 'gs'::text)) @@
>> '''
>> Dhaka'':* & ''univers'':* & ''bangladesh'':*'::tsquery) OR
>> (to_tsvector('en'::regconfig, regexp_replace(crmentity.description,
>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ' '::text, 'gs'::text)) @@
>> ''' Dhaka'':* & ''univers'':* & ''bangladesh'':*'::tsquery))
>> -> Index Scan using activity_pkey on activity (cost=0.00..10223.89
>> rows=343070 width=36)
>> -> Index Scan using crmentity_pkey on crmentity (cost=0.00..0.27 rows=1
>> width=151)
>> Index Cond: (crmentity.crmid = activity.activityid)
>> Filter: (crmentity.deleted = 0)
>> -> Index Scan using activitygrouprelation_activityid_idx on
>> activitygrouprelation (cost=0.00..0.27 rows=1 width=27)
>> Index Cond: (activitygrouprelation.activityid = crmentity.crmid)
>> -> Index Scan using groups_groupname_idx on groups (cost=0.00..0.27
>> rows=1
>> width=26)
>> Index Cond: ((groups.groupname)::text =
>> (activitygrouprelation.groupname)::text)
>> -> Index Scan using users_pkey on users (cost=0.00..0.27 rows=1 width=25)
>> Index Cond: (crmentity.smownerid = users.id)
>>
>>
>> The above query are not using fts indexes, even hang the server.
>>
>> But,
>>
>>
>> explain
>> SELECT crmentity.crmid, crmentity.setype, crmentity.modifiedtime,
>> activity.subject,case when ( users.user_name not like '') then
>> users.user_name else groups.groupname end as user_name,
>> activity.date_start
>> FROM crmentity INNER JOIN activity ON crmentity.crmid =
>> activity.activityid
>> and crmentity.deleted = 0
>> LEFT JOIN activitygrouprelation ON activitygrouprelation.activityid =
>> crmentity.crmid
>> LEFT JOIN groups ON groups.groupname = activitygrouprelation.groupname
>> LEFT join users ON crmentity.smownerid= users.id
>> WHERE
>> to_tsvector(' en', for_fts( activity.subject)) @@ to_tsquery(' en',
>> replace(' Dhaka University of Bangladesh:*', ' ',':* & '))
>> ORDER BY crmentity.modifiedtime DESC LIMIT 100
>>
>>
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> Limit (cost=1.46..1.47 rows=1 width=99) (actual time=0.824..0.824 rows=0
>> loops=1)
>> -> Sort (cost=1.46..1.47 rows=1 width=99) (actual time=0.819..0.819
>> rows=0
>> loops=1)
>> Sort Key: crmentity.modifiedtime
>> Sort Method: quicksort Memory: 17kB
>> -> Nested Loop Left Join (cost=0.27..1.45 rows=1 width=99) (actual
>> time=0.752..0.752 rows=0 loops=1)
>> -> Nested Loop Left Join (cost=0.27..1.17 rows=1 width=82) (actual
>> time=0.750..0.750 rows=0 loops=1)
>> -> Nested Loop Left Join (cost=0.27..0.88 rows=1 width=79) (actual
>> time=0.748..0.748 rows=0 loops=1)
>> -> Nested Loop (cost=0.27..0.60 rows=1 width=56) (actual
>> time=0.746..0.746
>> rows=0 loops=1)
>> -> Bitmap Heap Scan on activity (cost=0.27..0.30 rows=1 width=36) (actual
>> time=0.744..0.744 rows=0 loops=1)
>> Recheck Cond: (to_tsvector('en'::regconfig,
>> regexp_replace((subject)::text,
>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ' '::text,
>> 'gs'::text)) @@ ''' Dhaka'':* & ''univers'':* &
>> ''bangladesh'':*'::tsquery)
>> -> Bitmap Index Scan on ftx_en_activity_subject (cost=0.00..0.27 rows=1
>> width=0) (actual time=0.740..0.740 rows=0 loops=1)
>> Index Cond: (to_tsvector('en'::regconfig, regexp_replace((subject)::text,
>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ' '::te
>> xt, 'gs'::text)) @@ ''' Dhaka'':* & ''univers'':* &
>> ''bangladesh'':*'::tsquery)
>> -> Index Scan using crmentity_pkey on crmentity (cost=0.00..0.29 rows=1
>> width=24) (never executed)
>> Index Cond: (crmentity.crmid = activity.activityid)
>> Filter: (crmentity.deleted = 0)
>> -> Index Scan using activitygrouprelation_activityid_idx on
>> activitygrouprelation (cost=0.00..0.27 rows=1 width=27) (never executed)
>> Index Cond: (activitygrouprelation.activityid = crmentity.crmid)
>> -> Index Scan using groups_groupname_idx on groups (cost=0.00..0.27
>> rows=1
>> width=26) (never executed)
>> Index Cond: ((groups.groupname)::text =
>> (activitygrouprelation.groupname)::text)
>> -> Index Scan using users_pkey on users (cost=0.00..0.27 rows=1 width=25)
>> (never executed)
>> Index Cond: (crmentity.smownerid = users.id)
>> Total runtime: 1.188 ms
>>
>>
>>
>>
>> explain
>> SELECT crmentity.crmid, crmentity.setype, crmentity.modifiedtime,
>> activity.subject,case when ( users.user_name not like '') then
>> users.user_name else groups.groupname end as user_name,
>> activity.date_start
>> FROM crmentity INNER JOIN activity ON crmentity.crmid =
>> activity.activityid
>> and crmentity.deleted = 0
>> LEFT JOIN activitygrouprelation ON activitygrouprelation.activityid =
>> crmentity.crmid
>> LEFT JOIN groups ON groups.groupname = activitygrouprelation.groupname
>> LEFT join users ON crmentity.smownerid= users.id
>> WHERE
>> to_tsvector(' en', for_fts( crmentity.description)) @@ to_tsquery(' en',
>> replace(' Dhaka University of Bangladesh:*', ' ',':* & '))
>> ORDER BY crmentity.modifiedtime DESC LIMIT 100
>>
>> QUERY PLAN
>>
>>
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> Limit (cost=1.50..1.51 rows=1 width=99) (actual time=5.044..5.047 rows=1
>> loops=1)
>> -> Sort (cost=1.50..1.51 rows=1 width=99) (actual time=5.041..5.042
>> rows=1
>> loops=1)
>> Sort Key: crmentity.modifiedtime
>> Sort Method: quicksort Memory: 17kB
>> -> Nested Loop Left Join (cost=0.27..1.49 rows=1 width=99) (actual
>> time=4.998..5.012 rows=1 loops=1)
>> -> Nested Loop Left Join (cost=0.27..1.19 rows=1 width=82) (actual
>> time=4.952..4.961 rows=1 loops=1)
>> -> Nested Loop Left Join (cost=0.27..0.90 rows=1 width=79) (actual
>> time=4.949..4.956 rows=1 loops=1)
>> -> Nested Loop (cost=0.27..0.60 rows=1 width=56) (actual
>> time=4.943..4.948
>> rows=1 loops=1)
>> -> Bitmap Heap Scan on crmentity (cost=0.27..0.30 rows=1 width=24)
>> (actual
>> time=4.727..4.799 rows=3 loops=1)
>> Recheck Cond: (to_tsvector('en'::regconfig, regexp_replace(description,
>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ' '::text, 'gs'
>> ::text)) @@ ''' Dhaka'':* & ''univers'':* & ''bangladesh'':*'::tsquery)
>> Filter: (deleted = 0)
>> -> Bitmap Index Scan on ftx_en_crmentity_description (cost=0.00..0.27
>> rows=1 width=0) (actual time=4.687..4.687 rows=3 loops=1)
>> Index Cond: (to_tsvector('en'::regconfig, regexp_replace(description,
>> '(&[^;]+;)|(<[^>]+>)|([\\s\\r\\n\\t]+)'::text, ' '::text,
>> 'gs'::text)) @@ ''' Dhaka'':* & ''univers'':* &
>> ''bangladesh'':*'::tsquery)
>> -> Index Scan using activity_pkey on activity (cost=0.00..0.29 rows=1
>> width=36) (actual time=0.043..0.043 rows=0 loops=3)
>> Index Cond: (activity.activityid = crmentity.crmid)
>> -> Index Scan using activitygrouprelation_activityid_idx on
>> activitygrouprelation (cost=0.00..0.29 rows=1 width=27) (actual
>> time=0.003..0.003
>> rows=0 loops=1)
>> Index Cond: (activitygrouprelation.activityid = crmentity.crmid)
>> -> Index Scan using groups_groupname_idx on groups (cost=0.00..0.27
>> rows=1
>> width=26) (actual time=0.001..0.001 rows=0 loops=1)
>> Index Cond: ((groups.groupname)::text =
>> (activitygrouprelation.groupname)::text)
>> -> Index Scan using users_pkey on users (cost=0.00..0.29 rows=1 width=25)
>> (actual time=0.033..0.035 rows=1 loops=1)
>> Index Cond: (crmentity.smownerid = users.id)
>> Total runtime: 5.229 ms
>> (22 rows)
>>
>>
>>
>> \d crmentity
>> Table "public.crmentity"
>> Column | Type | Modifiers
>> --------------+-----------------------------+--------------------
>> crmid | integer | not null
>> smcreatorid | integer | not null default 0
>> smownerid | integer | not null default 0
>> modifiedby | integer | not null default 0
>> setype | character varying(30) | not null
>> description | text |
>> createdtime | timestamp without time zone | not null
>> modifiedtime | timestamp without time zone | not null
>> viewedtime | timestamp without time zone |
>> status | character varying(50) |
>> version | integer | not null default 0
>> presence | integer | default 1
>> deleted | integer | not null default 0
>> Indexes:
>> "crmentity_pkey" PRIMARY KEY, btree (crmid)
>> "crmentity_createdtime_idx" btree (createdtime)
>> "crmentity_modifiedby_idx" btree (modifiedby)
>> "crmentity_modifiedtime_idx" btree (modifiedtime)
>> "crmentity_smcreatorid_idx" btree (smcreatorid)
>> "crmentity_smownerid_idx" btree (smownerid)
>> "ftx_en_crmentity_description" gin (to_tsvector('vcrm_en'::regconfig,
>> for_fts(description)))
>> "crmentity_deleted_idx" btree (deleted)
>> Referenced by:
>> TABLE "service" CONSTRAINT "fk_1_service" FOREIGN KEY (serviceid)
>> REFERENCES crmentity(crmid) ON DELETE CASCADE
>> TABLE "cc2crmentity" CONSTRAINT "fk_cc2crmentity_crmentity" FOREIGN KEY
>> (crm_id) REFERENCES crmentity(crmid) ON UPDATE CASCADE ON DELETE CASCADE
>>
>>
>> \d activity
>>
>> Table "public.activity"
>> Column | Type | Modifiers
>>
>> ------------------+------------------------+-------------------------------------------
>> activityid | integer | not null default 0
>> subject | character varying(250) | not null
>> semodule | character varying(20) |
>> activitytype | character varying(200) | not null
>> date_start | date | not null
>> due_date | date |
>> time_start | character varying(50) |
>> time_end | character varying(50) |
>> sendnotification | character varying(3) | not null default '0'::character
>> varying
>> duration_hours | character varying(2) |
>> duration_minutes | character varying(200) |
>> status | character varying(200) |
>> eventstatus | character varying(200) |
>> priority | character varying(200) |
>> location | character varying(150) |
>> notime | character varying(3) | not null default '0'::character varying
>> visibility | character varying(50) | not null default 'all'::character
>> varying
>> recurringtype | character varying(200) |
>> end_date | date |
>> end_time | character varying(50) |
>> Indexes:
>> "activity_pkey" PRIMARY KEY, btree (activityid)
>> "activity_activitytype_idx" btree (activitytype)
>> "activity_date_start_idx" btree (date_start)
>> "activity_due_date_idx" btree (due_date)
>> "activity_eventstatus_idx" btree (eventstatus)
>> "activity_status_idx" btree (status)
>> "activity_subject_idx" btree (subject)
>> "activity_time_start_idx" btree (time_start)
>> "ftx_en_activity_subject" gin (to_tsvector('vcrm_en'::regconfig,
>> for_fts(subject::text)))
>>
>>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tobias Brox 2010-11-29 14:32:16 Re: Full Text index is not using during OR operation
Previous Message Oleg Bartunov 2010-11-29 12:37:54 Re: Full Text index is not using during OR operation