| From: | Sushant Sinha <sushant354(at)gmail(dot)com> | 
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org | 
| Cc: | shamnad(at)gmail(dot)com | 
| Subject: | dot to be considered as a word delimiter? | 
| Date: | 2009-05-30 05:59:29 | 
| Message-ID: | 1243663169.12123.244.camel@dragflick | 
| Lists: | pgsql-hackers | 
Currently it seems that a dot is not considered a word delimiter
by the English parser.
lawdb=# select to_tsvector('english', 'Mr.J.Sai Deepak');
       to_tsvector       
-------------------------
 'deepak':2 'mr.j.sai':1
(1 row)
So the word obtained is "mr.j.sai" rather than the three words "mr", "j",
and "sai".
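To see how the parser arrives at this, ts_debug can list the token type it assigns to each piece of the input. A minimal sketch, assuming PostgreSQL 8.3 or later and the built-in 'english' configuration:

```sql
-- One row per token: the token type alias, its description,
-- the raw token text, and the lexemes the dictionaries produced.
SELECT alias, description, token, lexemes
FROM ts_debug('english', 'Mr.J.Sai Deepak');
```

The alias column shows whether "Mr.J.Sai" was classified as an ordinary word or as some other token type, which is what decides whether it gets split.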
It works correctly if there is a space after each dot, as a space is
definitely a word delimiter.
lawdb=# select to_tsvector('english', 'Mr. J. Sai Deepak');
           to_tsvector           
---------------------------------
 'j':2 'mr':1 'sai':3 'deepak':4
(1 row)
I think that a dot should be considered a word delimiter because
when a dot is not followed by a space, most of the time it is a typing
error. Besides, there are not many valid English words that have a dot in
between.
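Until the parser behaves this way, one possible stopgap is to turn dots into spaces before the text reaches the parser. This is only a sketch of a workaround, not a parser change, and it assumes the documents contain no dotted tokens worth keeping intact (host names, file names, decimal numbers):

```sql
-- Replace every dot with a space so the parser sees separate words.
-- Assumption: losing dotted tokens such as host names is acceptable.
SELECT to_tsvector('english', replace('Mr.J.Sai Deepak', '.', ' '));
```

This should give the same lexemes as the spaced input above. The same preprocessing would have to be applied to search strings before they are turned into queries, otherwise a search for "Mr.J.Sai" would still look for the unsplit token.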
-Sushant.