Re: BUG #13964: unexpected result from to_tsvector

From: Ruxandra Durus <Ruxandra(dot)Durus(at)vauban(dot)ro>
To: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13964: unexpected result from to_tsvector
Date: 2016-02-18 12:33:12
Message-ID: A11D563F5C691648B6B9BF9568EADB967A29466E@ExchangeVB.hub.vauban.ro
Lists: pgsql-bugs

Hello,

Thank you for your fast response, but I cannot install patches because I do not know how. I hope this fix will be included in a future version of PostgreSQL.

Thank you for your time,
Ruxandra Durus

-----Original Message-----
From: Artur Zakirov [mailto:a(dot)zakirov(at)postgrespro(dot)ru]
Sent: Thursday, February 18, 2016 12:54 PM
To: Ruxandra Durus; pgsql-bugs(at)postgresql(dot)org
Subject: Re: [BUGS] BUG #13964: unexpected result from to_tsvector

On 17.02.2016 11:00, ruxandra(dot)durus(at)vauban(dot)ro wrote:
>
> My version of PostgreSQL is:
> "PostgreSQL 9.5beta1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit"
>
> More details about the operating system:
> Linux javatesting 2.6.32-573.7.1.el6.x86_64 #1 SMP Tue Sep 22 22:00:00 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>
> I am using pgAdmin version 1.20.0 to query the database.
>
> I am using your full text search (which works great), but I have a small problem:
> SELECT to_tsvector('simple', 'test@vauban-reg.ro');
>
> returns "'test@vauban-reg.ro':1"
>
> which is exactly what I need.
>
>
> But when I run :
>
> SELECT to_tsvector('simple', 'test@123-reg.ro');
>
> I get:
> "'123':2 'reg.ro':3 'test':1"
>
> instead of "'test@123-reg.ro':1"
>
> From the documentation here
> http://www.postgresql.org/docs/current/static/pgtrgm.html , point
> F.30.4. I understood that with "simple" option only space is a
> separator for the stems. Is it a bug or am I doing something wrong?
>
> Thank you for your time,
> Ruxandra Durus
>

Hi,

It seems that this is a text search parser issue. More informative queries:

=> SELECT * FROM ts_debug('simple', 'test@vauban-reg.ro');
 alias |  description  |       token        | dictionaries | dictionary |       lexemes
-------+---------------+--------------------+--------------+------------+----------------------
 email | Email address | test@vauban-reg.ro | {simple}     | simple     | {test@vauban-reg.ro}
(1 row)

=> SELECT * FROM ts_debug('simple', 'test@123-reg.ro');
   alias   |   description    | token  | dictionaries | dictionary | lexemes
-----------+------------------+--------+--------------+------------+----------
 asciiword | Word, all ASCII  | test   | {simple}     | simple     | {test}
 blank     | Space symbols    | @      | {}           |            |
 uint      | Unsigned integer | 123    | {simple}     | simple     | {123}
 blank     | Space symbols    | -      | {}           |            |
 host      | Host             | reg.ro | {simple}     | simple     | {reg.ro}
(5 rows)

The attached patch fixes it. Is this a bug? Should I create an entry in the commitfest?

The patch also lets the parser handle the emails '123@123-reg.ro' and 'test@123_reg.ro' correctly.
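
Those two additional addresses can be checked with the same kind of ts_debug query as above. This is only a sketch: the shortened column list is my choice for readability, and no output is shown because it depends on whether the server is built with the patch.

=> SELECT alias, token, lexemes FROM ts_debug('simple', '123@123-reg.ro');
=> SELECT alias, token, lexemes FROM ts_debug('simple', 'test@123_reg.ro');

If the parser treats an address as a whole, one would expect a single row with alias "email", as in the 'test@vauban-reg.ro' case, rather than separate word, integer and host tokens.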

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com Russian Postgres Company
