text search and "filenames"

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: text search and "filenames"
Date: 2007-10-25 13:47:40
Message-ID: 20071025134740.GK5661@alvh.no-ip.org
Lists: pgsql-hackers

Hi,

I noticed that the default parser does not recognize Windows-style
filenames:

alvherre=# SELECT alias, description, token FROM ts_debug(e'c:\\archivos');
   alias   |   description   |  token
-----------+-----------------+----------
 asciiword | Word, all ASCII | c
 blank     | Space symbols   | :\
 asciiword | Word, all ASCII | archivos
(3 rows)

I played with it a bit (see attached patch -- basically I added \ in all
places where a / was being parsed, in the file-path states) and managed
to get it to parse some simple cases, such as

alvherre=# SELECT alias, description, token FROM ts_debug(e'c:\\archivos\\foo');
 alias |    description    |      token
-------+-------------------+-----------------
 file  | File or path name | c:\archivos\foo
(1 row)

However, it fails as soon as the path contains a space, which is quite
common on Windows. For example:

alvherre=# SELECT alias, description, token FROM ts_debug(e'c:\\Program Files\\');
   alias   |    description    |   token
-----------+-------------------+------------
 file      | File or path name | c:\Program
 blank     | Space symbols     |
 asciiword | Word, all ASCII   | Files
 blank     | Space symbols     | \
(4 rows)
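
To illustrate the idea behind the change and why a space ends the token, here is a toy scanner in Python. It is only a sketch: it is not the attached patch and not the parser's actual state machine, and scan_path and NAME_CHARS are hypothetical names made up for this example. It assumes a path is an optional drive prefix followed by separator-delimited name components, with the set of separators passed in as a parameter.

```python
# Toy illustration only; this is NOT the state machine in PostgreSQL's
# default text search parser, and the names below are hypothetical.
import string

# Characters allowed inside a path component in this toy model.
NAME_CHARS = set(string.ascii_letters + string.digits + "._-")


def scan_path(text, separators):
    """Return the longest path-like prefix of text, or "" if there is none.

    A path here is an optional drive prefix such as "c:", an optional
    leading name, and then one or more separators, each optionally
    followed by name characters.
    """
    i = 0
    # Optional drive prefix: one ASCII letter followed by a colon.
    if len(text) >= 2 and text[0] in string.ascii_letters and text[1] == ":":
        i = 2
    # Optional leading name component, as in "foo/bar".
    while i < len(text) and text[i] in NAME_CHARS:
        i += 1
    seen_separator = False
    while i < len(text) and text[i] in separators:
        seen_separator = True
        i += 1
        # Consume the name component after the separator, if any.
        while i < len(text) and text[i] in NAME_CHARS:
            i += 1
    # Without at least one separator this is not treated as a path.
    return text[:i] if seen_separator else ""


if __name__ == "__main__":
    print(scan_path("c:\\archivos\\foo", "/"))       # only / is a separator
    print(scan_path("c:\\archivos\\foo", "/\\"))     # \ accepted as well
    print(scan_path("c:\\Program Files\\", "/\\"))   # stops at the space
```

With only / as a separator the first call finds no path; once \ is accepted the second call returns the whole string; the third call stops at c:\Program because a space is neither a separator nor a name character in this model. The toy makes no attempt to model how the real parser handles network names.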

It also fails to recognize "network" file names, like

alvherre=# SELECT alias, description, token FROM ts_debug(e'\\\\server\\archivos\\foo');
   alias   |   description   |  token
-----------+-----------------+----------
 blank     | Space symbols   | \\
 asciiword | Word, all ASCII | server
 blank     | Space symbols   | \
 asciiword | Word, all ASCII | archivos
 blank     | Space symbols   | \
 asciiword | Word, all ASCII | foo
(6 rows)

Is this something worth worrying about?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachment              | Content-Type | Size
tsearch-win-files.patch | text/x-diff  | 2.6 KB
