From: | Artur Zakirov <zaartur(at)gmail(dot)com> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | tomas(dot)vondra(at)2ndquadrant(dot)com, matti(dot)linnanvuori(at)portalify(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, noreply(at)postgresql(dot)org |
Subject: | Re: BUG #16337: Finnish Ispell dictionary cannot be created |
Date: | 2020-04-14 03:44:44 |
Message-ID: | 7ce82fff-5368-47b8-671e-31ea340b0cde@gmail.com |
Lists: | pgsql-bugs |
Hello Horiguchi-san,
On 4/13/2020 5:36 PM, Kyotaro Horiguchi wrote:
> Looking at man 5 ispell, "Any character with special meaning to parser
> can be changed to an uninterpreted token by backslashing it". It
> depends on how strict we should be about that, but I think it is safer
> to treat any character prefixed by a backslash as a word
> character. (I don't understand how '-' can be in a word by the
> definition in the .affix file, though.)
>
> Since an escaped character is intended to be part of a word, there's
> no point in identifying the minus sign in an ad hoc way, I think.
Thank you for looking at the patch.
I don't mind if the patch covers broader cases. However, I previously
tested the ispell utility with characters other than '-'. It seems
that it either ignores such affixes or doesn't work properly. Still, in
general it may be better to stick closer to the man page description.
I attached a new version of the patch. It handles escaping only in the
PAE_INFIND and PAE_INREPL cases. I think we shouldn't allow escaping in
all cases, and it is safer to have some exceptions:
- In PAE_WAIT_MASK we shouldn't escape a comment string, which starts with '#'
- The PAE_INMASK case is handled separately by regcomp.c, and it may be
better to leave the string as-is
- PAE_WAIT_FIND can start only with '-'
- I don't think there is any point in escaping in PAE_WAIT_REPL
And in PAE_INFIND and PAE_INREPL I think we shouldn't allow ',' and '#'
to be escaped.
The condition:
    if (t_iseq(str, '\\') && !isescaped &&
        (state == PAE_INFIND || state == PAE_INREPL))
may not be great, but I cannot come up with a better solution.
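To illustrate the intent of that condition, here is a minimal standalone sketch. It is not the code from the patch: copy_affix_field() is a hypothetical helper, it uses plain isalpha()/isspace() instead of the multibyte-aware t_isalpha()/t_isspace(), and it returns false where the real parser would report a syntax error. It only shows a backslash turning the next character into a word character inside a FIND or REPL field, while ',' and '#' keep their special meaning even after a backslash.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption, not part of the patch): copy one
 * FIND or REPL field from src into dst.
 *
 * - A backslash makes the following character a word character.
 * - ',' and '#' always end the field, escaped or not.
 * - Unescaped whitespace ends the field.
 * - Any other unescaped non-alphabetic character is a syntax error.
 *
 * Returns true on success, false on a syntax error or if dst is too small.
 */
bool
copy_affix_field(const char *src, char *dst, size_t dstsize)
{
    bool        isescaped = false;
    size_t      n = 0;

    assert(src != NULL && dst != NULL && dstsize > 0);

    for (; *src != '\0'; src++)
    {
        unsigned char c = (unsigned char) *src;

        /* An unescaped backslash only marks the next character. */
        if (c == '\\' && !isescaped)
        {
            isescaped = true;
            continue;
        }

        /* ',' and '#' cannot be escaped: they always end the field. */
        if (strchr(",#", c) != NULL)
            break;

        if (!isescaped && !isalpha(c))
        {
            if (isspace(c))
                break;          /* end of field */
            return false;       /* e.g. a bare '-' inside the field */
        }

        if (n + 1 >= dstsize)
            return false;       /* no room for the character plus '\0' */

        dst[n++] = (char) c;
        isescaped = false;
    }

    dst[n] = '\0';
    return true;
}
```

With this sketch, a field written as \-a in the .affix file is copied as "-a", while a bare -a is rejected, and a backslash in front of ',' does not stop the comma from ending the field.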
--
Artur
Attachment | Content-Type | Size |
---|---|---|
tsearch_escape_hyphen_v2.patch | text/plain | 3.4 KB |