Re: BUG #16337: Finnish Ispell dictionary cannot be created

From: Artur Zakirov <zaartur(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: matti(dot)linnanvuori(at)portalify(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, PG Bug reporting form <noreply(at)postgresql(dot)org>
Subject: Re: BUG #16337: Finnish Ispell dictionary cannot be created
Date: 2020-04-12 14:13:26
Message-ID: CAKNkYnxeHJJDkw3_s908oMgiv4pn0ODkqGXUxME0FvMDxhu0=g@mail.gmail.com
Lists: pgsql-bugs

On Fri, Apr 3, 2020 at 5:55 PM Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> I'm not sure if it's a valid ispell format (it might be, but I'm not
> very good in reading the ispell manpage). But if it is, we should fix
> the code to be able to read it.

I've attached a simple patch that fixes the PAE_INREPL state.

I don't fully understand the ispell manpage either. I've looked at the
ispell source code. They use yacc for parsing. I'm not good at yacc, but
it seems that the escape symbol is accepted in all fields. The patch,
however, fixes only the PAE_INREPL state.
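
To make the idea concrete, here is a rough sketch in Python of how a
backslash escape can be handled while reading the replacement field of a
rule. This is not the patch and not PostgreSQL's actual parser: the
function name parse_affix_rule and the returned tuple layout are
hypothetical, and it assumes the ispell rule shape "COND > [-STRIP,]ADD",
where an unescaped leading hyphen starts the part to strip.

```python
def parse_affix_rule(line):
    # Hypothetical helper: split "COND > [-STRIP,]ADD" into (cond, strip, add).
    # A backslash makes the next character literal, so "\-" is a plain
    # hyphen rather than the marker that starts the "-STRIP," part.
    cond_part, sep, repl_part = line.partition(">")
    if not sep:
        raise ValueError(f"missing '>' in rule: {line!r}")

    strip_chars, add_chars = [], []
    target = add_chars          # list that literal characters are appended to
    escaped = False
    for pos, ch in enumerate(repl_part.strip()):
        if escaped:
            target.append(ch)   # escaped character is always taken literally
            escaped = False
        elif ch == "\\":
            escaped = True      # remember the escape, emit nothing yet
        elif ch == "-" and pos == 0:
            target = strip_chars    # unescaped leading '-' starts STRIP
        elif ch == "," and target is strip_chars:
            target = add_chars      # ',' ends STRIP and starts ADD
        else:
            target.append(ch)
    if escaped:
        raise ValueError(f"dangling escape in rule: {line!r}")
    return cond_part.strip(), "".join(strip_chars), "".join(add_chars)


print(parse_affix_rule(r". > YLI\-"))   # ('.', '', 'YLI-')
print(parse_affix_rule("Y > -Y,IES"))   # ('Y', 'Y', 'IES')
```

The sketch handles the escape only in the replacement field, which
mirrors the scope of the patch; the condition field is passed through
untouched.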

I also ran some tests with the ispell utility. For simplicity I modified
the .aff file as follows:

flag *E:
. > YLI
. > YLI\-

And I got the following results:

word: ylijohdon
ok (derives from root JOHDON)

word: yli-johdon
ok (derives from root JOHDON)

word: yly-johdon
how about: yli-johdon

So hyphen escaping works in ispell. Here are the results for PostgreSQL
with the patch applied and the modified .aff file:

=# select ts_lexize('finnish_ispell', 'yli-johdon');
     ts_lexize
-------------------
 {johdon,johdossa}
=# select ts_lexize('finnish_ispell', 'ylijohdon');
     ts_lexize
-------------------
 {johdon,johdossa}

--
Artur

Attachment: tsearch_escape_hyphen.patch (text/x-patch, 2.8 KB)
