Re: [PROPOSAL] Improvements of Hunspell dictionaries support

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, Alexander Lebedev <a(dot)lebedev(at)postgrespro(dot)ru>, Oleg Bartunov <obartunov(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PROPOSAL] Improvements of Hunspell dictionaries support
Date: 2016-01-09 02:38:35
Message-ID: 20160109023835.GA670563@alvherre.pgsql
Lists: pgsql-hackers

Artur Zakirov wrote:

> Now almost all dictionaries are loaded into PostgreSQL. But the da_dk
> dictionary does not load. I see the following error:
>
> ERROR: invalid regular expression: quantifier operand invalid
> CONTEXT: line 439 of configuration file
> "/home/artur/progs/pgsql/share/tsearch_data/da_dk.affix": "SFX 55 0 s
> +GENITIV
>
> If you open the affix file in an editor, you can see that affix 55 on
> line 439 has an incorrect format (screen1.png):

[ another email ]

> I have also implemented a patch that fixes the error from the e-mail
> http://www.postgresql.org/message-id/562E1073.8030805@postgrespro.ru
> This patch just ignores that error.

I think it's a bad idea to just ignore these syntax errors. This affix
file is effectively corrupt, after all, so I don't see why we should
have to cope with it. I think it would be better to raise the error
normally and instruct the user to fix the file; better still, the
upstream provider of the file should fix it.

Now, if there is proof somewhere that the file is correct, then the code
must cope in some reasonable way. But in any case I don't think this
change is acceptable ... it can only cause pain, in the long run.
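
To illustrate what I mean by raising the error and pointing the user at
the file, here is a minimal standalone sketch. It is not the spell.c
code: it uses POSIX regcomp() in place of pg_regcomp(), and the helper
name check_affix_mask() and its parameters are hypothetical, invented
only for this example. The idea is that a bad mask yields a message
naming the file and line, rather than being dropped silently.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of PostgreSQL: compile an affix mask with
 * POSIX regcomp() (standing in for pg_regcomp()).  On failure, write a
 * message naming the file and line into msg and return the nonzero regcomp
 * error code.  On success, set msg to the empty string and return 0.
 */
int
check_affix_mask(const char *filename, int lineno, const char *mask,
                 char *msg, size_t msglen)
{
    regex_t     re;
    char        errstr[100];
    int         err;

    err = regcomp(&re, mask, REG_EXTENDED | REG_NOSUB);
    if (err != 0)
    {
        /* Translate the error code into text and report where it came from */
        regerror(err, &re, errstr, sizeof(errstr));
        snprintf(msg, msglen,
                 "invalid regular expression \"%s\" at line %d of \"%s\": %s",
                 mask, lineno, filename, errstr);
        return err;
    }

    /* The mask compiled; release it, since this sketch only validates */
    regfree(&re);
    if (msglen > 0)
        msg[0] = '\0';
    return 0;
}
```

A caller would pass each mask read from the affix file and stop loading
the dictionary on the first nonzero result, showing msg to the user so
they know which line to fix or report upstream.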

> *** 429,443 **** NIAddAffix(IspellDict *Conf, int flag, char flagflags, const char *mask, const c
>           err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
>                            REG_ADVANCED | REG_NOSUB,
>                            DEFAULT_COLLATION_OID);
>           if (err)
> !         {
> !             char        errstr[100];
> !
> !             pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
> !             ereport(ERROR,
> !                     (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
> !                      errmsg("invalid regular expression: %s", errstr)));
> !         }
>       }
>
>       Affix->flagflags = flagflags;
> --- 429,437 ----
>           err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
>                            REG_ADVANCED | REG_NOSUB,
>                            DEFAULT_COLLATION_OID);
> +         /* Ignore regular expression error and do not add wrong affix */
>           if (err)
> !             return;
>       }
>
>       Affix->flagflags = flagflags;

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
