Re: [PROPOSAL] Improvements of Hunspell dictionaries support

From: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PROPOSAL] Improvements of Hunspell dictionaries support
Date: 2015-11-06 09:33:43
Lists: pgsql-hackers

Hello again!


I have implemented support for the FLAG long, FLAG num and AF parameters.
The patch is attached to this e-mail (hunspell-dict.patch).

This patch makes it possible to use the Hunspell dictionaries listed in
the previous e-mail: ar, br_fr, ca, ca_valencia, en_ca, en_gb, en_us, fr,
gl_es, hu_hu, is, ne_np, nl_nl, si_lk.

Most of the changes are in spell.c, in the affix file parsing code.
The dictionary structures changed as follows:
- useFlagAliases and flagMode fields have been added to the IspellDict
struct;
- the flagval array size has been increased from 256 to 65000;
- the flag field of the AFFIX struct has also been widened.
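
For readers unfamiliar with these parameters, here is a minimal sketch of
how the three flag encodings and AF aliases could be decoded. It is written
in Python for brevity and is not the code from the patch. The function
names and the 16-bit packing used for FLAG long are assumptions made for
illustration only; the 1..65000 range for FLAG num follows the Hunspell
documentation.

```python
def decode_flags(flag_field, mode="char"):
    """Split a flag field into integer flag values.

    mode is "char" (default, one character per flag), "long" (FLAG long,
    two characters per flag) or "num" (FLAG num, comma-separated decimals).
    """
    if mode == "char":
        return [ord(c) for c in flag_field]
    if mode == "long":
        if len(flag_field) % 2 != 0:
            raise ValueError("FLAG long needs an even number of characters")
        # Assumption: pack the two characters into one 16-bit value.
        return [(ord(flag_field[i]) << 8) | ord(flag_field[i + 1])
                for i in range(0, len(flag_field), 2)]
    if mode == "num":
        values = [int(part) for part in flag_field.split(",") if part]
        for value in values:
            if not 1 <= value <= 65000:
                raise ValueError("numeric flag out of range: %d" % value)
        return values
    raise ValueError("unknown flag mode: %r" % mode)


def parse_aliases(aff_lines):
    """Collect AF lines: the first gives the count, the rest the flag sets."""
    aliases = []
    seen_count = False
    for line in aff_lines:
        parts = line.split()
        if len(parts) < 2 or parts[0] != "AF":
            continue
        if not seen_count:
            seen_count = True  # "AF <count>" line carries no flag set
            continue
        aliases.append(parts[1])
    return aliases


def resolve_flags(flag_field, mode, aliases):
    """With AF aliases, the flag field is a 1-based index into the table."""
    if aliases:
        flag_field = aliases[int(flag_field) - 1]
    return decode_flags(flag_field, mode)
```

With FLAG num, for example, decode_flags("101,2000", "num") yields two
flags, which is why a 256-entry flagval array is no longer sufficient.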

I have also implemented a patch that fixes an error from the e-mail
This patch just ignores that error.


Extension test dictionaries for loading into PostgreSQL and for
normalizing with the ts_lexize function can be downloaded from
It would be nice if somebody could do additional tests with dictionaries
for well-known languages, because I do not know many of them.

Other Improvements

There are also some parameters for compound words, but I am not sure
that we want to use these parameters.

Artur Zakirov
Postgres Professional:
Russian Postgres Company

Attachment Content-Type Size
hunspell-dict.patch text/x-patch 15.7 KB
hunspell-dict-da_dk.patch text/x-patch 883 bytes
