Re: Mac OS: invalid byte sequence for encoding "UTF8"

From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Mac OS: invalid byte sequence for encoding "UTF8"
Date: 2016-01-27 10:46:20
Message-ID: CACACo5TvwOJ_7xbsyf8MPF2kkTfQ6knXerRcJN3DYKpEruX9Vw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 27, 2016 at 10:59 AM, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
wrote:

> Hello.
>
> When a user tries to create a text search dictionary for the Russian
> language on Mac OS, the following error is raised:
>
> CREATE EXTENSION hunspell_ru_ru;
> + ERROR: invalid byte sequence for encoding "UTF8": 0xd1
> + CONTEXT: line 341 of configuration file
> "/Users/stas/code/postgrespro2/tmp_install/Users/stas/code/postgrespro2/install/share/tsearch_data/ru_ru.affix":
> "SFX Y хаться шутся хаться
>
> The Russian dictionary was downloaded from
> http://extensions.openoffice.org/en/project/slovari-dlya-russkogo-yazyka-dictionaries-russian
> The affix and dictionary files were extracted from the archive and converted
> to UTF-8. A converted dictionary can also be downloaded from
> https://github.com/select-artur/hunspell_dicts/tree/master/ru_ru

If the file was converted to UTF-8, why does it still use the "SET KOI8-R"
directive?
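
For what it's worth, a quick way to cross-check the converted file is to
decode it line by line and compare the result with its SET directive. The
following is only a minimal sketch, not what PostgreSQL itself does when it
loads the file; the function name check_utf8 is made up for illustration.
Note that 0xd1 on its own is a legitimate UTF-8 lead byte for Cyrillic
("х" is 0xD1 0x85), so a file that passes this check is valid UTF-8 even
though the server complains about that byte.

```python
def check_utf8(data: bytes):
    """Scan the raw bytes of an affix file.

    Returns (set_directive, bad_lines), where set_directive is the value of
    the first "SET <encoding>" line (or None), and bad_lines is a list of
    (line_number, byte_offset, byte_value) for every line that is not valid
    UTF-8. check_utf8 is a hypothetical helper, not part of any library.
    """
    set_directive = None
    bad_lines = []
    for lineno, raw in enumerate(data.splitlines(), start=1):
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # Record where decoding failed and which byte caused it.
            bad_lines.append((lineno, exc.start, raw[exc.start]))
            continue
        parts = text.split()
        if set_directive is None and len(parts) >= 2 and parts[0] == "SET":
            set_directive = parts[1]
    return set_directive, bad_lines
```

To use it, read the file in binary mode, for example
check_utf8(open("ru_ru.affix", "rb").read()), where the path is an assumed
local copy. An empty bad_lines list together with a "KOI8-R" directive would
mean the contents are UTF-8 but the declared encoding was never updated.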

> This behavior occurs on:
> - Mac OS X 10.10 Yosemite and Mac OS X 10.11 El Capitan.
> - latest PostgreSQL version from git and PostgreSQL 9.5 (probably also on
> 9.4.5).
>
> A test program that reproduces this bug is attached.
>

What error message do you get with this test program? (I don't get any,
but I'm not on Mac OS.)

--
Alex
