pgsql: Avoid use of sscanf() to parse ispell dictionary files.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Avoid use of sscanf() to parse ispell dictionary files.
Date: 2016-02-11 00:30:52
Message-ID: E1aTf9o-0001ga-LG@gemulon.postgresql.org
Lists: pgsql-committers

Avoid use of sscanf() to parse ispell dictionary files.

It turns out that on FreeBSD-derived platforms (including OS X), the
*scanf() family of functions is pretty much brain-dead about multibyte
characters. In particular it will apply isspace() to individual bytes
of input even when those bytes are part of a multibyte character, thus
allowing false recognition of a field-terminating space.
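
To make the failure mode concrete, here is a minimal sketch of a scanner that tests each byte on its own. The predicate bytewise_isspace() is a hypothetical stand-in, not a libc function: it assumes, purely for illustration, a platform whose isspace() also accepts byte 0xA0 in a UTF-8 locale. That byte is the second half of the two-byte UTF-8 encoding of "à" (0xC3 0xA0), so the field appears to end in the middle of a character.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a libc isspace() that also classifies
 * byte 0xA0 as whitespace (an assumption made for illustration).
 */
static int
bytewise_isspace(unsigned char c)
{
    return c == 0xA0 || c == ' ' || (c >= '\t' && c <= '\r');
}

/*
 * Length in bytes of the first field of s when every byte is tested
 * individually, with no regard for multibyte character boundaries.
 */
static size_t
field_len_bytewise(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0' && !bytewise_isspace((unsigned char) s[n]))
        n++;
    return n;
}
```

With the input "\xC3\xA0" "bc d" (the word "àbc", a space, then "d"), this scanner reports a field length of 1 instead of 4, because it stops at the 0xA0 byte inside "à".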

We appear to have little alternative other than instituting a coding
rule that *scanf() is not to be used if the input string might contain
multibyte characters. (There was some discussion of relying on "%ls",
but that probably just moves the portability problem somewhere else,
and besides it doesn't fully prevent BSD *scanf() from using isspace().)

This patch is a down payment on that: it gets rid of use of sscanf()
to parse ispell dictionary files, which are certainly at great risk
of having a problem. The code is cleaner this way anyway, though
a bit longer.
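
As a rough sketch of what hand-written parsing can look like (this is not the code from the patch), the functions below step over whole UTF-8 characters and test for whitespace only on single-byte characters. All names here (utf8_char_len, is_ascii_space, field_len_mb, copy_field) are hypothetical, and the sketch assumes UTF-8 input; the real code in spell.c works with the server's own encoding-aware routines.

```c
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 character starting at s (1 if unrecognized). */
static size_t
utf8_char_len(const unsigned char *s)
{
    if (s[0] < 0x80)
        return 1;
    if ((s[0] & 0xE0) == 0xC0)
        return 2;
    if ((s[0] & 0xF0) == 0xE0)
        return 3;
    if ((s[0] & 0xF8) == 0xF0)
        return 4;
    return 1;
}

/* Whitespace test restricted to ASCII, independent of the locale. */
static int
is_ascii_space(unsigned char c)
{
    return c == ' ' || (c >= '\t' && c <= '\r');
}

/* Length in bytes of the first field, scanning character by character. */
static size_t
field_len_mb(const char *str)
{
    const unsigned char *s = (const unsigned char *) str;
    size_t      n = 0;

    while (s[n] != '\0')
    {
        size_t      len = utf8_char_len(s + n);
        size_t      i;

        /* Only a single-byte character can terminate the field. */
        if (len == 1 && is_ascii_space(s[n]))
            break;
        /* Step over the character, never past the terminator. */
        for (i = 0; i < len && s[n] != '\0'; i++)
            n++;
    }
    return n;
}

/*
 * Skip leading whitespace, copy the next field into dst, and return a
 * pointer to the rest of the line, or NULL if the field does not fit.
 */
static const char *
copy_field(const char *src, char *dst, size_t dstsize)
{
    size_t      len;

    while (is_ascii_space((unsigned char) *src))
        src++;
    len = field_len_mb(src);
    if (len >= dstsize)
        return NULL;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return src + len;
}
```

Calling copy_field() repeatedly on the returned pointer walks a line field by field. A continuation byte such as 0xA0 is never tested as whitespace, since it is consumed as part of the character that contains it.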

In passing, improve a few comments.

Report and patch by Artur Zakirov, reviewed and somewhat tweaked by me.
Back-patch to all supported branches.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/51e78ab4ff3282963f5e8ba2633040829413aefa

Modified Files
--------------
src/backend/tsearch/spell.c | 166 ++++++++++++++++++++++++++++++++++++++++----
1 file changed, 153 insertions(+), 13 deletions(-)