[PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

From: Evan Jones <evan(dot)jones(at)datadoghq(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific
Date: 2023-06-05 15:26:56
Message-ID: CA+HWA9awUW0+RV_gO9r1ABZwGoZxPztcJxPy8vMFSTbTfi4jig@mail.gmail.com
Lists: pgsql-hackers

This patch fixes a rare parsing bug with Unicode characters on Mac OS X.
The problem is that isspace() on Mac OS X changes its behaviour with the
locale. The patch uses scanner_isspace() instead, which only returns true
for ASCII whitespace. Other parts of the Postgres code appear to have run
into this already, since a number of them use scanner_isspace(). However,
there are still a lot of other calls to isspace(). I'll try to take a quick
look to see if there might be other instances of this bug.
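
To illustrate the idea, here is a minimal sketch of a locale-independent
whitespace check and a token scanner built on it. The names ascii_isspace
and token_length are hypothetical and are not taken from the patch or from
the Postgres source; the exact set of characters that scanner_isspace()
accepts may differ from the set assumed here.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true only for ASCII whitespace.
 * Unlike isspace(), the result never depends on the current locale,
 * so bytes >= 0x80 (e.g. UTF-8 continuation bytes) are never whitespace.
 * The character set below is an assumption made for this sketch.
 */
static bool
ascii_isspace(unsigned char ch)
{
    return ch == ' ' || ch == '\t' || ch == '\n' ||
           ch == '\r' || ch == '\f' || ch == '\v';
}

/*
 * Hypothetical helper: number of bytes in the leading token of s,
 * where a token ends at the first ASCII whitespace byte or at the
 * terminating NUL.
 */
static size_t
token_length(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0' && !ascii_isspace((unsigned char) s[n]))
        n++;
    return n;
}
```

With this check, the two bytes of a multi-byte UTF-8 character stay inside
the token, whereas a locale-dependent isspace() could end the token in the
middle of the character.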

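The bug is that in the following hstore value, the Unicode character ą
(U+0105, encoded in UTF-8 as the bytes c4 85) loses its second byte, so the
key is stored as "key\xc4". This appears to happen because isspace() treats
the byte 0x85 as whitespace in the affected locale, so the key is parsed
incorrectly: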

select E'keyą=>value'::hstore;
hstore
-----------------
"keyą"=>"value"
(1 row)

select 'keyą=>value'::hstore::text::bytea;
bytea
----------------------------------
\x226b6579c4223d3e2276616c756522
(1 row)

The correct result should be:

hstore
-----------------
"keyą"=>"value"
(1 row)

That query is added to the regression test. The query works on Linux but
fails on Mac OS X.

For a more detailed explanation of how isspace() works on Mac OS X, see:
https://github.com/evanj/isspace_locale

Thanks!

Evan Jones

Attachment: hstore-isspace.patch (application/octet-stream, 2.5 KB)
