Re: pg should ignore u+200b zero width space

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: James Cloos <cloos(at)jhcloos(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: pg should ignore u+200b zero width space
Date: 2020-11-03 16:19:04
Message-ID: 922842.1604420344@sss.pgh.pa.us
Lists: pgsql-bugs

Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> On 03/11/2020 16:52, Tom Lane wrote:
>> Perhaps it'd be all right to confine the change in behavior to
>> just modifying the error text in cases where we were going to
>> throw an error anyway. But I think this is much harder than
>> it sounds to do in a valid, safe way.

> Yeah, my thinking was to just add a hint when you're throwing a syntax
> error anyway. Something simple like check if client_encoding is utf8 and
> there is a U+200b in the query string, and add the hint if so. It
> doesn't need to catch all cases, and rare false positives are OK too.

TBH, that's exactly the kind of under-baked solution I *don't* want.

For starters, zero width space (U+200B) is hardly the only problem here;
even more common is no-break space (&nbsp;), which exists in most
encodings, not only UTF8. And once we had that, there'd no doubt be
pressure to ignore BOMs. And so on.

Also I don't really want us throwing such errors when the funny space is
inside a literal or comment; chasing false positives like that will cost
users way more time than they could save when the hint is on-point.

regards, tom lane
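
[Editorial illustration, not part of the original message.] The trade-off
discussed above can be sketched in a few lines. This is not PostgreSQL code:
the function names are hypothetical, and the scanner is deliberately
simplified (it ignores dollar-quoted strings and backslash escapes in E'...'
strings). It compares the simple "is U+200B anywhere in the query string"
check with a check that skips quoted text and comments, which is roughly what
avoiding the false positives would require.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def naive_has_zwsp(query: str) -> bool:
    """The simple check: is U+200B anywhere in the query text?"""
    return ZWSP in query


def zwsp_outside_quotes_and_comments(query: str) -> bool:
    """Return True only if U+200B appears outside quoted text and comments.

    Simplified on purpose: dollar-quoted strings and backslash escapes in
    E'...' strings are not handled.
    """
    i, n = 0, len(query)
    while i < n:
        ch = query[i]
        if ch in ("'", '"'):
            # Skip a quoted string or identifier; a doubled quote is an escape.
            i += 1
            while i < n:
                if query[i] == ch:
                    if i + 1 < n and query[i + 1] == ch:
                        i += 2
                        continue
                    break
                i += 1
            i += 1  # step past the closing quote
        elif query.startswith("--", i):
            # Line comment: skip to the end of the line.
            end = query.find("\n", i)
            i = n if end == -1 else end + 1
        elif query.startswith("/*", i):
            # Block comment; PostgreSQL allows these to nest.
            depth, i = 1, i + 2
            while i < n and depth > 0:
                if query.startswith("/*", i):
                    depth, i = depth + 1, i + 2
                elif query.startswith("*/", i):
                    depth, i = depth - 1, i + 2
                else:
                    i += 1
        elif ch == ZWSP:
            return True
        else:
            i += 1
    return False


# Example inputs: the same character in three different contexts.
in_literal = "SELECT 'a\u200bb'"
in_comment = "SELECT 1 -- note\u200b\n"
bare = "SELECT\u200b1"
```

The naive check flags all three inputs, while the scanner flags only `bare`.
Even this small version has to track quoting and comment nesting, and a
complete one would need to follow every lexer rule, which gives a sense of why
a safe hint is harder than it sounds.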
