Unicode escapes with any backend encoding

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Chapman Flack <chap(at)anastigmatix(dot)net>
Subject: Unicode escapes with any backend encoding
Date: 2020-01-13 23:31:56
Message-ID: 2393.1578958316@sss.pgh.pa.us
Lists: pgsql-hackers

I threatened to do this in another thread [1], so here it is.

This patch removes the restriction that the server encoding must
be UTF-8 in order to write any Unicode escape with a value outside
the ASCII range. Instead, we'll allow the notation and convert to
the server encoding if that's possible. (If it isn't, of course
you get an encoding conversion failure.)

In the cases that were already supported, namely ASCII characters
or UTF-8 server encoding, this should be only immeasurably slower
than before. Otherwise, it calls the appropriate encoding conversion
procedure, which of course will take a little time. But that's
better than failing, surely.
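To make the intended semantics concrete, here is a minimal Python sketch of the decision described above. It is not the patch's C code: the function name `unicode_escape_to_server` is hypothetical, Python codec names stand in for PostgreSQL encoding names, and Python's `UnicodeEncodeError` stands in for the server's encoding conversion failure.

```python
import codecs


def unicode_escape_to_server(codepoint: int, server_encoding: str) -> bytes:
    """Return the bytes for one Unicode escape in the given encoding.

    Hypothetical helper for illustration only; not a PostgreSQL function.
    """
    # Reject values that are not valid Unicode scalar values.
    if not (0 <= codepoint <= 0x10FFFF) or 0xD800 <= codepoint <= 0xDFFF:
        raise ValueError(f"invalid Unicode code point: {codepoint:#x}")

    # Previously supported case 1: ASCII. Server encodings are ASCII
    # supersets, so no conversion is needed.
    if codepoint < 0x80:
        return bytes([codepoint])

    # Previously supported case 2: the server encoding is UTF-8.
    if codecs.lookup(server_encoding).name == "utf-8":
        return chr(codepoint).encode("utf-8")

    # New case: convert to the server encoding. This raises
    # UnicodeEncodeError if the character has no representation there.
    return chr(codepoint).encode(server_encoding)
```

With this sketch, U+00E9 converts to a single byte under `latin-1`, while U+20AC fails under `latin-1` and succeeds under `iso8859-15`.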

One way in which this is slightly less good than before is that
you no longer get a syntax error cursor pointing at the problematic
escape when conversion fails. If we were really excited about that,
something could be done with setting up an errcontext stack entry.
But that would add a few cycles, so I wasn't sure whether to do it.
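As a rough illustration of what pointing at the problematic escape could look like, the Python sketch below wraps each conversion so that a failure carries the escape's offset in the literal. This only mimics the effect; it is not PostgreSQL's errcontext mechanism. The names `convert_literal` and `EscapeConversionError` are hypothetical, and the scanner is deliberately simplified (only `\uXXXX`, no doubled backslashes, no surrogate pairs).

```python
import re

# Matches a backslash, 'u', and exactly four hex digits.
ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


class EscapeConversionError(Exception):
    """Conversion failure that remembers where the escape started."""

    def __init__(self, position: int, cause: Exception):
        super().__init__(f"cannot convert escape at offset {position}: {cause}")
        self.position = position


def convert_literal(literal: str, server_encoding: str) -> bytes:
    """Expand \\uXXXX escapes and encode the result in server_encoding."""
    out = bytearray()
    last = 0
    for match in ESCAPE.finditer(literal):
        # Copy the ordinary text preceding this escape.
        out += literal[last:match.start()].encode(server_encoding)
        try:
            out += chr(int(match.group(1), 16)).encode(server_encoding)
        except UnicodeEncodeError as exc:
            # The extra try/except is the analogue of the "few cycles"
            # spent on setting up error context for each escape.
            raise EscapeConversionError(match.start(), exc) from exc
        last = match.end()
    out += literal[last:].encode(server_encoding)
    return bytes(out)
```

For example, converting `r"price: \u20ac5"` to `latin-1` raises `EscapeConversionError` with `position` equal to 7, the offset of the backslash.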

Grepping for other direct uses of unicode_to_utf8(), I notice that
there are a couple of places in the JSON code where we have a similar
restriction that you can only write a Unicode escape in UTF-8 server
encoding. I'm not sure whether these same semantics could be
applied there, so I didn't touch that.

Thoughts?

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CACPNZCvaoa3EgVWm5yZhcSTX6RAtaLgniCPcBVOCwm8h3xpWkw%40mail.gmail.com

Attachment: unicode-escapes-with-other-server-encodings-1.patch (text/x-diff, 17.4 KB)
