Unicode escapes with any backend encoding

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc:	Chapman Flack <chap(at)anastigmatix(dot)net>
Subject:	Unicode escapes with any backend encoding
Date:	2020-01-13 23:31:56
Message-ID:	2393.1578958316@sss.pgh.pa.us
Lists:	pgsql-hackers

I threatened to do this in another thread [1], so here it is.

This patch removes the restriction that the server encoding must
be UTF-8 in order to write any Unicode escape with a value outside
the ASCII range. Instead, we'll allow the notation and convert to
the server encoding if that's possible. (If it isn't, of course
you get an encoding conversion failure.)
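
To make the intended behavior concrete, here is a small Python sketch of the semantics described above. It is not the patch and does not call PostgreSQL's conversion routines: the function name `unicode_escape_to_server` is hypothetical, and Python's codecs merely stand in for the server's encoding conversion procedures.

```python
def unicode_escape_to_server(code_point: int, server_encoding: str) -> bytes:
    """Illustrative only: encode one escaped code point in the server encoding.

    Hypothetical helper; Python codecs stand in for the server's
    encoding conversion procedures.
    """
    # Reject values that are not usable Unicode scalar values
    # (zero, out of range, or a lone surrogate).
    if code_point <= 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("invalid Unicode escape value")
    # Conversion either succeeds or raises UnicodeEncodeError, which plays
    # the role of the encoding conversion failure mentioned above.
    return chr(code_point).encode(server_encoding)


# U+00E9 exists in LATIN1, so the escape is accepted and converted.
print(unicode_escape_to_server(0x00E9, "latin-1"))

# U+20AC (euro sign) does not exist in LATIN1, so conversion fails.
try:
    unicode_escape_to_server(0x20AC, "latin-1")
except UnicodeEncodeError as exc:
    print("conversion failure:", exc.reason)
```

The point of the sketch is only that the outcome depends on whether the target encoding has the character, not on whether the server encoding happens to be UTF-8.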

In the cases that were already supported, namely ASCII characters
or UTF-8 server encoding, this should be only immeasurably slower
than before. Otherwise, it calls the appropriate encoding conversion
procedure, which of course will take a little time. But that's
better than failing, surely.
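
The cost argument above amounts to a three-way dispatch, which might be sketched as follows. This is an assumption-laden illustration in Python, not the code in the patch: `encode_escape` and the path labels are hypothetical names, and the ASCII shortcut assumes the server encoding is ASCII-compatible.

```python
import codecs


def encode_escape(code_point: int, server_encoding: str) -> tuple[str, bytes]:
    """Illustrative only: return (path taken, encoded bytes) for one escape."""
    # Previously supported case 1: ASCII needs no conversion at all
    # (assumes an ASCII-compatible server encoding).
    if code_point < 0x80:
        return "ascii", bytes([code_point])
    # Previously supported case 2: UTF-8 server encoding, encode directly.
    if codecs.lookup(server_encoding).name == "utf-8":
        return "utf8", chr(code_point).encode("utf-8")
    # New case: go through an encoding conversion, which costs a little
    # more and raises UnicodeEncodeError if the character is unavailable.
    return "convert", chr(code_point).encode(server_encoding)


for cp, enc in [(0x41, "latin-1"), (0xE9, "UTF8"), (0xE9, "latin-1")]:
    print(hex(cp), enc, encode_escape(cp, enc))
```

Only the last branch does work that was not done before, which is why the already-supported cases should see essentially no slowdown.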

One way in which this is slightly less good than before is that
you no longer get a syntax error cursor pointing at the problematic
escape when conversion fails. If we were really excited about that,
something could be done with setting up an errcontext stack entry.
But that would add a few cycles, so I wasn't sure whether to do it.

Grepping for other direct uses of unicode_to_utf8(), I notice that
there are a couple of places in the JSON code where we have a similar
restriction that you can only write a Unicode escape in UTF-8 server
encoding. I'm not sure whether these same semantics could be
applied there, so I didn't touch that.

Thoughts?

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CACPNZCvaoa3EgVWm5yZhcSTX6RAtaLgniCPcBVOCwm8h3xpWkw%40mail.gmail.com

Attachment	Content-Type	Size
unicode-escapes-with-other-server-encodings-1.patch	text/x-diff	17.4 KB
