Re: pg_conversion seems rather strangely defined

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: noah(at)leadboat(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_conversion seems rather strangely defined
Date: 2016-01-08 04:35:25
Message-ID: 2053.1452227725@sss.pgh.pa.us
Lists: pgsql-hackers

Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
>> It would only be important to be able to do it like that if different
>> users of the same database had conflicting ideas about what was the
>> correct conversion between client and database encodings. I submit
>> that that's somewhere around epsilon probability, considering we have
>> not even heard of anyone replacing the system conversions at all.

> I used to have a customer who needed a conversion between client and
> database encodings different from the default one. That is, a slightly
> different mapping between Shift-JIS and the database encoding. For
> unfortunate historical reasons, there are several Shift-JIS variants
> (in addition to the standard defined by the government, there are IBM,
> NEC and Microsoft versions). This is why I wanted to have the
> functionality at that time. I'm not sure whether the customer still
> uses it, but it is possible that there are other users with similar
> use cases, since the Shift-JIS variants are still in use.

Hm. Odd that we've not heard complaints about the removal of
CONVERT(... USING ...), then.

I think it would be a good idea at least to put back some equivalent
of CONVERT(... USING ...), if for no other reason than that it would
ease testing. As I understood it, the argument to remove it was not
that the functionality was bad, but that we were using a SQL-standard
syntax for what we concluded was not SQL-standard functionality.
I'd propose putting it back with a syntax of, say,

convert_with(input bytea, conversion_name text) returns bytea
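
To make the intended semantics concrete, here is a rough Python sketch of
what such a function would do: bytes in, a conversion picked explicitly by
name, bytes out. Everything in it is an assumption for illustration. The
conversion names are invented, and Python's shift_jis and cp932 codecs
merely stand in for two competing Shift-JIS conversion procedures; nothing
here is actual server code.

```python
# Hypothetical registry: conversion name -> (source codec, destination codec).
# The names are invented; Python's codecs stand in for conversion procedures.
CONVERSIONS = {
    "sjis_jis_to_utf8": ("shift_jis", "utf-8"),  # JIS-standard mapping
    "sjis_ms_to_utf8": ("cp932", "utf-8"),       # Microsoft variant
}


def convert_with(data: bytes, conversion_name: str) -> bytes:
    """Convert data using the conversion named explicitly by the caller."""
    try:
        source, dest = CONVERSIONS[conversion_name]
    except KeyError:
        raise LookupError(
            f'conversion "{conversion_name}" does not exist'
        ) from None
    return data.decode(source).encode(dest)


# The same two input bytes come out as different UTF-8 sequences depending
# on which Shift-JIS variant the chosen conversion implements.
sample = b"\x81\x60"
print(convert_with(sample, "sjis_jis_to_utf8"))
print(convert_with(sample, "sjis_ms_to_utf8"))
```

Because the caller names the conversion outright, no default flag or search
path is consulted, which is what makes this form handy for testing.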

As for the client encoding conversion case, I still say a
search-path-based lookup is a horrible idea, and furthermore there
seems no very good reason why it has to be restricted to default
conversions. Aside from other arguments, that tends to push people
to mark *every* conversion as default, which is outright silly if
you have several competing ones.

As a sketch of an alternative, consider inventing a GUC named
preferred_conversions or some such, which is a list of
possibly-schema-qualified conversion names. When establishing an
original or new value of client_encoding, we look through the list
for the first entry that exists and performs the desired encoding
conversion (whether or not it is default). If there is no match,
look up the (unique) default conversion for the encoding pair, and
use that. (Obviously this has to be done twice, once for each
direction, when setting up client encoding conversions.) We could
apply the same rules for identifying which specific conversion to use
in convert() and siblings.
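
The lookup rule above can be sketched as follows. This is a Python model
under stated assumptions, not proposed server code: the Conversion record
loosely mirrors a pg_conversion row, preferred_conversions is the
hypothetical GUC value already split into a list, and names are matched
exactly as written (how an unqualified name would be resolved is left
open here).

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Conversion:
    name: str         # possibly schema-qualified, e.g. "myschema.sjis_ms"
    source: str       # source encoding
    dest: str         # destination encoding
    is_default: bool  # at most one default per (source, dest) pair


def find_conversion(
    catalog: Sequence[Conversion],
    preferred: Sequence[str],
    source: str,
    dest: str,
) -> Optional[Conversion]:
    """Pick the conversion for one direction of an encoding pair."""
    by_name = {c.name: c for c in catalog}
    # 1. First preferred entry that exists and performs the wanted
    #    conversion wins, whether or not it is marked default.
    for name in preferred:
        conv = by_name.get(name)
        if conv is not None and (conv.source, conv.dest) == (source, dest):
            return conv
    # 2. Otherwise fall back to the (unique) default for the pair.
    for conv in catalog:
        if conv.is_default and (conv.source, conv.dest) == (source, dest):
            return conv
    return None


def setup_client_encoding(
    catalog: Sequence[Conversion],
    preferred: Sequence[str],
    client: str,
    server: str,
) -> Tuple[Optional[Conversion], Optional[Conversion]]:
    """Run the lookup twice, once per direction."""
    return (
        find_conversion(catalog, preferred, client, server),
        find_conversion(catalog, preferred, server, client),
    )
```

Entries in the list that do not exist, or that convert some other encoding
pair, are simply skipped, so one setting can cover several client encodings.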

regards, tom lane
