From: | Greg Stark <gsstark(at)mit(dot)edu> |
---|---|
To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Per-column collation, work in progress |
Date: | 2010-09-26 13:37:02 |
Message-ID: | AANLkTimb3+_E7=3o9u6_GEV7V3w=FmRMroSgw60SNWrJ@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Sep 26, 2010 at 1:15 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> Is there any reason why you prohibit a different encodings per one
> database? Actually people expect from collate per column a possibility
> to store a two or more different encodings per one database.
These are two completely separate problems that only look related. The
main difference is that while collation is a property of the
comparison or sort you're performing, encoding is a property of the
string itself. It doesn't make sense to specify a different encoding
than the one the string actually contains.
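To make the distinction concrete, here is a rough sketch in Python
rather than SQL. It is only an illustration; the helper name
decodes_as and the sample data are made up for this example and have
nothing to do with the patch.

```python
# Encoding belongs to the bytes: they were produced in one encoding,
# and declaring a different one does not change what they contain.
raw = "é".encode("latin-1")  # the single byte 0xE9


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Collation belongs to the comparison: the same strings can be ordered
# in more than one way, and each ordering is legitimate.
words = ["b", "A", "a", "B"]
by_code_point = sorted(words)                       # plain code point order
case_insensitive = sorted(words, key=str.casefold)  # a different ordering rule
```

Here decodes_as(raw, "utf-8") is False because a lone 0xE9 byte is not
valid UTF-8, whereas both sort orders are perfectly reasonable answers
for the same list of strings.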
You could actually do what you want now by using bytea columns and
convert_to/convert_from. It wouldn't be much easier if the support
were built into text, since you would still have to keep track of the
encoding each string is in and the encoding you want. We could have an
encoded_text data type that carries both the encoding and the string,
and whose comparison functions automatically convert based on the
encoding of the collation requested -- but I wouldn't
want that to be the default text datatype. It would impose a lot of
overhead on the basic text operations and magnify the effects of
choosing the wrong collation.
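A minimal sketch of what such a type would have to do, again in Python
and purely hypothetical: EncodedText, to_str and compare are names
invented for this illustration, not an existing or proposed PostgreSQL
API, and str.casefold merely stands in for a real collation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EncodedText:
    """Hypothetical value that carries its own encoding next to its bytes."""
    encoding: str
    data: bytes

    def to_str(self) -> str:
        # Every operation pays for this conversion first.
        return self.data.decode(self.encoding)


def compare(a: EncodedText, b: EncodedText,
            collation_key: Callable[[str], str] = str.casefold) -> int:
    """Return -1, 0 or 1 after converting both operands to a common form."""
    ka = collation_key(a.to_str())
    kb = collation_key(b.to_str())
    return (ka > kb) - (ka < kb)


latin1_e = EncodedText("latin-1", "é".encode("latin-1"))
utf8_e = EncodedText("utf-8", "é".encode("utf-8"))
```

The two values hold different bytes but compare as equal, at the cost
of two decodes per comparison. That per-comparison conversion is the
overhead I mean.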
--
greg