Re: PGDay.it collation discussion notes

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Dave Gudeman <dave(dot)gudeman(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Gregory Stark <stark(at)enterprisedb(dot)com>, Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: PGDay.it collation discussion notes
Date:	2008-10-23 06:25:02
Message-ID:	490018BE.4020101@enterprisedb.com
Lists:	pgsql-hackers

Dave Gudeman wrote:
> On Mon, Oct 20, 2008 at 2:28 AM, Heikki Linnakangas <
> heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> Tom Lane wrote:
>>> Another objection to this design is that it's completely unclear that
>>> functions from text to text should necessarily yield the same collation
>>> that went into them, but if you treat collation as a hard-wired part of
>>> the expression syntax tree you aren't going to be able to do anything
>>> else.
>>> (What will you do about functions/operators taking more than one text
>>> argument?)
>>>
>> Whatever the spec says. Collation is intimately associated with the
>> comparison operations, and doesn't make any sense anywhere else.
>
> Of course the comparison operator is involved in many areas such as index
> creation, ORDER BY, GROUP BY, etc. In order to support GROUP BY and hash
> joins on values with a collation type, you need to have a hash function
> corresponding to the collation.

Yeah, those are all related to comparison operators.
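To make the hash-function point concrete, here is a minimal sketch in Python (not PostgreSQL code). The `Collation` class, the two sample collations and `group_by` are hypothetical names invented for illustration. The only requirement shown is that strings which compare equal under a collation must also hash equal under it, which is what hash-based GROUP BY and hash joins rely on.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Collation:
    """Hypothetical collation: a name plus a function producing a comparison key."""
    name: str
    key: Callable[[str], str]

    def equal(self, a: str, b: str) -> bool:
        return self.key(a) == self.key(b)

    def hash_value(self, s: str) -> int:
        # Hash the comparison key rather than the raw string, so strings
        # that are equal under this collation land in the same bucket.
        return hash(self.key(s))


# Two illustrative collations (assumptions, not real locale definitions).
BINARY = Collation("binary", lambda s: s)
CASE_INSENSITIVE = Collation("case_insensitive", str.casefold)


def group_by(values: List[str], collation: Collation) -> List[List[str]]:
    """Hash-based grouping that uses the collation's hash and equality."""
    buckets: Dict[int, List[List[str]]] = {}
    for v in values:
        groups = buckets.setdefault(collation.hash_value(v), [])
        for g in groups:
            if collation.equal(g[0], v):
                g.append(v)
                break
        else:
            # No existing group in this bucket matched; start a new one.
            groups.append([v])
    return [g for groups in buckets.values() for g in groups]
```

With the case-insensitive collation, "abc" and "ABC" end up in one group; with the binary collation they stay separate. A hash taken over the raw bytes would break the first case.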

>> Looking at an individual value, collation just doesn't make sense.
>> Collation is a property of the comparison operation, not of a value.
>
> Collation can't be a property of the comparison operation because you don't
> know what comparison to use until you know the collation type of the value.
> Collation is a property of string values, just like scale and precision are
> properties of numeric values. And like those properties of numeric values,
> collation can be statically determined. The rules for determining what
> collation to use in an expression are similar in kind to the rules for
> determining what the resulting scale and precision of an arithmetic
> expression are. If you consider collation as just part of the type, a lot of
> things are easier.

Yeah, the typmod of numerics and varchars is a good analogue, in the
parser. The current rules for those are probably not exactly the same
as what the spec requires for collation, but they're definitely similar.
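As a rough illustration of the analogy, the sketch below derives a result type statically in both cases. Everything here is an assumption made for illustration: the class names, the particular precision/scale rule for addition, and the simplified "explicit beats implicit, otherwise the collations must match" rule are neither PostgreSQL's actual behavior nor the SQL spec's exact collation derivation rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumericType:
    precision: int  # total number of digits
    scale: int      # digits after the decimal point


@dataclass(frozen=True)
class TextType:
    collation: str
    explicit: bool = False  # True if forced by a cast / COLLATE-like clause


def add_result_type(a: NumericType, b: NumericType) -> NumericType:
    """One plausible rule for a + b: keep the larger scale, allow one carry digit."""
    scale = max(a.scale, b.scale)
    int_digits = max(a.precision - a.scale, b.precision - b.scale) + 1
    return NumericType(int_digits + scale, scale)


def concat_result_type(a: TextType, b: TextType) -> TextType:
    """Simplified collation derivation for a two-argument text function."""
    if a.explicit != b.explicit:
        # An explicitly requested collation wins over an implicit one.
        return a if a.explicit else b
    if a.collation == b.collation:
        return a
    raise ValueError(
        f"cannot determine collation: {a.collation} vs {b.collation}"
    )
```

In both functions the result type is computed from the argument types alone, before any value is examined, which is the sense in which collation "can be statically determined".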

> This is a good way to implement collated comparisons, but it's not a new
> concept, just an additional argument to the comparison operator. It isn't
> necessary to create new concepts to handle collation when it fits so well
> into an existing concept, the type. For example, the difference between two
> indexes with collation is a difference in the type of the index, just like
> the difference between a DECIMAL(10,4) index and a DECIMAL(20,2) index.

Hmm. That could work. So collation would be an extra typmod on the
string data types, and casting can be used to force a specific
collation. I think we're missing some pieces, like passing the typmod to
the comparison function: numeric comparison doesn't depend on the scale
and precision, while string comparison would depend on the collation
typmod.
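The missing piece can be sketched as follows, in Python rather than backend C. The typmod codes, the `COLLATION_KEYS` table and the function names are hypothetical. The point is only that the numeric comparison can ignore its typmod, while the text comparison needs it passed in to give the right answer.

```python
from decimal import Decimal
from functools import cmp_to_key

# Hypothetical typmod codes mapped to comparison-key functions.
COLLATION_KEYS = {
    0: lambda s: s,   # assumed "binary" collation
    1: str.casefold,  # assumed case-insensitive collation
}


def numeric_cmp(a: Decimal, b: Decimal, typmod=None) -> int:
    # Precision and scale do not affect ordering, so typmod is unused.
    return (a > b) - (a < b)


def text_cmp(a: str, b: str, typmod: int) -> int:
    # The collation typmod selects how the two strings are compared.
    key = COLLATION_KEYS[typmod]
    ka, kb = key(a), key(b)
    return (ka > kb) - (ka < kb)


def sort_text(values, typmod: int):
    """Sort strings using the comparison selected by the collation typmod."""
    return sorted(values, key=cmp_to_key(lambda a, b: text_cmp(a, b, typmod)))
```

Here `text_cmp("abc", "ABC", 1)` treats the strings as equal while `text_cmp("abc", "ABC", 0)` does not, so the same two values order differently depending on a typmod the comparison function currently never sees.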

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Re: PGDay.it collation discussion notes at 2008-10-22 17:43:06 from Dave Gudeman
