| From: | "Radek Strnad" <radek(dot)strnad(at)gmail(dot)com> | 
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org | 
| Subject: | [WIP] collation support revisited (phase 1) | 
| Date: | 2008-07-10 21:24:29 | 
| Message-ID: | de5165440807101424l14fb535byf43fc665351c4dfd@mail.gmail.com | 
| Lists: | pgsql-hackers | 
Hi,
after a long discussion with Mr. Kotala, we've decided to redesign our
collation support proposal.
For those of you who aren't familiar with my WIP patch and the comments from
other hackers, here's the original mail:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg00019.php
In a few sentences: I'm writing collation support for PostgreSQL that is
almost independent of the collating function used. I will implement POSIX
locales, but switching to ICU will be quite easy. Collations and character
sets defined by the SQL standard will be hard-coded, so that functions relying
on them never find them missing.
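As a rough illustration of what "almost independent of the collating function"
could look like, here is a minimal sketch in Python (not PostgreSQL code). The
names Collation, binary_key, case_insensitive_key and sort_with are
hypothetical, and the two key functions are toy stand-ins: a real backend
would call POSIX strxfrm() or an ICU collator instead.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Collation:
    """Hypothetical collation record: a name plus a pluggable sort-key function."""
    name: str
    sort_key: Callable[[str], object]


def binary_key(s: str) -> bytes:
    # Byte-wise UTF-8 ordering (equivalent to code-point order).
    return s.encode("utf-8")


def case_insensitive_key(s: str) -> str:
    # Toy stand-in for a locale-aware key; a real backend would call
    # POSIX strxfrm() or an ICU collator here.
    return s.casefold()


def sort_with(collation: Collation, values: List[str]) -> List[str]:
    # The caller never needs to know which backend produced the keys.
    return sorted(values, key=collation.sort_key)


BINARY = Collation("binary", binary_key)
NOCASE = Collation("nocase", case_insensitive_key)

words = ["banana", "Apple", "Cherry"]
binary_sorted = sort_with(BINARY, words)   # uppercase letters sort first
nocase_sorted = sort_with(NOCASE, words)   # case is ignored
```

Swapping the backend then only means supplying a different sort_key; the
sorting code itself stays unchanged.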
The whole project will be divided into two phases:
phase 1
Implement a sort of framework so that PostgreSQL has the basic guts
(pg_collation & pg_charset catalogs, CREATE COLLATION, collation support
for each type that needs it) and supports collation at the database level.
This phase has been accepted as a Google Summer of Code project.
phase 2
Implement the rest: full collation at the column level. I will continue
working on this after finishing phase one, and it will be my master's thesis.
How will the first part work?
Catalogs
- new catalogs pg_collation and pg_charset will be defined
- pg_collation and pg_charset will contain the SQL standard collations plus
an optional default collation (when the default is set to something other
than an SQL standard one)
- pg_type, pg_attribute, pg_namespace will be extended with references to
default records in pg_collation and pg_charset
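The relationship between these catalogs can be sketched as follows. This is
only a toy model in Python, not the actual catalog definition: the row
classes, column names, and OID values are all hypothetical, and the entries
(UCS_BASIC, cs_CZ.UTF-8, UTF8) are example values.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CharsetRow:
    oid: int          # hypothetical OID
    name: str


@dataclass(frozen=True)
class CollationRow:
    oid: int          # hypothetical OID
    name: str
    charset_oid: int  # reference into pg_charset
    is_standard: bool # True for collations defined by the SQL standard


@dataclass
class TypeRow:
    name: str
    default_collation_oid: int  # proposed new reference into pg_collation


pg_charset: Dict[int, CharsetRow] = {1: CharsetRow(1, "UTF8")}
pg_collation: Dict[int, CollationRow] = {
    10: CollationRow(10, "UCS_BASIC", 1, True),     # SQL standard collation
    11: CollationRow(11, "cs_CZ.UTF-8", 1, False),  # optional system-locale default
}
pg_type: Dict[str, TypeRow] = {"text": TypeRow("text", 11)}


def default_collation(type_name: str) -> CollationRow:
    # Follow the reference from pg_type into pg_collation.
    return pg_collation[pg_type[type_name].default_collation_oid]


def charset_of(collation: CollationRow) -> CharsetRow:
    # Follow the reference from pg_collation into pg_charset.
    return pg_charset[collation.charset_oid]
```

pg_attribute and pg_namespace would carry the same kind of reference; they are
left out here to keep the sketch short.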
initdb
- pg_collation & pg_charset will each contain the pre-defined records required
by the SQL standard and, optionally, one non-standard record that is set when
running initdb (the one using system locales)
- these two records will be referenced by pg_type, pg_attribute and
pg_namespace in the relevant columns and will be considered the default
collation, which will be inherited
CREATE DATABASE ... COLLATE ...
- after the new database is copied, its collation will be the default (the
same as the cluster collation) or the one given by COLLATE. Then we update
the pg_type, pg_attribute and pg_namespace catalogs
- reindex the database
When switching databases, the database collation will be retrieved from the
entry for type text in pg_type. This part should be the only one that gets
removed when proceeding with phase 2. But that will take a while :-)
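The CREATE DATABASE ... COLLATE flow described above can be sketched like
this. It is a toy model in Python under stated assumptions, not the patch
itself: catalogs are plain dictionaries mapping an entry name to a collation
name, and make_template, create_database and database_collation are
hypothetical helpers.

```python
import copy
from typing import Dict, Optional, Tuple

Catalogs = Dict[str, Dict[str, str]]


def make_template(default_collation: str) -> Catalogs:
    # Hypothetical template database: every entry references the cluster default.
    return {
        "pg_type": {"text": default_collation, "varchar": default_collation},
        "pg_attribute": {"some_table.some_column": default_collation},
        "pg_namespace": {"public": default_collation},
    }


def create_database(template: Catalogs,
                    collate: Optional[str] = None) -> Tuple[Catalogs, bool]:
    # Step 1: copy the template; the copy keeps the cluster collation.
    db = copy.deepcopy(template)
    needs_reindex = False
    if collate is not None:
        # Step 2: update the references in the three catalogs.
        for catalog in ("pg_type", "pg_attribute", "pg_namespace"):
            for entry in db[catalog]:
                db[catalog][entry] = collate
        # Step 3: indexes copied from the template must be rebuilt
        # (assumption: only needed when COLLATE was given).
        needs_reindex = True
    return db, needs_reindex


def database_collation(db: Catalogs) -> str:
    # Phase 1 shortcut: the database collation is read from type text in pg_type.
    return db["pg_type"]["text"]


template = make_template("C")
czech_db, czech_reindex = create_database(template, collate="cs_CZ.UTF-8")
plain_db, plain_reindex = create_database(template)
```

Here czech_db reports cs_CZ.UTF-8 as its collation and is flagged for
reindexing, while plain_db simply inherits the template's collation.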
Thanks for all your comments
Regards
        Radek Strnad