Re: Use arrays to store multilanguage texts

From: Joe Conway <mail(at)joeconway(dot)com>
To: Michal Táborský <michal(at)taborsky(dot)cz>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Use arrays to store multilanguage texts
Date: 2004-05-29 15:30:56
Message-ID: 40B8ACB0.8040308@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Michal Táborský wrote:
> I am wondering, if it's effective to use text arrays to store
> multilanguage information.

[...snip...]

> SELECT id, price, name[1], description[1] FROM product ORDER BY name[1]
>
> Is it possible to build an index, which will be used in such query? I
> had no luck with CREATE INDEX product_name1 ON product (r[1]), but maybe
> the syntax is just somehow different.

Maybe something like this:

CREATE TABLE product (
id serial NOT NULL,
price float4,
name text[],
description text[]
);

insert into product (price, name, description)
values (10,
array['apples-english','apples-spanish','apples-german'],
array['big bunch of apples-english','...-spanish','...-german']
);

insert into product (price, name, description)
values (42,
array['pears-english'],
array['big bunch of pears-english']
);

create or replace function get_lang(text) returns int as '
select case
when $1 = ''english'' then 1
when $1 = ''spanish'' then 2
when $1 = ''german'' then 3
else 1
end
' language sql strict immutable;

create or replace function get_lang_str(text[], int)
returns text as '
select coalesce($1[$2], $1[1])
' language sql strict immutable;

create index product_idx1_english on
product(get_lang_str(name,get_lang('english')));
create index product_idx1_spanish on
product(get_lang_str(name,get_lang('spanish')));
create index product_idx1_german on
product(get_lang_str(name,get_lang('german')));

set enable_seqscan to off;

explain analyze
select
id,
price,
get_lang_str(name,get_lang('spanish')) as name,
get_lang_str(description,get_lang('spanish')) as description
from
product
order by
get_lang_str(name,get_lang('spanish'));
QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------
Index Scan using product_idx1_spanish on product (cost=0.00..6.03
rows=2 width=72) (actual time=0.147..0.193 rows=2 loops=1)
Total runtime: 0.246 ms
(2 rows)

select
id,
price,
get_lang_str(name,get_lang('foo')) as name,
get_lang_str(description,get_lang('foo')) as description
from
product
order by
get_lang_str(name,get_lang('foo'));
id | price | name | description
----+-------+----------------+-----------------------------
1 | 10 | apples-english | big bunch of apples-english
2 | 42 | pears-english | big bunch of pears-english
(2 rows)

select
id,
price,
get_lang_str(name,get_lang('spanish')) as name,
get_lang_str(description,get_lang('spanish')) as description
from
product
order by
get_lang_str(name,get_lang('spanish'));
id | price | name | description
----+-------+----------------+----------------------------
1 | 10 | apples-spanish | ...-spanish
2 | 42 | pears-english | big bunch of pears-english
(2 rows)

> Are there any more drawbacks or limitations, that maybe I am not aware
> of, which would discourage you from doing the multilanguage support this
> way? Should we consider this or stick to the old ways?

Notice I was trying to be careful about cases where a bad language
string is used, or the array is missing languages other than english.
I.e. I picked english as the default language and would need to ensure
every array had at least that string.

I'm not sure how maintainable it will be. You'd have to play with it for
a while and decide for yourself.

It would be interesting to see a speed comparison in a real-life
application -- so if you give it a try, please let us know how it turns out.

HTH,

Joe

In response to

Browse pgsql-general by date

  From Date Subject
Next Message vicky 2004-05-29 20:26:13 Embedded SQL - Unable to connect to PostgreSQL Database
Previous Message Tom Lane 2004-05-29 15:18:06 Re: filesystem option tuning