pgsql: improve support of agglutinative languages (query with compound

From: teodor(at)svr1(dot)postgresql(dot)org (Teodor Sigaev)
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: improve support of agglutinative languages (query with compound
Date: 2005-01-25 15:24:40
Message-ID: 20050125152440.B65363A571B@svr1.postgresql.org
Lists: pgsql-committers

Log Message:
-----------
improve support of agglutinative languages (query with compound words).

regression=# select to_tsquery( '\'fotballklubber\'');
to_tsquery
------------------------------------------------
'fotball' & 'klubb' | 'fot' & 'ball' & 'klubb'
(1 row)

To support this, the dictionary interface has changed: the lexize method of a
dictionary should now return a pointer to an array of TSLexeme structs instead
of char**. The last element of the array should have TSLexeme->lexeme == NULL.

typedef struct {
    /*
     * Number of the split variant this lexeme belongs to. For example,
     * the word 'fotballklubber' (Norwegian) can be split in two ways:
     * ( fotball, klubb ) and ( fot, ball, klubb ). So the dictionary
     * should return:
     *
     *     nvariant  lexeme
     *     1         fotball
     *     1         klubb
     *     2         fot
     *     2         ball
     *     2         klubb
     */
    uint16 nvariant;

    /* currently unused */
    uint16 flags;

    /* C-string */
    char *lexeme;
} TSLexeme;
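
As a rough illustration of how a caller might consume such an array, the sketch
below walks a TSLexeme array up to the lexeme == NULL terminator and joins
lexemes of the same variant with & and different variants with |, reproducing
the to_tsquery output shown above. This is not the code in query.c: the names
fotballklubber_lexemes and render_query are hypothetical, the array uses static
storage where a real dictionary would presumably allocate its result, the
uint16 typedef is a stand-in for the PostgreSQL one, and the lexemes of one
variant are assumed to be stored contiguously.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint16_t uint16;    /* stand-in for the PostgreSQL typedef (assumption) */

typedef struct {
    uint16 nvariant;        /* number of the split variant */
    uint16 flags;           /* currently unused */
    char  *lexeme;          /* C-string; NULL marks the end of the array */
} TSLexeme;

/*
 * Hypothetical lexize result for 'fotballklubber'. Static storage keeps
 * the sketch short; it is not how a real dictionary builds its result.
 */
static TSLexeme fotballklubber_lexemes[] = {
    {1, 0, "fotball"},
    {1, 0, "klubb"},
    {2, 0, "fot"},
    {2, 0, "ball"},
    {2, 0, "klubb"},
    {0, 0, NULL}            /* terminator: lexeme == NULL */
};

/*
 * Hypothetical helper: render the array as a query string in buf.
 * Lexemes sharing an nvariant are joined with " & ", a change of
 * nvariant starts a new alternative joined with " | ".
 * Returns 0 on success, -1 if buf is too small.
 */
static int
render_query(const TSLexeme *lex, char *buf, size_t buflen)
{
    const TSLexeme *prev = NULL;
    size_t      used = 0;

    assert(lex != NULL && buf != NULL && buflen > 0);
    buf[0] = '\0';

    for (; lex->lexeme != NULL; prev = lex, lex++)
    {
        const char *sep = "";
        int         n;

        if (prev != NULL)
            sep = (prev->nvariant == lex->nvariant) ? " & " : " | ";

        n = snprintf(buf + used, buflen - used, "%s'%s'", sep, lex->lexeme);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;      /* output would not fit */
        used += (size_t) n;
    }
    return 0;
}
```

Called with fotballklubber_lexemes and a sufficiently large buffer, render_query
fills the buffer with 'fotball' & 'klubb' | 'fot' & 'ball' & 'klubb'.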

Modified Files:
--------------
pgsql/contrib/tsearch2:
dict.c (r1.6 -> r1.7)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/dict.c.diff?r1=1.6&r2=1.7)
dict.h (r1.2 -> r1.3)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/dict.h.diff?r1=1.2&r2=1.3)
dict_ex.c (r1.3 -> r1.4)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/dict_ex.c.diff?r1=1.3&r2=1.4)
dict_ispell.c (r1.5 -> r1.6)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/dict_ispell.c.diff?r1=1.5&r2=1.6)
dict_snowball.c (r1.3 -> r1.4)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/dict_snowball.c.diff?r1=1.3&r2=1.4)
dict_syn.c (r1.4 -> r1.5)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/dict_syn.c.diff?r1=1.4&r2=1.5)
query.c (r1.12 -> r1.13)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/query.c.diff?r1=1.12&r2=1.13)
ts_cfg.c (r1.11 -> r1.12)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/ts_cfg.c.diff?r1=1.11&r2=1.12)
ts_cfg.h (r1.4 -> r1.5)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/ts_cfg.h.diff?r1=1.4&r2=1.5)
pgsql/contrib/tsearch2/gendict:
dict_tmpl.c.IN (r1.2 -> r1.3)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/gendict/dict_tmpl.c.IN.diff?r1=1.2&r2=1.3)
pgsql/contrib/tsearch2/ispell:
spell.c (r1.18 -> r1.19)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/ispell/spell.c.diff?r1=1.18&r2=1.19)
spell.h (r1.8 -> r1.9)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/ispell/spell.h.diff?r1=1.8&r2=1.9)
