tsearch in core patch

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: tsearch in core patch
Date: 2007-06-21 17:44:33
Message-ID: 467AB901.3060907@sigaev.ru
Lists: pgsql-hackers

http://www.sigaev.ru/misc/tsearch_core-0.52.gz

Plan was:

1) Rename FULLTEXT to TEXT SEARCH in the SQL commands
done

2) Rework the Snowball stemmers as Tom suggested
done

3) ALTER FULLTEXT CONFIGURATION cfgname ADD/ALTER/DROP MAPPING
done

4) Remove support for a default configuration per schema. There will be only
one default configuration per locale.
done

5) Single-encoding files. That will touch the snowball, ispell, synonym,
thesaurus and simple dictionaries
done

6) Use encoding names instead of locale names in configurations
Ugh. I overlooked that knowing the encoding does not determine the exact
language: how many languages use ISO8859-1? So it's not done. Tom pointed out
that locale names aren't portable, but there aren't many names for the same
locale (ru_RU.UTF-8 and ru_RU.UTF8, for example). So it's possible to use an
array of locale names instead of a single name.
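
The array-of-locales idea can be illustrated with a small Python sketch. It is
not code from the patch: the configuration names, the locale lists, and the
find_default_config helper are all hypothetical, chosen only to show one
configuration being matched under several spellings of the same locale.

```python
# Hypothetical data: each configuration lists every known spelling of its locale.
# Neither the configuration names nor this layout come from the patch.
CONFIG_LOCALES = {
    "russian_utf8": ["ru_RU.UTF-8", "ru_RU.UTF8"],
    "russian_koi8": ["ru_RU.KOI8-R", "ru_RU.koi8r"],
}


def find_default_config(server_locale, config_locales=CONFIG_LOCALES):
    """Return the first configuration whose locale array contains
    server_locale, or None if no configuration lists it."""
    for cfg_name, locale_names in config_locales.items():
        if server_locale in locale_names:
            return cfg_name
    return None


if __name__ == "__main__":
    # Both spellings of the same locale resolve to the same configuration.
    print(find_default_config("ru_RU.UTF-8"))
    print(find_default_config("ru_RU.UTF8"))
```

The lookup is an exact string match on purpose: the array just enumerates the
handful of spellings that exist, so nothing has to guess how a platform writes
its locale names.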

I didn't see any comments about the security hole pointed out by Tom, so I repeat:

About security holes in PARSER/DICTIONARY: I see the following ways to resolve them now:
1) Allow only superusers to run CREATE/ALTER/DROP PARSER/DICTIONARY
Disadvantage: hosting users will not be able to change dictionaries
2) Remove CREATE/ALTER/DROP PARSER, split pg_ts_dict into pg_ts_dict_template
and pg_ts_dict, and change CREATE/ALTER/DROP DICTIONARY accordingly
Disadvantage: parsers and dictionary templates will not be dumped/restored;
they have to be restored manually (just an INSERT into
pg_ts_parser/pg_ts_dict_template)
3) Similar to the previous option, but:
* CREATE/ALTER/DROP PARSER - superuser only
* CREATE/ALTER/DROP DICTIONARY TEMPLATE - superuser only
* CREATE/ALTER/DROP DICTIONARY - allowed for non-superusers
Disadvantage: a new command, CREATE/ALTER/DROP DICTIONARY TEMPLATE
Which way do we choose? Or did I miss some variant?

I would like to go with option 3)... Comments?
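
The privilege split in option 3) can be sketched as a simple check. This is a
Python illustration only, not the patch's implementation: the SUPERUSER_ONLY
set and the may_run_ddl helper are hypothetical names, and the real check
would live in the backend's command handling.

```python
# Hypothetical model of option 3): object types whose CREATE/ALTER/DROP
# commands are restricted to superusers.
SUPERUSER_ONLY = {"PARSER", "DICTIONARY TEMPLATE"}


def may_run_ddl(object_type, is_superuser):
    """Return True if a user may run CREATE/ALTER/DROP on object_type.

    Parsers and dictionary templates are superuser-only; plain dictionaries
    are open to any user.
    """
    if object_type in SUPERUSER_ONLY:
        return is_superuser
    return True


if __name__ == "__main__":
    for obj in ("PARSER", "DICTIONARY TEMPLATE", "DICTIONARY"):
        print(obj, may_run_ddl(obj, is_superuser=False))
```

Under this split a hosting user can still create or change a dictionary,
while parsers and dictionary templates stay superuser-only.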

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
