From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paesold <mpaesold(at)gmx(dot)at>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgresql(dot)org |
Subject: | default_text_search_config and expression indexes |
Date: | 2007-07-26 22:23:51 |
Message-ID: | 200707262223.l6QMNpo23400@momjian.us |
Lists: | pgsql-advocacy pgsql-hackers |
Oleg Bartunov wrote:
> >> Second, I can't figure out how to reference a non-default
> >> configuration.
> >
> > See the multi-argument versions of to_tsvector etc.
> >
> > I do see a problem with having to_tsvector(config, text) plus
> > to_tsvector(text) where the latter implicitly references a config
> > selected by a GUC variable: how can you tell whether a query using the
> > latter matches a particular index using the former? There isn't
> > anything in the current planner mechanisms that would make that work.
>
> Probably, having default text search configuration is not a good idea
> and we could just require it as a mandatory parameter, which could
> eliminate many confusion with selecting text search configuration.
We have to decide whether we want a default_text_search_config GUC and,
if so, when it can be changed.
Right now there are three ways to create a tsvector (or tsquery):
	::tsvector
	to_tsvector(value)
	to_tsvector(config, value)
(ignoring plainto_tsvector)
Only the last one specifies the configuration. The others use the
configuration specified by default_text_search_config. (We had a
previous discussion on what the default value of
default_text_search_config should be, and it was decided it should be
set via initdb based on a flag or the locale.)
Now, because most people use a single configuration, they can just set
default_text_search_config and there is no need to specify the
configuration name.
However, expression indexes cause a problem here:
http://momjian.us/expire/fulltext/HTML/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX
We recommend that users create an expression index on the column they
want to do a full text search on, e.g.
CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector(body));
However, the big problem is that the expressions used in expression
indexes should not change their output based on the value of a GUC
variable (because it would corrupt the index), but in the case above,
default_text_search_config controls what configuration is used, and
hence the output of to_tsvector is changed if default_text_search_config
changes.
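To make the dependency concrete, here is a minimal sketch. It assumes
configurations named 'english' and 'simple' exist in the installation;
those names and the sample sentence are illustrative and not taken from
the patch.

```sql
-- Sketch only: 'english' and 'simple' are assumed configuration names.

-- One-argument form: the configuration comes from the GUC.
SET default_text_search_config = 'english';
SELECT to_tsvector('The cats are running');

-- Same call, but now evaluated under a different configuration.
SET default_text_search_config = 'simple';
SELECT to_tsvector('The cats are running');

-- Two-argument form: the result does not depend on the GUC.
SELECT to_tsvector('english', 'The cats are running');
```

If the two configurations treat words differently (stemming, stop
words), the first two calls return different tsvectors, so index entries
built under one setting would not match the expression as evaluated
under the other.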
We have a few possible options:
1) Document the problem and do nothing else.
2) Make default_text_search_config a postgresql.conf-only
setting, thereby making it impossible for non-superusers to
change, or make it a superuser-only setting.
3) Remove default_text_search_config and require the
configuration to be specified in each function call.
If we remove default_text_search_config, it would also make ::tsvector
casting useless.
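For option 3, a sketch of what users would end up writing, reusing the
pgweb table and body column from the index example earlier in this
message. The 'english' configuration name and the search term are
assumptions for illustration.

```sql
-- Sketch only: 'english' and 'friend' are assumed, not from the patch.

-- The index expression names its configuration explicitly.
CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector('english', body));

-- The query repeats the same expression, including the configuration,
-- so it can be matched against the index.
SELECT *
FROM pgweb
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');
```

The cost is verbosity: every call has to repeat the configuration name,
even for the common case of a single configuration per database.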
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +